On Thu, Oct 30, 2008 at 02:04:36PM +1030, Brendan Hart wrote:
> On Thu 30/10/2008 12:25 PM, Jeremy Chadwick wrote:
> >> Could the "missing" space be an indication of hardware disk issues, i.e.
> >> physical blocks marked as bad?
> >The simple answer is no, bad blocks would not cause what you're seeing.
> >smartctl -a /dev/disk will help you determine if there's evidence the disk
> >is in bad shape. I can help you with reading SMART stats if need be.
> I took a look at using the smart tools as you suggested, but have now found
> that the disk in question is a RAID1 set on a DELL PERC 3/Di controller and
> smartctl does not appear to be the correct tool to access the SMART data for
> the individual disks. After a little research, I have found the aaccli tool
> and used it to get the following information:
Sadly, that controller does not show you SMART attributes. This is one
of the biggest problems with the majority (but not all) of hardware RAID
controllers -- they give you no access to disk-level things like SMART.
FreeBSD has support for such (using CAM's pass(4)), but the driver has
to support/use it, *and* the card firmware has to support it. At
present, Areca, 3Ware, and Promise controllers support such; HighPoint
might, but I haven't confirmed it. Adaptec does not.
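For what it's worth, on controllers that do expose the member disks, smartctl can usually address them through the controller; the exact device node and -d argument depend on the driver and firmware, so treat these invocations as illustrative rather than verified on any particular box:

```shell
# Illustrative only -- device names and -d types vary by driver/firmware:
smartctl -a -d 3ware,0 /dev/twe0   # first disk behind a 3ware controller
smartctl -a /dev/pass0             # a disk exposed through CAM's pass(4)
```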
What you showed tells me nothing about SMART, other than the remote
possibility it's basing some of its decisions on the "general SMART
health status", which means jack squat. I can explain why this is if
need be, but it's not related to the problem you're having.
Either way, this is just one of many reasons to avoid hardware RAID
controllers if given the choice.
> AAC0> disk show defects 00
> Executing: disk show defects (ID=0)
> Number of PRIMARY defects on drive: 285
> Number of GROWN defects on drive: 0
> AAC0> disk show defects 01
> Executing: disk show defects (ID=1)
> Number of PRIMARY defects on drive: 193
> Number of GROWN defects on drive: 0
> This output doesn't seem to indicate existing physical issues on the disks.
I hope these are SCSI disks you're showing here; otherwise I'm not sure
how the controller is able to get the primary defect count of a SATA or
SAS disk. So, assuming the numbers shown are accurate, then yes: I
don't think there's any disk-level problem.
> I have done some additional digging and noticed that there is a /usr/.snap
> folder present. "ls -al" shows no content however. Some quick searching
> shows this could possibly be part of a UFS snapshot...
Correct; the .snap directory is used for UFS2 snapshots and
mksnap_ffs(8) (which is also the program dump -L uses).
> I wonder if partition snapshots might be the cause of my major disk
> space "loss".
Your /usr/.snap directory is empty; there are no snapshots. That said,
are you actually making filesystem snapshots using dump or mksnap_ffs?
If not, then you're barking up the wrong tree. :-)
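To illustrate the point (simulated under /tmp rather than on a live UFS filesystem; the directory and "weekly" file names here are made up): a snapshot shows up as an ordinary file directly inside .snap, so an empty listing really does mean no snapshots are being held.

```shell
# Simulated .snap directory; "weekly" is a hypothetical stand-in for a
# file that mksnap_ffs would create.
mkdir -p /tmp/snapdemo/.snap
ls -A /tmp/snapdemo/.snap | wc -l   # 0 -- no snapshots
: > /tmp/snapdemo/.snap/weekly
ls -A /tmp/snapdemo/.snap | wc -l   # 1 -- one snapshot file would be held
rm -r /tmp/snapdemo
```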
> I also took a look to see if the issue could be something like running out
> of inodes, but this doesn't seem to be the case:
> #: df -ih /usr
> Filesystem Size Used Avail Capacity iused ifree %iused Mounted
> /dev/aacd0s1f 28G 25G 1.1G 96% 708181 3107241 19% /usr
inodes != disk space, but I'm pretty sure you know that.
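A quick way to see that the two resources are accounted separately (a throwaway sketch in /tmp; the names are arbitrary): a hundred empty files consume a hundred inodes but essentially no data blocks.

```shell
# Each empty file burns an inode but (near enough) zero data blocks.
mkdir -p /tmp/inodedemo
i=1
while [ $i -le 100 ]; do : > /tmp/inodedemo/f$i; i=$((i+1)); done
ls /tmp/inodedemo | wc -l    # 100 files created
du -sk /tmp/inodedemo        # a handful of KB at most -- just the directory
rm -r /tmp/inodedemo
```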
I understand at this point you're running around with your arms in the
air, but you've already confirmed one thing: none of your other systems
exhibit this problem. If this is a production environment, step back a
moment and ask yourself: "just how much time is this worth?" It might
be better to just newfs the filesystem and be done with it, especially
if this is a one-time-never-seen-before thing.
> I will wait and see if any other list member has any suggestions for me to
> try, but I am now leaning toward scrubbing the system. Oh well.
When you say scrubbing, are you referring to actually formatting/wiping
the system, or are you referring to disk scrubbing?
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |