File read speed *strongly* degraded on one partition only
Hi. I am really befuddled by this one. I've checked the archives
of this list for anyone reporting anything similar, I've googled, and
haven't found anything useful.
Yesterday, I abruptly started suffering from extremely slow disk reads
from my /home partition. The symptoms are truly odd: not only have the
parameters of the drive (checked through hdparm) not changed -- still
using dma etc. -- but the problem isn't present on all the partitions
of that drive. If I copy a large file from /home to a different partition
on the *same drive* (which takes forever, because the read of the original
on /home is very slow), and then read from the new copy of the file on the
other partition, it reads about 15-16 times more quickly. The drive is not
slow; only that partition is.
Can anyone point me at anything that would help me figure out what's going
on, and how to recover my lost I/O speed? What am I missing here as to why
this would occur?
The gory details:
This machine has one WD800JB drive (80GB, 7200rpm, 8MB cache) partitioned
with swap, two primary partitions (/boot and /) and six logical partitions,
one of which is /home. The machine also has four WD1200JB drives -- the
same drive, but at 120GB instead -- which will eventually end up as a pair
of RAID1s. Right now, while screwing around with this, three of the drives
have one big partition while the fourth has a 1GB FAT partition with the
rest as free space. Except for the swap and the FAT partitions, all the
rest are ReiserFS. The layout:
Filesystem  1K-blocks      Used  Available  Use%  Mounted on
/dev/hde3      409636     82140     327496   21%  /
/dev/hde1      104380     36188      68192   35%  /boot
/dev/hde10   56563100  49769796    6793304   88%  /home
/dev/hde5      104380     34436      69944   33%  /tmp
/dev/hde6     8385636   3322468    5063168   40%  /usr
/dev/hde8     2096380    736880    1359500   36%  /usr/local
/dev/hde9     4192800    914552    3278248   22%  /usr/src
/dev/hde7     4192800    278508    3914292    7%  /var
/dev/hdb1   117214656    698832  116515824    1%  /mnt/b1
/dev/hdd1   117183440    698832  116484608    1%  /mnt/d1
/dev/hdf1   117217208    698852  116518356    1%  /mnt/f1
/dev/hdg1      995728    680640     315088   69%  /mnt/g1 (FAT)
/dev/hda and /dev/hdc are cd-rw and dvdrom respectively. The additional
four drives come from a Promise RAID controller on-motherboard with its
RAID capabilities off; it's only being used to provide additional IDE
channels.
Using hdparm to compare the drives results in effectively the same
numbers across the board -- "-tT" produces the same results regardless
of drive, typically something like:
Timing buffer-cache reads: 128 MB in 0.41 seconds =312.20 MB/sec
Timing buffered disk reads: 64 MB in 1.38 seconds = 46.38 MB/sec
while "-c -d -m -I" show the same configuration regardless
of drive. In particular, they all have DMA on.
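
To compare the "-tT" numbers across several drives without eyeballing them, the two timing lines can be pulled apart with a small script. This is only a sketch: the pattern is written against the two sample lines quoted above, and other hdparm versions may format their output differently. The names TIMING_RE and parse_hdparm_timings are made up for illustration.

```python
import re

# Pattern written against the two sample lines quoted in this post;
# other hdparm versions may print these lines differently (assumption).
TIMING_RE = re.compile(
    r"Timing (?P<kind>buffer-cache|buffered disk) reads:\s*"
    r"(?P<mb>\d+) MB in\s*(?P<secs>[\d.]+) seconds\s*=\s*(?P<rate>[\d.]+) MB/sec"
)


def parse_hdparm_timings(text):
    """Return {kind: MB/sec} for every timing line found in text."""
    return {m.group("kind"): float(m.group("rate"))
            for m in TIMING_RE.finditer(text)}


sample = (
    "Timing buffer-cache reads: 128 MB in 0.41 seconds =312.20 MB/sec\n"
    "Timing buffered disk reads: 64 MB in 1.38 seconds = 46.38 MB/sec\n"
)
rates = parse_hdparm_timings(sample)
print(rates)
```

Running the same parse over the saved output for each drive gives one small dictionary per drive, which makes any outlier easy to spot.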
So everything looks normal, the same on all the drives, until the file
manipulation starts. A 651 MB file is copied from its location on /home
to the first 120 GB drive, /mnt/b1; the two files are then compared to
each other. Using "time", the copy takes 250s while the compare takes
255s -- more than 4 minutes.
Now, copying the file from /mnt/b1 to /mnt/d1 -- that is, from the
first to the second 120GB drive -- takes only 16 seconds, slightly
slower than the buffered disk read speed above; the compare
also takes about that, on average (done six times). Similarly,
copying and comparing from /mnt/b1 to /mnt/f1 or /mnt/g1 show
similar times. But again, comparing the original copy on /home
to the copies on any of the other four disks results in times well
above 4 minutes -- or 15-16 times slower.
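
For reference, the effective throughput implied by these timings can be worked out directly. A minimal sketch, using only the figures quoted above (a 651 MB file, 250 s for the slow copy, 16 s for the fast one); the function name is just for illustration.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Average transfer rate in MB/s for size_mb transferred in seconds."""
    return size_mb / seconds


FILE_MB = 651   # size of the test file
SLOW_S = 250    # copy from /home to /mnt/b1
FAST_S = 16     # copy from /mnt/b1 to /mnt/d1

slow = throughput_mb_per_s(FILE_MB, SLOW_S)
fast = throughput_mb_per_s(FILE_MB, FAST_S)
ratio = SLOW_S / FAST_S

print(f"/home read:  {slow:.1f} MB/s")
print(f"other disks: {fast:.1f} MB/s")
print(f"slowdown:    {ratio:.1f}x")
```

That works out to roughly 2.6 MB/s when reading from /home against roughly 40.7 MB/s elsewhere, a factor of about 15.6.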
This smells like there's just something wrong with that disk --
except, as noted, hdparm shows the same configuration and reads
from the raw device /dev/hde show the same times as the other disks.
And no errors are being written to any logfiles, either. So, out
of curiosity, I copied and compared between the first 120GB disk,
/mnt/b1, and a different logical partition on the same drive as
/home. The result: fast reads, with times similar to those
copying/comparing between the 120GB drives. I checked another one
of the logical partitions and got the same result. It's not the
drive that's slow -- it's only the partition holding /home that's
slow ... 16 times as slow.
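
The per-partition read test can also be done without a copy or compare, by simply reading the file and timing it. The sketch below assumes the file is larger than RAM or has not been read recently, since Linux caches file data and a repeat read may be served from memory rather than from the disk. The function names are illustrative, and any path passed on the command line is whatever file you want to test.

```python
import sys
import time


def timed_read(path, chunk_size=1024 * 1024):
    """Read path sequentially; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 turns off Python's own buffering only; the kernel
    # page cache is still in play.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def report(path):
    """Print size, elapsed time and average rate for one file."""
    nbytes, secs = timed_read(path)
    mb = nbytes / (1024 * 1024)
    rate = mb / secs if secs > 0 else float("inf")
    print(f"{path}: {mb:.0f} MB in {secs:.2f} s = {rate:.2f} MB/s")


if __name__ == "__main__":
    for p in sys.argv[1:]:
        report(p)
```

Running it once against the original on /home and once against a copy on another partition should reproduce the gap, if the slowdown really is in the read path.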
Furthermore, it seems to depend on *where* in the partition. As
a final test, I copied and compared the ISO from /mnt/b1 onto /home
again, but in a different location. The result? Slow -- 60 seconds
-- but not *as* slow as copies from and compares against the original.
So to summarize, disk reads on my /home partition are a factor of 4 to
16 slower than reads from other partitions on that same disk, or other
disks. The same file can generate different (slow) read speeds depending
on where it's located in the partition. No disk errors are being logged.
This resembles disk fragmentation as I used to encounter it on FAT
filesystems; but it was my (perhaps faulty) understanding that Linux
FSes weren't so susceptible to that sort of thing.
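
As a rough sanity check on the fragmentation idea (a back-of-the-envelope model, not a diagnosis), one can ask how many extra head repositionings it would take to account for the lost time. The 7200 rpm figure is from the drive description above; the 9 ms average seek is an assumed typical value, not something measured on this drive.

```python
RPM = 7200            # from the drive description
AVG_SEEK_S = 0.009    # ASSUMPTION: typical average seek time, not measured
FILE_MB = 651         # size of the test file
FAST_S = 16           # read time observed on the other partitions
SLOW_S = 250          # read time observed for the original on /home

# Average rotational latency is half a revolution.
half_rotation_s = 0.5 * 60 / RPM
per_fragment_s = AVG_SEEK_S + half_rotation_s


def implied_fragments(slow_s, fast_s, cost_s):
    """Extra repositionings needed to explain slow_s versus fast_s."""
    return (slow_s - fast_s) / cost_s


n = implied_fragments(SLOW_S, FAST_S, per_fragment_s)
avg_fragment_kb = FILE_MB * 1024 / n

print(f"implied fragments:     {n:.0f}")
print(f"average fragment size: {avg_fragment_kb:.0f} KB")
```

Under those assumptions the 250-second read would need something on the order of 18,000 repositionings, meaning pieces of a few tens of KB each. Whether the file is actually laid out that way would still have to be confirmed on the filesystem itself.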
Any advice on how I can figure out what's going on, and how to fix it,
would be greatly appreciated.
Chris Metzler firstname.lastname@example.org
(remove "snip-me." to email)
"As a child I understood how to give; I have forgotten this grace since I
have become civilized." - Chief Luther Standing Bear