
File read speed *strongly* degraded on one partition only



Hi.  I am really befuddled by this one.  I've checked the archives
of this list for anyone reporting anything similar, I've googled, and
haven't found anything useful.

Yesterday, I abruptly started suffering from extremely slow disk reads
from my /home partition.  The symptoms are truly odd:  not only have the
parameters of the drive (checked through hdparm) not changed -- still
using dma etc. -- but the problem isn't present on all the partitions
of that drive.  If I copy a large file from /home to a different partition
on the *same drive* (which takes forever, because the read of the original
on /home is very slow), and then read from the new copy of the file on the
other partition, it reads about 15-16 times more quickly.  The drive is not
slow; only that partition is.

Can anyone point me at anything that would help me figure out what's going
on, and how to recover my lost I/O speed?  What am I missing here as to why
this would occur?

The gory details:

This machine has one WD800JB drive (80GB, 7200rpm, 8MB cache) partitioned
with swap, two primary partitions (/boot and /) and six logical partitions,
one of which is /home.  The machine also has four WD1200JB drives -- the
same drive, but at 120GB instead -- which will eventually end up as a pair
of RAID1s.  Right now, while screwing around with this, three of the drives
have one big partition while the fourth has a 1GB FAT partition with the
rest as free space.  Except for the swap and the FAT partitions, all the
rest are ReiserFS.  The layout:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hde3               409636     82140    327496  21% /
/dev/hde1               104380     36188     68192  35% /boot
/dev/hde10            56563100  49769796   6793304  88% /home
/dev/hde5               104380     34436     69944  33% /tmp
/dev/hde6              8385636   3322468   5063168  40% /usr
/dev/hde8              2096380    736880   1359500  36% /usr/local
/dev/hde9              4192800    914552   3278248  22% /usr/src
/dev/hde7              4192800    278508   3914292   7% /var
/dev/hdb1            117214656    698832 116515824   1% /mnt/b1
/dev/hdd1            117183440    698832 116484608   1% /mnt/d1
/dev/hdf1            117217208    698852 116518356   1% /mnt/f1
/dev/hdg1               995728    680640    315088  69% /mnt/g1  (FAT)

/dev/hda and /dev/hdc are cd-rw and dvdrom respectively.  The additional
four drives come from a Promise RAID controller on-motherboard with its
RAID capabilities off; it's only being used to provide additional IDE
channels.

Using hdparm to compare the drives results in effectively the same
numbers across the board -- "-tT" produces the same results regardless
of drive, typically something like:

/dev/hde:
 Timing buffer-cache reads:   128 MB in  0.41 seconds =312.20 MB/sec
 Timing buffered disk reads:  64 MB in  1.38 seconds = 46.38 MB/sec

while "-c -d -m -I" show the same configuration regardless
of drive.  In particular, they all have DMA on.

So everything looks normal, the same on all the drives, until the file
manipulation starts.  A 651 MB file is copied from its location on /home
to the first 120 GB drive, /mnt/b1; the two files are then compared to
each other.  Using "time", the copy takes 250s while the compare takes
255s -- more than 4 minutes.
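
For anyone wanting to repeat the measurement without cp and cmp, here is
a minimal sketch of a timed sequential read.  The function name and the
1 MB chunk size are my own choices for illustration, not part of the test
described above.  It only measures the read side, and a file that was
read recently may be served from the page cache rather than the disk.

```python
import time


def timed_read(path, chunk_size=1024 * 1024):
    """Read a file sequentially; return (bytes_read, seconds, MB_per_sec)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    rate = (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate
```

Calling timed_read() once on the copy under /home and once on the copy
under /mnt/b1 should give two rates that can be compared directly.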

Now, copying the file from /mnt/b1 to /mnt/d1 -- that is, from the
first to the second 120GB drive -- takes only 16 seconds, which works
out to slightly less than the buffered disk read speed above; the
compare also takes about that, on average (done six times).  Copying
and comparing from /mnt/b1 to /mnt/f1 or /mnt/g1 gives similar times.
But again, comparing the original copy on /home to the copies on any
of the other four disks results in times well above 4 minutes -- that
is, 15-16 times slower.
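
To make the arithmetic behind "15-16 times slower" explicit, here is a
small worked example using only the numbers quoted above (651 MB, 250
seconds, 16 seconds).  The helper name is just for illustration.

```python
def throughput_mb_s(size_mb, seconds):
    """Average throughput in MB/s for a transfer of size_mb megabytes."""
    return size_mb / seconds


slow = throughput_mb_s(651, 250)  # copy whose source is on /home
fast = throughput_mb_s(651, 16)   # copy from /mnt/b1 to /mnt/d1
ratio = fast / slow

print(f"{slow:.1f} MB/s, {fast:.1f} MB/s, ratio {ratio:.1f}")
# prints: 2.6 MB/s, 40.7 MB/s, ratio 15.6
```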

This smells like there's just something wrong with that disk --
except, as noted, hdparm shows the same configuration and reads
from the raw device /dev/hde show the same times as the other disks.
And no errors are being written to any logfiles, either.  So, out
of curiosity, I copied and compared between the first 120GB disk,
/mnt/b1, and a different logical partition on the same drive as
/home.  The result:  fast reads, with times similar to those
copying/comparing between the 120GB drives.  I checked another one
of the logical partitions and got the same result.  It's not the
drive that's slow -- it's only the partition holding /home that's
slow, by a factor of about 16.

Furthermore, it seems to depend on *where* in the partition.  As
a final test, I copied the same 651 MB file (the ISO) from /mnt/b1 back
onto /home, in a different location, and compared the two.  The result?
Slow -- 60 seconds -- but not *as* slow as copies from and compares
against the original.
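
One way to separate "slow region of the disk" from "slow filesystem"
would be to read the partition's block device directly at several
offsets, bypassing ReiserFS entirely.  The following is only a sketch
under assumptions:  the function name, the sample count and the 16 MB
span are made up for illustration, reading a device such as /dev/hde10
needs root, and cached data can inflate the numbers.

```python
import os
import time


def sample_read_rates(path, samples=8, span=16 * 1024 * 1024,
                      chunk=1024 * 1024):
    """Time `span` bytes of sequential reads at evenly spaced offsets.

    Returns a list of (offset_bytes, MB_per_sec) tuples.
    """
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end reports the size of files and block devices.
        size = os.lseek(fd, 0, os.SEEK_END)
        span = min(span, size)
        step = (size - span) // max(samples - 1, 1)
        for i in range(samples):
            offset = i * step
            os.lseek(fd, offset, os.SEEK_SET)
            done = 0
            start = time.perf_counter()
            while done < span:
                data = os.read(fd, min(chunk, span - done))
                if not data:  # unexpected end of data
                    break
                done += len(data)
            elapsed = time.perf_counter() - start
            rate = ((done / (1024 * 1024)) / elapsed
                    if elapsed > 0 else float("inf"))
            results.append((offset, rate))
    finally:
        os.close(fd)
    return results
```

If the raw rates are uniform across the partition while file reads are
slow, that would point at the filesystem layout rather than the media.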

So to summarize, disk reads on my /home partition are a factor of 4 to
16 slower than reads from other partitions on that same disk, or other
disks.  The same file can generate different (slow) read speeds depending
on where it's located in the partition.  No disk errors are being logged.
This resembles disk fragmentation as I used to encounter it on FAT
filesystems; but it was my (perhaps faulty) understanding that Linux
FSes weren't so susceptible to that sort of thing.
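
A rough way to test the fragmentation guess would be to count how many
contiguous runs of disk blocks a file occupies.  The sketch below uses
the Linux FIBMAP and FIGETBSZ ioctls, which need root.  Whether ReiserFS
returns useful answers for every block (packed tails, for instance) is an
assumption I have not verified, and the function names are illustrative.

```python
import fcntl
import os
import struct

FIBMAP = 1    # ioctl: map a logical file block to a physical block
FIGETBSZ = 2  # ioctl: get the filesystem block size


def count_extents(blocks):
    """Count runs of consecutive physical block numbers.

    A block number of 0 (hole or unmapped block) is skipped and ends
    the current run.
    """
    extents = 0
    prev = None
    for b in blocks:
        if b == 0:
            prev = None
            continue
        if prev is None or b != prev + 1:
            extents += 1
        prev = b
    return extents


def physical_blocks(path):
    """Yield the physical block number of each block of a file (root only)."""
    with open(path, "rb") as f:
        fd = f.fileno()
        raw = fcntl.ioctl(fd, FIGETBSZ, struct.pack("i", 0))
        block_size = struct.unpack("i", raw)[0]
        nblocks = (os.fstat(fd).st_size + block_size - 1) // block_size
        for i in range(nblocks):
            raw = fcntl.ioctl(fd, FIBMAP, struct.pack("i", i))
            yield struct.unpack("i", raw)[0]
```

Run as root, count_extents(physical_blocks(path)) on the slow original
and on a fast copy would show whether the slow file is split into many
more pieces.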

Any advice on how I can figure out what's going on, and how to fix it,
would be greatly appreciated.

Thanks muchly.


-- 
Chris Metzler			cmetzler@speakeasy.snip-me.net
		(remove "snip-me." to email)

"As a child I understood how to give; I have forgotten this grace since I
have become civilized." - Chief Luther Standing Bear


