Hi,
I use "lvm2 on software raid" on an Opteron fileserver. The machine is
running the amd64 port, but I don't think that is part of the problem. I
am simply wondering if anyone else has noticed this behaviour, since it
sounds like a bug to me (as in bug vs. feature).
Here is the setup:
kernel: Linux dax 2.6.15.4 (custom built)
> cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdc1[2](S) sdb1[1]
96256 blocks [2/2] [UU]
md1 : active raid1 sda2[0] sdc2[2](S) sdb2[1]
156143680 blocks [2/2] [UU]
md0 is the boot partition while md1 is the only lvm physical volume. sdc
is only used as a spare disk. sdd is a unit from a hardware RAID controller
and is not relevant to the current problem.
The odd thing looks like this:
> iostat 1
...
avg-cpu: %user %nice %sys %iowait %idle
0.00 0.00 0.00 0.00 100.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 3.00 0.00 144.00 0 144
sdb 3.00 0.00 144.00 0 144
sdc 2.00 0.00 16.00 0 16
md1 1.00 0.00 128.00 0 128
sdd 7.00 0.00 80.00 0 80
md0 0.00 0.00 0.00 0 0
...
As you can see, there is a write to the spare disk (sdc). It is not
much, so I guess it is only the event counter being updated. But in my
opinion, this is a problem.
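If it really is the event counter, that can be confirmed by reading the
md superblock on the spare before and after one of these writes. A
minimal sketch (mdadm being installed and /dev/sdc1 being the spare's
partition are assumptions based on the setup above; run as root):

```shell
# Extract the "Events" value from `mdadm --examine` output on stdin,
# so the counter can be compared across runs.
parse_events() {
    awk '/Events/ {print $NF}'
}

# On the real system one would run, e.g.:
#   mdadm --examine /dev/sdc1 | parse_events
# mdadm --examine prints a line like "         Events : 0.1234";
# if that number keeps climbing on the spare, the superblock is
# indeed being rewritten there.
```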
Why is this a problem? On this system I do not expect continuous use of
the drives, so I will see a start/stop cycle a few times a day. That is
fine, but I bet the motor will be the first thing to fail on those
drives, which is exactly why I want a hot spare. Now, if the event
counter is updated on the spare as well, it will spin up and down along
with the others, and so will likely fail around the same time. What I
don't understand is why an event counter is needed on a spare disk at
all, since it is not supposed to be used unless something goes
wrong... Did I miss something?
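One way to see whether the spare really does spin up along with the
others is to poll its power state. A sketch, assuming hdparm is
installed (the device name comes from the setup above; run as root):

```shell
# Extract the power state ("active/idle" vs "standby") from
# `hdparm -C` output on stdin.
parse_state() {
    awk -F': *' '/drive state/ {print $2}'
}

# On the real system, e.g.:
#   hdparm -C /dev/sdc | parse_state
# hdparm -C prints a line like " drive state is:  active/idle";
# if the spare never reads "standby" despite the spin-down timer,
# the periodic superblock writes are keeping it awake.
```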
thanks
jacques
PS: I know that a quick workaround would be to completely disable the
spin-down timer on every drive, but that is only hiding the dust under
the carpet.