
RE: RAID1 problem - server freezes on md data-check



The total lockup sounds like a problem that one of the software's
developers might be able to help with (I am reminded of an Ubuntu bug
where checkarray would completely freeze or reboot certain systems on
Linux 2.6.24 or so). Bugs aside, the checkarray function is definitely
a pain on a production system.

You can try swapping the disks so that both of them run at 3.0 Gbps;
this may speed up the process. Otherwise I would suggest checking out
the help for /usr/share/mdadm/checkarray and modifying the system cron
job (see /etc/cron.d/mdadm) so the checks are timed and staggered per
array the way you like.
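
As a rough illustration of the staggering idea, here is a minimal Python
sketch that prints one cron line per array, each on a different night.
The helper name, the 01:00 start time and the one-array-per-weekday
layout are my own assumptions, not anything shipped with mdadm, and the
checkarray flags shown (--cron, --idle, --quiet) should be confirmed
against the script's own help before the output goes anywhere near
/etc/cron.d/mdadm.

```python
# Hypothetical helper: builds /etc/cron.d-style lines that check one md
# array per night instead of all arrays in a single run.
CHECKARRAY = "/usr/share/mdadm/checkarray"


def staggered_cron_lines(arrays, hour=1, minute=0, first_weekday=0):
    """Return one cron line per array, each on a different weekday.

    Weekday 0 is Sunday in cron syntax. At most seven arrays fit into
    this simple one-per-day layout.
    """
    if len(arrays) > 7:
        raise ValueError("more than seven arrays; use a different layout")
    lines = []
    for offset, name in enumerate(arrays):
        weekday = (first_weekday + offset) % 7
        # Fields: minute hour day-of-month month weekday user command
        lines.append(
            f"{minute} {hour} * * {weekday} root "
            f"{CHECKARRAY} --cron --idle --quiet /dev/{name}"
        )
    return lines


if __name__ == "__main__":
    for line in staggered_cron_lines(["md0", "md1", "md2", "md3"]):
        print(line)
```

Note that these lines would run weekly; if you only want a monthly check
you would still need a day-of-month test in front of the command.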

Cheers

---
Ross Halliday
Network Operations
WTC Communications



> -----Original Message-----
> From: George Chelidze [mailto:wrath@geo.net.ge]
> Sent: Monday, January 04, 2010 2:50 AM
> To: debian-isp@lists.debian.org
> Subject: RAID1 problem - server freezes on md data-check
> 
> Hello,
> 
> I've got an HP ML110 Intel Dual-Core E2160 server with 2 HDDs:
> 
> GB0250EAFYK - HP 250GB 3G SATA 7.2K 3.5" MDL 250 GB SATA Hard Drive
> GB0250C8045 - HP 250GB 7.2K SATA Hard Disk Drive
> 
> So, I use SATA 3.0 Gb/s and SATA 1.5 Gb/s drives for the RAID-1
> configuration. I have configured 4 MD volumes and it runs fine for
> some time; however, every now and then the server freezes. At that
> time I can ping the server from the network, but I can't ssh into the
> server, and even the keyboard is useless, so I have to hard reset the
> server. Below are the last messages from my kern.log:
> 
> Jan  3 00:57:01 barambo1 kernel: [986475.159596] md: data-check of RAID array md0
> Jan  3 00:57:01 barambo1 kernel: [986475.159600] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> Jan  3 00:57:01 barambo1 kernel: [986475.159602] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
> Jan  3 00:57:01 barambo1 kernel: [986475.159606] md: using 128k window, over a total of 3903680 blocks.
> Jan  3 00:57:01 barambo1 kernel: [986475.162041] md: delaying data-check of md1 until md0 has finished (they share one or more physical units)
> Jan  3 00:57:01 barambo1 kernel: [986475.164449] md: delaying data-check of md2 until md0 has finished (they share one or more physical units)
> Jan  3 00:57:01 barambo1 kernel: [986475.164455] md: delaying data-check of md1 until md2 has finished (they share one or more physical units)
> Jan  3 00:57:01 barambo1 kernel: [986475.166695] md: delaying data-check of md3 until md0 has finished (they share one or more physical units)
> Jan  3 00:57:01 barambo1 kernel: [986475.166699] md: delaying data-check of md1 until md3 has finished (they share one or more physical units)
> Jan  3 00:57:01 barambo1 kernel: [986475.166705] md: delaying data-check of md2 until md3 has finished (they share one or more physical units)
> Jan  3 00:58:13 barambo1 kernel: [986547.257883] md: md0: data-check done.
> Jan  3 00:58:13 barambo1 kernel: [986547.276663] md: delaying data-check of md1 until md3 has finished (they share one or more physical units)
> Jan  3 00:58:13 barambo1 kernel: [986547.276668] md: data-check of RAID array md3
> Jan  3 00:58:13 barambo1 kernel: [986547.276671] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> Jan  3 00:58:13 barambo1 kernel: [986547.276674] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
> Jan  3 00:58:13 barambo1 kernel: [986547.276678] md: using 128k window, over a total of 122126016 blocks.
> Jan  3 00:58:13 barambo1 kernel: [986547.276681] md: delaying data-check of md2 until md3 has finished (they share one or more physical units)
> 
> The OS is Debian 5.0.3 Lenny stable with the
> linux-image-2.6.30-bpo.2-686 kernel. I had the same results with the
> stock linux-image-2.6.26-2-686 kernel. My basic question is: can this
> happen because I use 2 different drives? I have a chance to replace
> GB0250C8045 with GB0250EAFYK or GB0250EAFYK with GB0250C8045 and have
> 2 identical drives. Is it a good idea and will it solve my problem?
> 
> Thank you in advance for any input,
> 
> Best Regards,
> 
> George Chelidze

