
raid1 issue, somewhat related to recent "debian on big machines"



Lupus in fabula (speak of the devil): this is a follow-up to my short
comment about raid1 on my machine in the thread "Debian on big systems".

System: Supermicro H8QC8 motherboard, two WD Raptor 150GB SATA disks,
Debian amd64 lenny, raid1

While running an electronic molecular calculation - estimated to take
four days - I noticed by chance on the screen (not in the output file
of the calculation) that there was a disk problem. I took some
scattered notes from the screen:

RAID1 conf printout

wd: 1 rd:2

disk0 wd:1 o:0 dev: sda6

disk0 wd:1 o:0 dev: sdb6

md: recovery of raid array md4

minimum guaranteed speed 1000 kB/sec/disk

using max available idle I/O bandwidth but no more than 200000

..........................

Disk failure on sda1, disabling device

Operation continues on 1 devices.

raid sdb1: redirecting sector 262176 to another mirror

RAID1 conf printout

wd:1 rd:2
..........................

disk1, wd:0 0:1 dev:sdb7
=================

Then, the electronic molecular calculation resumed - with all CPUs at
work, as indicated by top - and in its output file there was no trace
of the above problems.

Command:

lshw -class disk

reported:

  *-cdrom
       description: DVD writer
       product: PIONEER DVD-RW DVR-111D
       vendor: Pioneer
       physical id: 0
       bus info: ide@0.0
       logical name: /dev/hda
       version: 1.02
       capabilities: packet atapi cdrom removable nonmagnetic dma lba iordy pm audio cd-r cd-rw dvd dvd-r
       configuration: mode=udma4 status=nodisc
  *-disk:0
       description: SCSI Disk
       physical id: 0
       bus info: scsi@0:0.0.0
       logical name: /dev/sda
       size: 139GiB (150GB)
  *-disk:1
       description: ATA Disk
       product: WDC WD1500ADFD-0
       vendor: Western Digital
       physical id: 1
       bus info: scsi@1:0.0.0
       logical name: /dev/sdb
       version: 20.0
       serial: WD-WMAP41173675
       size: 139GiB (150GB)
       capabilities: partitioned partitioned:dos
       configuration: ansiversion=5 signature=000b05ba

The description of disk 0 (/dev/sda) is cryptic to me: unlike disk 1,
it shows no product, vendor, or serial.
====================

As there have been RAM problems in the past, I also ran

lshw -class memory

All DIMMs are correctly reported, so there is no memory problem.
===============

Then I ran:

cat /proc/mdstat

the output was:

Personalities : [raid1]
md6 : active raid1 sda8[2](F) sdb8[1]
      102341952 blocks [2/1] [_U]

md5 : active raid1 sda7[2](F) sdb7[1]
      1951744 blocks [2/1] [_U]

md4 : active raid1 sda6[2](F) sdb6[1]
      2931712 blocks [2/1] [_U]

md3 : active raid1 sda5[2](F) sdb5[1]
      14651136 blocks [2/1] [_U]

md1 : active raid1 sda2[2](F) sdb2[1]
      6835584 blocks [2/1] [_U]

md0 : active raid1 sda1[2](F) sdb1[1]
      2931712 blocks [2/1] [_U]

md2 : active raid1 sda3[2](F) sdb3[1]
      14651200 blocks [2/1] [_U]

unused devices: <none>
===============
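
For readers who want to check the same thing on their own machine, here is a minimal Python sketch of how the /proc/mdstat text above can be read. It is only an illustration: the function names parse_mdstat and degraded are hypothetical, the sample is two of the seven arrays copied from the output above, and the regular expressions assume the layout shown there (they do not handle inactive arrays). In "[2/1] [_U]" the first number is how many members the array should have and the second is how many are working; "(F)" after a member marks it as failed.

```python
import re

# Two of the arrays copied from the /proc/mdstat output in this message.
SAMPLE = """\
Personalities : [raid1]
md6 : active raid1 sda8[2](F) sdb8[1]
      102341952 blocks [2/1] [_U]

md0 : active raid1 sda1[2](F) sdb1[1]
      2931712 blocks [2/1] [_U]

unused devices: <none>
"""

# "md6 : active raid1 sda8[2](F) sdb8[1]"
ARRAY_RE = re.compile(r"^(md\d+) : (\S+) (\S+) (.+)$")
# "[2/1] [_U]" -> expected members / working members / per-slot map
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# "sda8[2](F)" -> device name plus optional flags such as (F)
MEMBER_RE = re.compile(r"(\w+)\[\d+\]((?:\(\w\))*)")


def parse_mdstat(text):
    """Return one dict per md array found in mdstat-style text."""
    arrays = []
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            name, state, level, members = header.groups()
            failed = [dev for dev, flags in MEMBER_RE.findall(members)
                      if "(F)" in flags]
            current = {"name": name, "state": state, "level": level,
                       "failed": failed, "expected": None, "working": None}
            arrays.append(current)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            current["expected"] = int(status.group(1))
            current["working"] = int(status.group(2))
    return arrays


def degraded(arrays):
    """Names of arrays with fewer working members than expected."""
    return [a["name"] for a in arrays
            if a["working"] is not None and a["working"] < a["expected"]]


if __name__ == "__main__":
    for a in parse_mdstat(SAMPLE):
        print(f'{a["name"]}: {a["working"]}/{a["expected"]} members working, '
              f'failed: {", ".join(a["failed"]) or "none"}')
```

On a live system the same function could be fed the contents of /proc/mdstat instead of SAMPLE. Applied to the full output above, every array would come out as degraded with its sda member failed, which matches what the screen messages suggested.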

I would appreciate any advice.
Thanks,

francesco pietra

