
Re: Software RAID broken on Alpha



There is (or was) a compiler bug that causes RAID 1 not to work properly on 
Alpha.  I think it was fixed in gcc 3.3.  You can either upgrade your compiler 
to 3.3, or apply the workaround patch from kernel-2.4.18-27.7.1hp in HP's 
RH-7.2 updates, which follows after the signature line.

This is a very well-known problem.  I am surprised that no one has added it 
to the Debian release notes or an errata list somewhere.

I hope this helps.

Best Regards,


--George


diff -urP old/drivers/md/raid1.c new/drivers/md/raid1.c
--- old/drivers/md/raid1.c      Tue Jun 25 04:01:48 2002
+++ new/drivers/md/raid1.c      Mon Oct  7 15:22:39 2002
@@ -473,6 +473,12 @@
                goto rb_out;
        
 
+#if defined(CONFIG_ALPHA) && (__GNUC__ == 2) && (__GNUC_MINOR__ == 96) 
+       /* Work around a compiler bug in gcc 2.96 20000731
+          (Red Hat Linux 7.1 2.96-102) */
+       new_disk = *(volatile int *)&new_disk;
+#endif
+
        /* make sure that disk is operational */
        while( !conf->mirrors[new_disk].operational) {
                if (new_disk <= 0) new_disk = conf->raid_disks;
@@ -526,6 +532,11 @@
        
        /* Find the disk which is closest */
        
+#if defined(CONFIG_ALPHA) && (__GNUC__ == 2) && (__GNUC_MINOR__ == 96) 
+       /* Work around a compiler bug in gcc 2.96 20000731
+          (Red Hat Linux 7.1 2.96-102) */
+       disk = *(volatile int *)&disk;
+#endif
        do {
                if (disk <= 0)
                        disk = conf->raid_disks;



On Saturday 06 December 2003 08:01 am, Joerg Hoh wrote:
> Hi
>
> I have an AlphaServer 1000 here with some SCSI disks. I run 2 of them as
> RAID 1. I get this error on every boot. I now have kernel 2.4.23, but it
> also happened on 2.4.22.
>
> The kernel detects the drives correctly:
>
> Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
> Attached scsi disk sdc at scsi0, channel 0, id 2, lun 0
>
> and loads the md drivers without problems:
>
> md: raid1 personality registered as nr 3
> md: raid5 personality registered as nr 4
> raid5: measuring checksumming speed
>    8regs     :   376.832 MB/sec
>    32regs    :   598.016 MB/sec
>    alpha     :   688.128 MB/sec
>    alpha prefetch:   598.016 MB/sec
> raid5: using function: alpha (688.128 MB/sec)
> md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
> md: Autodetecting RAID arrays.
> md: autorun ...
> md: ... autorun DONE.
>
>
> But when the driver tries to set up the RAID, strange things happen:
>
> md: autorun ...
> md: considering sdc1 ...
> md:  adding sdc1 ...
> md:  adding sdb1 ...
> md: created md0
> md: bind<sdb1,1>
> md: bind<sdc1,2>
> md: running: <sdc1><sdb1>
> md: sdc1's event counter: 00000023
> md: sdb1's event counter: 00000023
> md: device name has changed from sdc1 to sdb1 since last import!
> md: bug in file md.c, line 1486
>
> md:     **********************************
> md:     * <COMPLETE RAID STATE PRINTOUT> *
> md:     **********************************
> md0: <sdc1><sdb1> array superblock:
> md:  SB: (V:0.90.0) ID:<f90cc42f.ece4fdac.fb1c4bae.6739380d> CT:3ee8ce2d
> md:     L1 S04184832 ND:1 RD:2 md0 LO:0 CS:4096
> md:     UT:3f999ce0 ST:0 AD:1 WD:1 FD:0 SD:0 CSUM:00005bf4 E:00000023
>      D  0:  DISK<N:0,[dev 00:00](0,0),R:0,S:8>
>      D  1:  DISK<N:1,sdb1(8,17),R:1,S:6>
>      D  2:  DISK<N:2,[dev 00:00](0,0),R:2,S:9>
> md:     THIS:  DISK<N:1,sdc1(8,33),R:1,S:6>
> md: rdev sdc1: O:sdc1, SZ:00000000 F:0 DN:1 <6>md: rdev superblock:
> md:  SB: (V:0.90.0) ID:<f90cc42f.ece4fdac.fb1c4bae.6739380d> CT:3ee8ce2d
> md:     L1 S04184832 ND:1 RD:2 md0 LO:0 CS:4096
> md:     UT:3f999ce0 ST:0 AD:1 WD:1 FD:0 SD:0 CSUM:00005bf4 E:00000023
>      D  0:  DISK<N:0,[dev 00:00](0,0),R:0,S:8>
>      D  1:  DISK<N:1,sdc1(8,33),R:1,S:6>
>      D  2:  DISK<N:2,[dev 00:00](0,0),R:2,S:9>
> md:     THIS:  DISK<N:1,sdc1(8,33),R:1,S:6>
> md: rdev sdb1: O:sdc1, SZ:00000000 F:0 DN:1 <6>md: rdev superblock:
> md:  SB: (V:0.90.0) ID:<f90cc42f.ece4fdac.fb1c4bae.6739380d> CT:3ee8ce2d
> md:     L1 S04184832 ND:1 RD:2 md0 LO:0 CS:4096
> md:     UT:3f999ce0 ST:0 AD:1 WD:1 FD:0 SD:0 CSUM:00005bf4 E:00000023
>      D  0:  DISK<N:0,[dev 00:00](0,0),R:0,S:8>
>      D  1:  DISK<N:1,sdc1(8,33),R:1,S:6>
>      D  2:  DISK<N:2,[dev 00:00](0,0),R:2,S:9>
> md:     THIS:  DISK<N:1,sdb1(8,17),R:1,S:6>
> md:     **********************************
>
> md: bug in file md.c, line 1650
>
> md:     **********************************
> md:     * <COMPLETE RAID STATE PRINTOUT> *
> md:     **********************************
> md0: <sdc1><sdb1> array superblock:
> md:  SB: (V:0.90.0) ID:<f90cc42f.ece4fdac.fb1c4bae.6739380d> CT:3ee8ce2d
> md:     L1 S04184832 ND:1 RD:2 md0 LO:0 CS:4096
> md:     UT:3f999ce0 ST:0 AD:1 WD:1 FD:0 SD:0 CSUM:00005bf4 E:00000023
>      D  0:  DISK<N:0,[dev 00:00](0,0),R:0,S:8>
>      D  1:  DISK<N:1,sdb1(8,17),R:1,S:6>
>      D  2:  DISK<N:2,[dev 00:00](0,0),R:2,S:9>
> md:     THIS:  DISK<N:1,sdc1(8,33),R:1,S:6>
> md: rdev sdc1: O:sdc1, SZ:00000000 F:0 DN:1 <6>md: rdev superblock:
> md:  SB: (V:0.90.0) ID:<f90cc42f.ece4fdac.fb1c4bae.6739380d> CT:3ee8ce2d
> md:     L1 S04184832 ND:1 RD:2 md0 LO:0 CS:4096
> md:     UT:3f999ce0 ST:0 AD:1 WD:1 FD:0 SD:0 CSUM:00005bf4 E:00000023
>      D  0:  DISK<N:0,[dev 00:00](0,0),R:0,S:8>
>      D  1:  DISK<N:1,sdc1(8,33),R:1,S:6>
>      D  2:  DISK<N:2,[dev 00:00](0,0),R:2,S:9>
> md:     THIS:  DISK<N:1,sdc1(8,33),R:1,S:6>
> md: rdev sdb1: O:sdc1, SZ:00000000 F:0 DN:1 <6>md: rdev superblock:
> md:  SB: (V:0.90.0) ID:<f90cc42f.ece4fdac.fb1c4bae.6739380d> CT:3ee8ce2d
> md:     L1 S04184832 ND:1 RD:2 md0 LO:0 CS:4096
> md:     UT:3f999ce0 ST:0 AD:1 WD:1 FD:0 SD:0 CSUM:00005bf4 E:00000023
>      D  0:  DISK<N:0,[dev 00:00](0,0),R:0,S:8>
>      D  1:  DISK<N:1,sdc1(8,33),R:1,S:6>
>      D  2:  DISK<N:2,[dev 00:00](0,0),R:2,S:9>
> md:     THIS:  DISK<N:1,sdb1(8,17),R:1,S:6>
> md:     **********************************
>
> md :do_md_run() returned -22
> md: md0 stopped.
> md: unbind<sdc1,1>
> md: export_rdev(sdc1)
> md: unbind<sdb1,0>
> md: export_rdev(sdb1)
> md: ... autorun DONE.
>
>
> I don't know how this came about (someone from SuSE told me that they tested
> the md code thoroughly on many platforms, but he doesn't know whether Alpha
> was among them).
>
> It could be a broken hard drive, but I don't think so.
>
> Jörg


