Hi,

Did you have time to look into this?

I have been looking at the kernel messages from before and after the BUG, and I think I see some interesting things. The kernel messages for the day of the BUG are attached.

Before the BUG I see:

- groups of "I/O error, dev cciss/c1d3" followed by "read error corrected",
- just before the BUG, a "read error NOT corrected", then "Disk failure on cciss/c1d3p1, disabling device." and "Operation continuing on 5 devices."

After the BUG:

- there are almost 6 hours without kernel messages, and then "I/O error, dev cciss/c1d3" again, on sectors 64, 128 and 272. Do these sectors belong to the version 1.2 metadata superblock, or are they the beginning of user data?
- another 20 hours without kernel messages, until the reset for a forced reboot.

I am doing my best to reproduce the BUG, but so far to no avail.

Jose Calhariz

On Tue, Jun 05, 2012 at 12:29:11PM -0500, Jonathan Nieder wrote:
> Jose Manuel dos Santos Calhariz wrote:
>
> > I have 5 systems with a similar setup and only one failed, maybe
> > because of the failing disk. I will use one of the systems to try to
> > reproduce the bug, before trying a new kernel.
>
> Nice, thanks.
>
> [...]
> > In an attach is the boot log of kernel that gave the BUG,
> > 2.6.32-41squeeze2. Now the machine is running 2.6.32-45.
>
> Perfect. This should make it much easier for someone to analyze the
> trace in a quieter moment.
>
> We are now in the pre-freeze frenzy :), but I would be happy to look
> at this more closely some time soon (e.g., this coming weekend).
> Please feel free to ping me if I don't respond by then.
>
> Hope that helps,
> Jonathan

--
"As much in my football life as in my human-being life..."
-- Nunes, former Flamengo striker, in an interview before Zico's farewell match
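The question about sectors 64, 128 and 272 can be made concrete with a small sketch. It relies on two things: metadata version 1.2 keeps its superblock 4 KiB from the start of the member device, and the sector/"logical block" pairs in the log (128 and 16, 64 and 8, 272 and 34) are consistent with 4096-byte blocks. The errors are reported against the whole disk cciss/c1d3, not the partition c1d3p1, so the answer depends on where the partition starts and on the array's data offset. The `PARTITION_START` and `DATA_OFFSET` values below are placeholders and were not read from this machine; the real ones would come from the partition table and from the "Data Offset" field of `mdadm --examine`.

```python
SECTOR_BYTES = 512
# Metadata 1.2 places the superblock 4 KiB into the member device,
# i.e. at sector 8 counted from the start of the partition.
SUPERBLOCK_START = 4096 // SECTOR_BYTES


def logical_block(sector, block_bytes=4096):
    """Convert a 512-byte sector number to a block number.

    With 4096-byte blocks this reproduces the pairs seen in the log
    (sector 128 -> block 16, 64 -> 8, 272 -> 34).
    """
    return sector * SECTOR_BYTES // block_bytes


def classify(disk_sector, partition_start, data_offset):
    """Place a whole-disk sector relative to an md member partition.

    partition_start: first sector of the partition on the disk.
    data_offset: sectors from the partition start to the array data.
    Both are HYPOTHETICAL inputs in this sketch.
    """
    rel = disk_sector - partition_start
    if rel < 0:
        return "outside the partition"
    if rel < SUPERBLOCK_START:
        return "reserved space before the superblock"
    if rel < data_offset:
        return "metadata area (superblock and padding before the data)"
    return "array data"


if __name__ == "__main__":
    # Placeholder geometry, NOT taken from the failing machine.
    PARTITION_START = 63
    DATA_OFFSET = 2048
    for sector in (64, 128, 272):
        print(sector, logical_block(sector),
              classify(sector, PARTITION_START, DATA_OFFSET))
```

With the real partition start and data offset substituted, the same function would tell whether the three failing sectors fall in the metadata area or in the array data.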
Jun 3 00:57:01 afs04 kernel: md: data-check of RAID array md2 Jun 3 00:57:01 afs04 kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk. Jun 3 00:57:01 afs04 kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check. Jun 3 00:57:01 afs04 kernel: md: using 128k window, over a total of 244162304 blocks. Jun 3 00:57:01 afs04 kernel: md: data-check of RAID array md3 Jun 3 00:57:01 afs04 kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk. Jun 3 00:57:01 afs04 kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check. Jun 3 00:57:01 afs04 kernel: md: using 128k window, over a total of 244162304 blocks. Jun 3 00:57:19 afs04 kernel: cciss: cmd f6000250 has CHECK CONDITION sense key = 0x3 Jun 3 00:57:19 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 208888 Jun 3 00:57:19 afs04 kernel: __ratelimit: 6 callbacks suppressed Jun 3 00:57:19 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208856 on cciss/c1d3p1) Jun 3 00:57:19 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208864 on cciss/c1d3p1) Jun 3 00:57:19 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208872 on cciss/c1d3p1) Jun 3 00:57:19 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208880 on cciss/c1d3p1) Jun 3 00:57:20 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208888 on cciss/c1d3p1) Jun 3 00:57:20 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208896 on cciss/c1d3p1) Jun 3 00:57:20 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208904 on cciss/c1d3p1) Jun 3 00:57:20 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208912 on cciss/c1d3p1) Jun 3 00:57:20 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208920 on cciss/c1d3p1) Jun 3 00:57:20 afs04 kernel: raid5:md2: read error corrected (8 sectors at 208928 on cciss/c1d3p1) Jun 3 00:57:30 afs04 kernel: cciss: cmd f6000940 has CHECK CONDITION 
sense key = 0x3 Jun 3 00:57:30 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 403992 Jun 3 00:57:31 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 403960 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 403976 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 403968 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 403984 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 403992 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 404000 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 404008 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 404016 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 404024 on cciss/c1d3p1) Jun 3 00:57:31 afs04 kernel: raid5:md2: read error corrected (8 sectors at 404032 on cciss/c1d3p1) Jun 3 00:57:46 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 00:57:46 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 682880 Jun 3 00:57:46 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 00:57:46 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682848 on cciss/c1d3p1) Jun 3 00:57:46 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682864 on cciss/c1d3p1) Jun 3 00:57:46 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682856 on cciss/c1d3p1) Jun 3 00:57:46 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682872 on cciss/c1d3p1) Jun 3 00:57:48 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682880 on cciss/c1d3p1) Jun 3 00:57:48 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682888 on cciss/c1d3p1) Jun 3 00:57:48 
afs04 kernel: raid5:md2: read error corrected (8 sectors at 682896 on cciss/c1d3p1) Jun 3 00:57:48 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682904 on cciss/c1d3p1) Jun 3 00:57:48 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682912 on cciss/c1d3p1) Jun 3 00:57:48 afs04 kernel: raid5:md2: read error corrected (8 sectors at 682920 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 00:57:54 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 710464 Jun 3 00:57:54 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710432 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710440 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710448 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710456 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710464 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710472 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710480 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710488 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710496 on cciss/c1d3p1) Jun 3 00:57:54 afs04 kernel: raid5:md2: read error corrected (8 sectors at 710504 on cciss/c1d3p1) Jun 3 00:58:33 afs04 kernel: cciss: cmd f6000940 has CHECK CONDITION sense key = 0x3 Jun 3 00:58:33 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 1506040 Jun 3 00:58:35 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506008 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error 
corrected (8 sectors at 1506024 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506016 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506032 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506040 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506048 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506056 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506064 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506072 on cciss/c1d3p1) Jun 3 00:58:35 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1506080 on cciss/c1d3p1) Jun 3 00:58:40 afs04 kernel: cciss: cmd f6000250 has CHECK CONDITION sense key = 0x3 Jun 3 00:58:40 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 1561408 Jun 3 00:58:40 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 00:58:40 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561376 on cciss/c1d3p1) Jun 3 00:58:40 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561384 on cciss/c1d3p1) Jun 3 00:58:40 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561392 on cciss/c1d3p1) Jun 3 00:58:40 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561400 on cciss/c1d3p1) Jun 3 00:58:41 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561408 on cciss/c1d3p1) Jun 3 00:58:41 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561416 on cciss/c1d3p1) Jun 3 00:58:41 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561424 on cciss/c1d3p1) Jun 3 00:58:41 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561432 on cciss/c1d3p1) Jun 3 00:58:41 afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561440 on cciss/c1d3p1) Jun 3 00:58:41 
afs04 kernel: raid5:md2: read error corrected (8 sectors at 1561448 on cciss/c1d3p1) Jun 3 00:59:35 afs04 kernel: cciss: cmd f6000940 has CHECK CONDITION sense key = 0x3 Jun 3 00:59:35 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 3122808 Jun 3 00:59:36 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122776 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122784 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122792 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122800 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122808 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122816 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122824 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122832 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122840 on cciss/c1d3p1) Jun 3 00:59:36 afs04 kernel: raid5:md2: read error corrected (8 sectors at 3122848 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 01:01:32 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 5826048 Jun 3 01:01:32 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826016 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826024 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826032 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826040 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read 
error corrected (8 sectors at 5826048 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826056 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826064 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826072 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826080 on cciss/c1d3p1) Jun 3 01:01:32 afs04 kernel: raid5:md2: read error corrected (8 sectors at 5826088 on cciss/c1d3p1) Jun 3 01:03:05 afs04 kernel: cciss: cmd f6000de0 has CHECK CONDITION sense key = 0x3 Jun 3 01:03:05 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 8126112 Jun 3 01:03:07 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126080 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126088 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126104 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126096 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126112 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126120 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126128 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126136 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126144 on cciss/c1d3p1) Jun 3 01:03:07 afs04 kernel: raid5:md2: read error corrected (8 sectors at 8126152 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: cciss: cmd f6000940 has CHECK CONDITION sense key = 0x3 Jun 3 01:04:51 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 11010696 Jun 3 01:04:51 afs04 kernel: 
__ratelimit: 21 callbacks suppressed Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010664 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010672 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010680 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010688 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010696 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010704 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010712 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010720 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010728 on cciss/c1d3p1) Jun 3 01:04:51 afs04 kernel: raid5:md2: read error corrected (8 sectors at 11010736 on cciss/c1d3p1) Jun 3 01:22:14 afs04 kernel: cciss: cmd f6001970 has CHECK CONDITION sense key = 0x3 Jun 3 01:22:14 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 46048536 Jun 3 01:22:16 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048512 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048520 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048504 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048528 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048536 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048544 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048552 on cciss/c1d3p1) Jun 3 
01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048560 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048568 on cciss/c1d3p1) Jun 3 01:22:16 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46048576 on cciss/c1d3p1) Jun 3 01:22:21 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 01:22:21 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 46103096 Jun 3 01:22:22 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103064 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103080 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103072 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103088 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103096 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103104 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103112 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103120 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103128 on cciss/c1d3p1) Jun 3 01:22:22 afs04 kernel: raid5:md2: read error corrected (8 sectors at 46103136 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: cciss: cmd f6000de0 has CHECK CONDITION sense key = 0x3 Jun 3 01:26:02 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 52402552 Jun 3 01:26:02 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402520 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402528 on cciss/c1d3p1) Jun 3 01:26:02 
afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402536 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402544 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402552 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402560 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402568 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402576 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402584 on cciss/c1d3p1) Jun 3 01:26:02 afs04 kernel: raid5:md2: read error corrected (8 sectors at 52402592 on cciss/c1d3p1) Jun 3 01:35:50 afs04 kernel: cciss: cmd f60006f0 has CHECK CONDITION sense key = 0x3 Jun 3 01:35:50 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 73343248 Jun 3 01:35:53 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 01:35:53 afs04 kernel: raid5:md2: read error corrected (8 sectors at 73343216 on cciss/c1d3p1) Jun 3 01:35:53 afs04 kernel: raid5:md2: read error corrected (8 sectors at 73343224 on cciss/c1d3p1) Jun 3 01:35:53 afs04 kernel: raid5:md2: read error corrected (8 sectors at 73343232 on cciss/c1d3p1) Jun 3 01:35:53 afs04 kernel: raid5:md2: read error corrected (8 sectors at 73343240 on cciss/c1d3p1) Jun 3 01:35:56 afs04 kernel: cciss: cmd f6000de0 has CHECK CONDITION sense key = 0x3 Jun 3 01:35:56 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 73343280 Jun 3 01:35:56 afs04 kernel: raid5:md2: read error NOT corrected!! (sector 73343248 on cciss/c1d3p1). Jun 3 01:35:56 afs04 kernel: raid5: Disk failure on cciss/c1d3p1, disabling device. Jun 3 01:35:56 afs04 kernel: raid5: Operation continuing on 5 devices. Jun 3 01:35:56 afs04 kernel: raid5:md2: read error NOT corrected!! (sector 73343256 on cciss/c1d3p1). 
Jun 3 01:35:56 afs04 kernel: raid5:md2: read error NOT corrected!! (sector 73343264 on cciss/c1d3p1).
Jun 3 01:35:56 afs04 kernel: raid5:md2: read error NOT corrected!! (sector 73343272 on cciss/c1d3p1).
Jun 3 01:35:56 afs04 kernel: raid5:md2: read error NOT corrected!! (sector 73343280 on cciss/c1d3p1).
Jun 3 01:35:56 afs04 kernel: raid5:md2: read error NOT corrected!! (sector 73343288 on cciss/c1d3p1).
Jun 3 01:35:56 afs04 kernel: ------------[ cut here ]------------
Jun 3 01:35:56 afs04 kernel: kernel BUG at /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_none/drivers/md/raid5.c:2764!
Jun 3 01:35:56 afs04 kernel: invalid opcode: 0000 [#1] SMP
Jun 3 01:35:56 afs04 kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:01.0/cciss0/c0d0/block/cciss!c0d0/removable
Jun 3 01:35:56 afs04 kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs ext4 jbd2 crc16 openafs(P) lp parport_pc parport joydev st sd_mod crc_t10dif ext2 loop tun xt_multiport xfs exportfs 8021q garp stp ip6table_filter ip6_tables iptable_filter ip_tables x_tables ide_generic ide_gd_mod ide_cd_mod ide_core snd_pcm snd_timer hpilo snd soundcore snd_page_alloc hpwdt e752x_edac shpchp rng_core i6300esb edac_core pci_hotplug pcspkr container processor evdev button psmouse serio_raw ext3 jbd mbcache dm_mod raid456 md_mod async_raid6_recov async_pq usbhid hid raid6_pq async_xor xor async_memcpy async_tx sg sr_mod cdrom ata_generic thermal uhci_hcd cciss tg3 floppy ata_piix ehci_hcd libata e1000 usbcore libphy scsi_mod nls_base thermal_sys [last unloaded: openafs]
Jun 3 01:35:56 afs04 kernel:
Jun 3 01:35:56 afs04 kernel: Pid: 743, comm: md2_raid6 Tainted: P (2.6.32-5-686 #1) ProLiant DL360 G4
Jun 3 01:35:56 afs04 kernel: EIP: 0060:[<f818c811>] EFLAGS: 00010297 CPU: 3
Jun 3 01:35:56 afs04 kernel: EIP is at handle_stripe+0x89d/0x173e [raid456]
Jun 3 01:35:56 afs04 kernel: EAX: 00000005 EBX: 00000002 ECX: 00000003 EDX: 00000001
Jun 3 01:35:56 afs04 kernel: ESI: f6394000 EDI: 00000003 EBP: f6394028 ESP: f58d5e6c
Jun 3 01:35:56 afs04 kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jun 3 01:35:56 afs04 kernel: Process md2_raid6 (pid: 743, ti=f58d4000 task=f6569980 task.ti=f58d4000)
Jun 3 01:35:56 afs04 kernel: Stack:
Jun 3 01:35:56 afs04 kernel: e6fde3e6 c2988138 00000006 f61c8e00 00000006 0002d995 00020003 00000000
Jun 3 01:35:56 afs04 kernel: <0> c2988138 f4cbc86c f65699ac 000f0e67 00000000 f639431c 00000005 fffffffc
Jun 3 01:35:56 afs04 kernel: <0> f4cbc86c c1025461 00000000 00000000 00000002 00000005 00988100 c127a45c
Jun 3 01:35:56 afs04 kernel: Call Trace:
Jun 3 01:35:56 afs04 kernel: [<c1025461>] ? check_preempt_wakeup+0x196/0x202
Jun 3 01:35:56 afs04 kernel: [<f818d9fb>] ? raid5d+0x349/0x389 [raid456]
Jun 3 01:35:56 afs04 kernel: [<c103b623>] ? del_timer_sync+0xa/0x14
Jun 3 01:35:56 afs04 kernel: [<c103b6cb>] ? process_timeout+0x0/0x5
Jun 3 01:35:56 afs04 kernel: [<f816206e>] ? md_thread+0xe1/0xf8 [md_mod]
Jun 3 01:35:56 afs04 kernel: [<c104433a>] ? autoremove_wake_function+0x0/0x2d
Jun 3 01:35:56 afs04 kernel: [<f8161f8d>] ? md_thread+0x0/0xf8 [md_mod]
Jun 3 01:35:56 afs04 kernel: [<c1044108>] ? kthread+0x61/0x66
Jun 3 01:35:56 afs04 kernel: [<c10440a7>] ? kthread+0x0/0x66
Jun 3 01:35:56 afs04 kernel: [<c1003d47>] ? kernel_thread_helper+0x7/0x10
Jun 3 01:35:56 afs04 kernel: Code: e9 9b 01 00 00 83 7c 24 7c 02 74 04 0f 0b eb fe f6 46 28 10 c7 46 3c 00 00 00 00 0f 85 7f 01 00 00 8b 44 24 38 39 44 24 70 7d 04 <0f> 0b eb fe 83 7c 24 7c 02 75 20 6b 84 24 a8 00 00 00 78 ff 44
Jun 3 01:35:56 afs04 kernel: EIP: [<f818c811>] handle_stripe+0x89d/0x173e [raid456] SS:ESP 0068:f58d5e6c
Jun 3 01:35:56 afs04 kernel: ---[ end trace b6f4aa295d5e4948 ]---
Jun 3 02:59:50 afs04 kernel: md: md3: data-check done.
Jun 3 06:16:21 afs04 kernel: afs: Lost contact with volume location server 193.136.128.36 in cell ist.utl.pt Jun 3 06:16:21 afs04 kernel: afs: Lost contact with volume location server 193.136.128.36 in cell ist.utl.pt Jun 3 06:17:18 afs04 kernel: afs: Lost contact with file server 193.136.128.36 in cell ist.utl.pt (all multi-homed ip addresses down for the server) Jun 3 06:17:18 afs04 kernel: afs: Lost contact with file server 193.136.128.36 in cell ist.utl.pt (all multi-homed ip addresses down for the server) Jun 3 07:35:21 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:21 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:21 afs04 kernel: __ratelimit: 21 callbacks suppressed Jun 3 07:35:21 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:22 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:22 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:22 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:23 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:23 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:23 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:24 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:24 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:24 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:25 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:25 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:25 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:28 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:28 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 
3 07:35:28 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:28 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:28 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:28 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:29 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:29 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:29 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:30 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:30 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:30 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:31 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:31 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:31 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:32 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:32 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:32 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:33 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:33 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:33 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:33 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:33 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:33 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:34 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:34 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:34 
afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:35 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:35 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 64 Jun 3 07:35:35 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 8 Jun 3 07:35:38 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:38 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 272 Jun 3 07:35:38 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 34 Jun 3 07:35:39 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:39 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 272 Jun 3 07:35:39 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 34 Jun 3 07:35:40 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:40 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:40 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:41 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:41 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:41 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:42 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:42 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:42 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 3 07:35:42 afs04 kernel: cciss: cmd f6000000 has CHECK CONDITION sense key = 0x3 Jun 3 07:35:42 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128 Jun 3 07:35:42 afs04 kernel: Buffer I/O error on device cciss/c1d3, logical block 16 Jun 4 03:19:04 afs04 kernel: klogd 1.5.0#6, log source = /proc/kmsg started. 
Jun 4 03:19:04 afs04 kernel: [ 0.000000] Initializing cgroup subsys cpuset
Jun 4 03:19:04 afs04 kernel: [ 0.000000] Initializing cgroup subsys cpu
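A log of this size is easier to compare across machines when it is reduced to counts. The following is a minimal sketch, not a tool used in this report, that pulls three message types out of the text above: the block-layer I/O errors, the raid5 "read error corrected" lines and the "read error NOT corrected" lines. The function name `summarize` and the `SAMPLE` text are illustrative; the sample lines are copied from the log, and the regular expressions tolerate arbitrary whitespace so they also work on a log whose line breaks were lost.

```python
import re
from collections import Counter

# Patterns for the three message types seen in the log above.
IO_ERROR = re.compile(
    r"end_request:\s+I/O\s+error,\s+dev\s+([^\s,]+),\s+sector\s+(\d+)")
CORRECTED = re.compile(
    r"read\s+error\s+corrected\s+\((\d+)\s+sectors\s+at\s+(\d+)\s+on\s+([^\s)]+)\)")
NOT_CORRECTED = re.compile(
    r"read\s+error\s+NOT\s+corrected!!\s+\(sector\s+(\d+)\s+on\s+([^\s)]+)\)")


def summarize(log_text):
    """Return counts and positions of the disk errors found in log_text."""
    io_errors = Counter(
        (dev, int(sector)) for dev, sector in IO_ERROR.findall(log_text))
    corrected = [
        (dev, int(start), int(count))
        for count, start, dev in CORRECTED.findall(log_text)]
    not_corrected = [
        (dev, int(sector)) for sector, dev in NOT_CORRECTED.findall(log_text)]
    return {
        "io_errors": io_errors,
        "corrected": corrected,
        "not_corrected": not_corrected,
    }


# A few lines copied from the log, used only to exercise the sketch.
SAMPLE = """
Jun 3 01:35:53 afs04 kernel: raid5:md2: read error corrected (8 sectors at 73343240 on cciss/c1d3p1)
Jun 3 01:35:56 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 73343280
Jun 3 01:35:56 afs04 kernel: raid5:md2: read error NOT corrected!! (sector 73343248 on cciss/c1d3p1).
Jun 3 07:35:21 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128
Jun 3 07:35:22 afs04 kernel: end_request: I/O error, dev cciss/c1d3, sector 128
"""

if __name__ == "__main__":
    summary = summarize(SAMPLE)
    for (dev, sector), hits in sorted(summary["io_errors"].items()):
        print(f"{dev} sector {sector}: {hits} I/O error(s)")
    print("corrected ranges:", len(summary["corrected"]))
    print("uncorrected sectors:", len(summary["not_corrected"]))
```

Running `summarize` over the full kernel log of each of the five similar systems would show quickly whether any of the healthy ones also log corrected read errors during the data-check.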