
Bug#526737: md: RAID 1 check makes kernel almost panic



tags 526737 fixed-upstream
thanks

On Sun, May 03, 2009 at 09:31:35AM +0200, Julien Danjou wrote:
> Version: 2.6.29-3

This does not match the kernel version in the package-specific info below
(2.6.29-1-amd64, Debian 2.6.29-2).

> Today was the normal crondate for mdadm to run a check on my 2 raid 1
> devices.
> Problem, when I woke up this morning I found very bad stuff on my
> terminals.
> 
> The check was stuck at 2% according to /proc/mdstat. I tried to run it
> again, but that caused the second backtrace you can see below.

Looks like this is fixed upstream in commit
303a0e11d0ee136ad8f53f747f3c377daece763b and also in 2.6.29.2.

> Now, I cannot even check the RAID status since "cat /proc/mdstat" hangs.
> 
> % strace  cat /proc/mdstat
> [...]
> open("/proc/mdstat", O_RDONLY)          = 3
> fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
> read(3, 

The MD subsystem is already dead at this point, so hangs like this are to
be expected.

> -- Package-specific info:
> ** Version:
> Linux version 2.6.29-1-amd64 (Debian 2.6.29-2) (waldi@debian.org) (gcc version 4.3.3 (Debian 4.3.3-5) ) #1 SMP Sat Apr 4 16:54:07 UTC 2009
> 
> ** Command line:
> root=/dev/mapper/system-root ro vga=791 splash
> 
> ** Tainted: P D (129)

You are using proprietary drivers (the nvidia module, hence the P taint
flag). We don't support systems in this state.
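
For reference, the taint value 129 in the report is a bitmask: bit 0 (value 1) is P, a proprietary module was loaded, and bit 7 (value 128) is D, the kernel has already oopsed. A minimal sketch of decoding it follows; the table is deliberately partial and the helper name decode_taint is made up for this illustration.

```python
# Partial table of kernel taint bits: bit position -> (letter, meaning).
# Only a few well-known flags are listed; this is not the full set.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force-loaded"),
    7: ("D", "kernel died recently (oops or BUG)"),
}


def decode_taint(value):
    """Return the (letter, meaning) pairs for the known bits set in value.

    Hypothetical helper for illustration; bits missing from TAINT_FLAGS
    are silently ignored.
    """
    return [
        flag
        for bit, flag in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # 129 is the value shown in the "Tainted:" line of the report.
    for letter, meaning in decode_taint(129):
        print(letter, "-", meaning)
```

The D flag is a consequence of the crash itself; only the P flag describes the state of the system before the fault.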

> ** Kernel log:
> [1404264.039924] md: data-check of RAID array md0
> [1404264.039927] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [1404264.039930] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
> [1404264.039934] md: using 128k window, over a total of 48064 blocks.

The check started here.

> [1404264.042298] general protection fault: 0000 [#1] SMP 
> [1404264.042302] last sysfs file: /sys/devices/virtual/block/md1/md/sync_action
> [1404264.042304] CPU 0 
> [1404264.042306] Modules linked in: rfcomm l2cap bluetooth nvidia(P) ipv6 acpi_cpufreq cpufreq_powersave cpufreq_stats cpufreq_userspace cpufreq_conservative nfs lockd nfs_acl auth_rpcgss sunrpc coretemp w83627ehf hwmon_vid firewire_sbp2 loop snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device i2c_i801 snd soundcore snd_page_alloc i2c_core pcspkr serio_raw evdev button ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod raid1 md_mod hid_logitech ff_memless usbhid hid sd_mod crc_t10dif ide_cd_mod cdrom ata_generic uhci_hcd ide_pci_generic ata_piix firewire_ohci firewire_core crc_itu_t ahci atl1 mii libata jmicron ide_core scsi_mod ehci_hcd intel_agp thermal processor fan thermal_sys
> [1404264.042363] Pid: 32532, comm: md0_resync Tainted: P           2.6.29-1-amd64 #1 P5K
> [1404264.042365] RIP: 0010:[<ffffffff8029810a>]  [<ffffffff8029810a>] put_page+0xb/0xbb

It died in put_page with a general protection fault.
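
The RIP line uses the usual symbol+offset/size notation: put_page+0xb/0xbb means the fault hit 0xb bytes into put_page, which is 0xbb bytes long. As a rough sketch, assuming that notation, a location like this can be split into its parts as follows; the function name parse_location is invented for this example.

```python
import re

# Matches kernel backtrace locations of the form symbol+0xOFFSET/0xSIZE.
LOCATION_RE = re.compile(r"([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")


def parse_location(text):
    """Return (symbol, offset, size) for the first location found in text.

    Hypothetical helper for illustration; returns None when the text
    contains no symbol+offset/size pattern.
    """
    match = LOCATION_RE.search(text)
    if match is None:
        return None
    symbol, offset, size = match.groups()
    return symbol, int(offset, 16), int(size, 16)


if __name__ == "__main__":
    line = "RIP: 0010:[<ffffffff8029810a>]  [<ffffffff8029810a>] put_page+0xb/0xbb"
    print(parse_location(line))
```

An offset of 11 bytes in a 187-byte function places the fault very near the start of put_page.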

Bastian


