Bug#506586: linux-2.6: Kernel BUG, possibly related
Hello,
this bug froze my system, and I can reproduce it almost every time.
I wanted to use LVM snapshots on our Samba server for "hot backup" purposes.
We have a cluster of two servers; DRBD replicates the data between them.
The two servers are connected via a separate Gbit link.
The plan was to create a snapshot on the secondary node every hour and keep
each one for 3 hours (in case a colleague accidentally destroys a document he
has been working on all day, so that a whole day's work can be recovered).
I used a script to create a 20 GB LVM snapshot every hour. The
snapshotted logical volume is 2500 GB (2.44 TB) in size.
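The rotation described above can be sketched roughly as follows. This is not the script that actually ran on the server, only a minimal illustration of the scheme (one 20 GB snapshot per hour, three kept). The "snap-" prefix and the timestamp format are hypothetical; the volume group and volume names are taken from the lvdisplay output further down. The sketch only builds the lvcreate and lvremove command lines and prints them, it does not execute anything.

```python
from datetime import datetime

VG = "vg"           # volume group, as shown by lvdisplay
ORIGIN = "data"     # the snapshotted logical volume
SNAP_SIZE = "20G"   # snapshot size used in this report
KEEP = 3            # number of hourly snapshots to keep
PREFIX = "snap-"    # hypothetical naming prefix, not from the real script


def snapshot_name(now):
    """Return a sortable snapshot name such as snap-YYYYMMDD-HHMM."""
    return PREFIX + now.strftime("%Y%m%d-%H%M")


def create_command(now):
    """Build the lvcreate command line for a new snapshot of the origin."""
    return ["lvcreate", "--snapshot", "--size", SNAP_SIZE,
            "--name", snapshot_name(now), f"/dev/{VG}/{ORIGIN}"]


def expired(existing, keep=KEEP):
    """Return the snapshot names that exceed the retention count, oldest first."""
    snaps = sorted(name for name in existing if name.startswith(PREFIX))
    return snaps[:-keep] if len(snaps) > keep else []


def remove_commands(existing, keep=KEEP):
    """Build one lvremove command line per expired snapshot."""
    return [["lvremove", "--force", f"/dev/{VG}/{name}"]
            for name in expired(existing, keep)]


if __name__ == "__main__":
    # Dry run: print the command that an hourly cron job would issue.
    print(" ".join(create_command(datetime.now())))
```

Because the names sort chronologically, keeping the last three entries of the sorted list is enough to implement the 3-hour retention.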
After the third snapshot was created, the load climbed as high as 5.
Then the system froze with a kernel error:
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] ------------[ cut here ]------------
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] kernel BUG at mm/slab.c:601!
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] invalid opcode: 0000 [#1] SMP
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] Modules linked in: dm_snapshot dm_mod ipv6 drbd cn loop snd_pcm snd_timer snd soundcore snd_page_alloc iTCO_wdt pcspkr rng_core i2c_i801 i2c_core shpchp pci_hotplug button intel_agp agpgart evdev ext3 jbd mbcache ide_cd_mod cdrom ide_pci_generic sd_mod ata_piix piix ide_core floppy ahci ata_generic 3w_9xxx e1000 libata scsi_mod ehci_hcd uhci_hcd dock usbcore thermal processor fan thermal_sys
Jul 29 16:23:42 cifs2 kernel: [2369743.510965]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] Pid: 17626, comm: kcopyd Not tainted (2.6.26-2-686 #1)
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] EIP: 0060:[<c0171556>] EFLAGS: 00010046 CPU: 0
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] EIP is at free_block+0x4e/0xf4
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] EAX: 00000000 EBX: 0000003c ECX: f0161d48 EDX: c1602c20
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] ESI: f4597000 EDI: e1d4dc40 EBP: f6d49c00 ESP: c489deb0
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] Process kcopyd (pid: 17626, ti=c489c000 task=f7785560 task.ti=c489c000)
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] Stack: 00000003 00000000 0000003c f6cc7414 00000036 0000003c 000001e0 f6d49c00
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] e1d4dc40 c017137a 00000000 f6cc7400 f6cc7400 00000246 f01c6248 00000000
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] c0171460 e1d4d780 f01c6248 f4791e80 c0158bf1 c49fb948 f01c6248 f4791e80
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] Call Trace:
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c017137a>] cache_flusharray+0x65/0x89
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c0171460>] kmem_cache_free+0x36/0x4f
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c0158bf1>] mempool_free+0x63/0x67
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f89deb87>] pending_complete+0xdd/0x138 [dm_snapshot]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f89df873>] persistent_commit+0xdc/0xf0 [dm_snapshot]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f89dec0a>] copy_callback+0x28/0x2c [dm_snapshot]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f8b9feed>] run_complete_job+0x46/0x6f [dm_mod]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f89debe2>] copy_callback+0x0/0x2c [dm_snapshot]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f8b9fd19>] process_jobs+0x1f/0xaa [dm_mod]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f8b9fea7>] run_complete_job+0x0/0x6f [dm_mod]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f8b9fda4>] do_work+0x0/0x38 [dm_mod]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<f8b9fdba>] do_work+0x16/0x38 [dm_mod]
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c012efae>] run_workqueue+0x74/0xf2
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c012f689>] worker_thread+0x0/0xbd
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c012f73c>] worker_thread+0xb3/0xbd
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c013194c>] autoremove_wake_function+0x0/0x2d
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c013188b>] kthread+0x38/0x5d
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c0131853>] kthread+0x0/0x5d
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] [<c01044f3>] kernel_thread_helper+0x7/0x10
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] =======================
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] Code: 54 24 0c 8b 0c 82 8d 91 00 00 00 40 c1 ea 0c c1 e2 05 03 15 e0 04 41 c0 8b 02 25 00 40 00 00 85 c0 74 03 8b 52 0c 80 3a 00 78 04 <0f> 0b eb fe 8b 72 1c 8b 44 24 28 8b 16 8b 7c 85 68 8b 46 04 89
Jul 29 16:23:42 cifs2 kernel: [2369743.510965] EIP: [<c0171556>] free_block+0x4e/0xf4 SS:ESP 0068:c489deb0
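As a small cross-check of the trace above: the byte in angle brackets on the "Code:" line marks the instruction at EIP. The sketch below pulls out that byte and the one after it and compares them with 0f 0b, the x86 ud2 opcode, which as far as I know is what the kernel's BUG() macro emits on this architecture. That would match the "invalid opcode" message, but it says nothing about why the check in mm/slab.c fired. The CODE_LINE string is copied from the log; the function name is my own.

```python
# Byte dump from the "Code:" line of the oops (the wrapped line rejoined).
CODE_LINE = (
    "54 24 0c 8b 0c 82 8d 91 00 00 00 40 c1 ea 0c c1 e2 05 03 15 e0 04 41 c0 "
    "8b 02 25 00 40 00 00 85 c0 74 03 8b 52 0c 80 3a 00 78 04 <0f> 0b eb fe "
    "8b 72 1c 8b 44 24 28 8b 16 8b 7c 85 68 8b 46 04 89"
)

UD2 = bytes([0x0F, 0x0B])  # x86 "ud2" opcode (assumed to come from BUG())


def faulting_bytes(code, count=2):
    """Return `count` bytes starting at the <xx> marker, or None if absent."""
    tokens = code.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            window = tokens[i:i + count]
            return bytes(int(t.strip("<>"), 16) for t in window)
    return None


if __name__ == "__main__":
    found = faulting_bytes(CODE_LINE)
    print(found.hex(" "), "-> ud2" if found == UD2 else "-> not ud2")
```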
The following related packages are installed:
ii  drbd8-modules-2.6.26-2-686  2.6.26+8.0.14-6+lenny1
ii  drbd8-utils                 2:8.0.14-2
ii  lvm2                        2.02.39-7
ii  linux-image-2.6.26-2-686    2.6.26-17
ii  samba                       2:3.2.5-4lenny6
ii  samba-common                2:3.2.5-4lenny6
/etc/drbd.conf
global {
    usage-count yes;
}
common {
    protocol C;
}
resource r0 {
    disk {
        on-io-error detach;
    }
    net {
        timeout 100;
        connect-int 15;
        ping-int 15;
        ping-timeout 20;
    }
    syncer {
        rate 30M;
    }
    on cifs1 {
        device /dev/drbd0;
        disk /dev/mapper/vg-data;
        address 192.168.253.1:7789;
        meta-disk /dev/mapper/vg-meta[0];
    }
    on cifs2 {
        device /dev/drbd0;
        disk /dev/mapper/vg-data;
        address 192.168.253.2:7789;
        meta-disk /dev/mapper/vg-meta[0];
    }
}
lvdisplay:
  --- Logical volume ---
  LV Name                /dev/vg/data
  VG Name                vg
  LV UUID                1XZ39O-0NKd-cNa5-k9fF-I9Qr-OZSz-0h3I27
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                2.44 TB
  Current LE             640000
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:0

  --- Logical volume ---
  LV Name                /dev/vg/meta
  VG Name                vg
  LV UUID                ncPmW1-tL2y-DaIx-QExn-dmFj-FZX5-3oqEH5
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                512.00 MB
  Current LE             128
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:1
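
The sizes quoted in this report can be checked against the extent counts in the lvdisplay output. The extent size is not printed there, so the sketch below derives it from the meta volume (512 MB in 128 extents), assuming both volumes share one extent size (they are in the same volume group) and that lvdisplay reports binary units. The function names are my own.

```python
def extent_size_mib(lv_size_mib, extents):
    """Derive the extent size from an LV's size and its extent count."""
    return lv_size_mib / extents


def lv_size_gib(extents, extent_mib):
    """Convert an extent count into a size in GiB."""
    return extents * extent_mib / 1024


extent_mib = extent_size_mib(512, 128)       # meta LV: 512 MB in 128 LE
data_gib = lv_size_gib(640000, extent_mib)   # data LV: 640000 LE
data_tib = data_gib / 1024
snapshot_share = 20 / data_gib               # one 20 GB snapshot vs. origin

if __name__ == "__main__":
    print(extent_mib)          # 4.0
    print(data_gib)            # 2500.0
    print(round(data_tib, 2))  # 2.44
    print(snapshot_share)      # 0.008
```

So the 2500 GB and 2.44 TB figures agree with each other, and each 20 GB snapshot can absorb changes to about 0.8% of the origin volume before it fills up.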