
Bug#616726: [squeeze] Kernel bug seems to occur on ocfs2+drbd in pri-pri



severity 616726 important
merge 598323 616726
quit

Hi Tim,

Tim Stoop wrote:

> Not sure if this is the correct package to report it against, but
> we've been seeing a lot of these messages lately:
[...]
> About 10 minutes later, the machine becomes unresponsive.
>
> We're running a Debian Squeeze guest within KVM, with a drbd to another Debian 
> Squeeze machine. The drbd is set up in primary-primary mode. It doesn't happen
> very often, about once every two weeks, which makes it hard to troubleshoot.
[...]
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
> IP: [<ffffffff810e7c50>] __kmalloc+0xd2/0x141
> PGD 0 
> Oops: 0000 [#6] SMP 
> last sysfs file: /sys/module/drbd/parameters/cn_idx
> CPU 1 
> Modules linked in: ocfs2 jbd2 quota_tree drbd lru_cache cn ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables loop i2c_piix4 snd_pcm i2c_core snd_timer snd soundcore evdev button snd_page_alloc psmouse pcspkr serio_raw processor virtio_balloon ext3 jbd mbcache dm_mod ata_generic uhci_hcd ehci_hcd ata_piix libata usbcore virtio_blk floppy virtio_net nls_base scsi_mod virtio_pci virtio_ring virtio thermal thermal_sys [last unloaded: scsi_wait_scan]
> Pid: 11487, comm: sshd Tainted: G      D    2.6.32-5-amd64 #1 Bochs
[...]
> Call Trace:
>  [<ffffffff81269a88>] ? nl_pid_hash_rehash+0x49/0xf1
>  [<ffffffff81269a88>] ? nl_pid_hash_rehash+0x49/0xf1
>  [<ffffffff8126a513>] ? netlink_insert+0xbc/0x123
>  [<ffffffff8126a619>] ? netlink_autobind+0x9f/0xbc
>  [<ffffffff8126ab94>] ? netlink_bind+0x82/0x179
>  [<ffffffff812419c7>] ? sys_bind+0x7a/0xb9
[...]
> Code: fa 66 0f 1f 44 00 00 65 8b 04 25 a8 e3 00 00 48 98 49 8b 94 c4 f0 02 00 00 8b 4a 18 89 4c 24 14 48 8b 1a 48 85 db 74 0c 8b 42 14 <48> 8b 04 c3 48 89 02 eb 19 48 8b 4c 24 08 49 89 d0 44 89 ee 83 
> RIP  [<ffffffff810e7c50>] __kmalloc+0xd2/0x141

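Thanks for reporting it.  The 'D' taint flag indicates that the kernel
had already oopsed before this trace was logged, and the "[#6]" on the
Oops line makes this the sixth oops since boot.  Do you happen to have
the first oops from a boot during which this bug occurred?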
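
In case it helps when digging through saved logs, here is a minimal
sketch of pulling the oops counter and taint letters out of a trace like
the one quoted above.  The function name and the flag table are my own
illustration (only a small subset of taint letters is listed), not part
of any kernel tool, and the regular expressions assume the 2.6.32-style
line layout shown in this report.

```python
import re

# Small, illustrative subset of kernel taint letters (assumption: not exhaustive).
TAINT_MEANINGS = {
    "G": "only GPL-compatible modules loaded",
    "P": "proprietary module loaded",
    "D": "kernel has already died (earlier oops or BUG)",
    "W": "a warning was issued earlier",
}


def summarize_oops(log_text):
    """Return (oops_count, taint_letters) parsed from an oops trace.

    oops_count is None when no "Oops: ... [#N]" line is found;
    taint_letters is an empty list when no "Tainted:" field is found.
    """
    # "Oops: 0000 [#6] SMP" -> the number after '#' counts oopses since boot.
    count_match = re.search(r"Oops: [0-9a-fA-F]+ \[#(\d+)\]", log_text)
    oops_count = int(count_match.group(1)) if count_match else None

    # "Tainted: G      D    2.6.32-5-amd64" -> letters padded with spaces,
    # ending where the kernel version (starting with a digit) begins.
    taint_match = re.search(r"Tainted:\s*([A-Z ]+)", log_text)
    taint_letters = []
    if taint_match:
        taint_letters = [c for c in taint_match.group(1) if c != " "]

    return oops_count, taint_letters


if __name__ == "__main__":
    sample = (
        "Oops: 0000 [#6] SMP \n"
        "Pid: 11487, comm: sshd Tainted: G      D    2.6.32-5-amd64 #1 Bochs\n"
    )
    count, letters = summarize_oops(sample)
    print("oops number:", count)
    for letter in letters:
        print(letter, "-", TAINT_MEANINGS.get(letter, "not in this table"))
```

A count above 1, or a 'D' among the letters, means an earlier oops from
the same boot should be somewhere further up in the log.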

I'd also be interested to hear which kernel versions you've tried, and
whether current (squeeze or sid) kernels behave any better.

Thanks for writing, and sorry for the slow reply.

Sincerely,
Jonathan


