
Bug#616726: marked as done (ocfs2-tools: Kernel bug seems to occur on ocfs2+drbd in pri-pri)



Your message dated Tue, 21 Feb 2012 04:24:43 -0600
with message-id <20120221102443.GA28089@burratino>
and subject line Re: [squeeze] Kernel bug seems to occur on ocfs2+drbd in pri-pri
has caused the Debian Bug report #616726,
regarding ocfs2-tools: Kernel bug seems to occur on ocfs2+drbd in pri-pri
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
616726: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=616726
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: ocfs2-tools
Version: 1.4.4-3
Severity: normal

Hi there,

Not sure if this is the correct package to report it against, but we've been seeing a lot of these messages lately:

Mar  6 22:31:50 wp1 kernel: [382123.062506] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
Mar  6 22:31:50 wp1 kernel: [382123.063396] IP: [<ffffffff810e7c50>] __kmalloc+0xd2/0x141
Mar  6 22:31:50 wp1 kernel: [382123.063833] PGD 0 
Mar  6 22:31:50 wp1 kernel: [382123.064281] Oops: 0000 [#6] SMP 
Mar  6 22:31:50 wp1 kernel: [382123.064516] last sysfs file: /sys/module/drbd/parameters/cn_idx
Mar  6 22:31:50 wp1 kernel: [382123.064516] CPU 1 
Mar  6 22:31:50 wp1 kernel: [382123.064516] Modules linked in: ocfs2 jbd2 quota_tree drbd lru_cache cn ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables loop i2c_piix4 snd_pcm i2c_core snd_timer snd soundcore evdev button snd_page_alloc psmouse pcspkr serio_raw processor virtio_balloon ext3 jbd mbcache dm_mod ata_generic uhci_hcd ehci_hcd ata_piix libata usbcore virtio_blk floppy virtio_net nls_base scsi_mod virtio_pci virtio_ring virtio thermal thermal_sys [last unloaded: scsi_wait_scan]
Mar  6 22:31:50 wp1 kernel: [382123.064516] Pid: 11487, comm: sshd Tainted: G      D    2.6.32-5-amd64 #1 Bochs
Mar  6 22:31:50 wp1 kernel: [382123.064516] RIP: 0010:[<ffffffff810e7c50>]  [<ffffffff810e7c50>] __kmalloc+0xd2/0x141
Mar  6 22:31:50 wp1 kernel: [382123.064516] RSP: 0018:ffff88021dd2fd88  EFLAGS: 00010002
Mar  6 22:31:50 wp1 kernel: [382123.064516] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000010
Mar  6 22:31:50 wp1 kernel: [382123.064516] RDX: ffff880008c92070 RSI: 0000000000008020 RDI: ffffffff81269a88
Mar  6 22:31:50 wp1 kernel: [382123.064516] RBP: 0000000000000046 R08: 0000000074f4f75f R09: 00007f7636f24d31
Mar  6 22:31:50 wp1 kernel: [382123.064516] R10: 0000000000000000 R11: ffffffff81152d1d R12: ffffffff81455200
Mar  6 22:31:50 wp1 kernel: [382123.064516] R13: 0000000000008020 R14: 0000000000008020 R15: 0000000000000010
Mar  6 22:31:50 wp1 kernel: [382123.064516] FS:  00007f7636eb77c0(0000) GS:ffff880008c80000(0000) knlGS:0000000000000000
Mar  6 22:31:50 wp1 kernel: [382123.064516] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar  6 22:31:50 wp1 kernel: [382123.064516] CR2: 0000000000000002 CR3: 000000021b78f000 CR4: 00000000000006e0
Mar  6 22:31:50 wp1 kernel: [382123.064516] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar  6 22:31:50 wp1 kernel: [382123.064516] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar  6 22:31:50 wp1 kernel: [382123.064516] Process sshd (pid: 11487, threadinfo ffff88021dd2e000, task ffff88021cf1cdb0)
Mar  6 22:31:50 wp1 kernel: [382123.064516] Stack:
Mar  6 22:31:50 wp1 kernel: [382123.064516]  ffff88021eef9098 ffffffff81269a88 000000101e1bb7a0 0000000000000001
Mar  6 22:31:50 wp1 kernel: [382123.064516] <0> ffff88021fc0b000 0000000000000001 0000000000000001 0000000000000010
Mar  6 22:31:50 wp1 kernel: [382123.064516] <0> 0000000000002cdf ffffffff81269a88 0000000000000000 0000000000000000
Mar  6 22:31:50 wp1 kernel: [382123.064516] Call Trace:
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff81269a88>] ? nl_pid_hash_rehash+0x49/0xf1
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff81269a88>] ? nl_pid_hash_rehash+0x49/0xf1
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff8126a513>] ? netlink_insert+0xbc/0x123
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff8126a619>] ? netlink_autobind+0x9f/0xbc
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff8126ab94>] ? netlink_bind+0x82/0x179
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff812419c7>] ? sys_bind+0x7a/0xb9
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff812fe0b6>] ? do_page_fault+0x2e0/0x2fc
Mar  6 22:31:50 wp1 kernel: [382123.064516]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
Mar  6 22:31:50 wp1 kernel: [382123.064516] Code: fa 66 0f 1f 44 00 00 65 8b 04 25 a8 e3 00 00 48 98 49 8b 94 c4 f0 02 00 00 8b 4a 18 89 4c 24 14 48 8b 1a 48 85 db 74 0c 8b 42 14 <48> 8b 04 c3 48 89 02 eb 19 48 8b 4c 24 08 49 89 d0 44 89 ee 83 
Mar  6 22:31:50 wp1 kernel: [382123.064516] RIP  [<ffffffff810e7c50>] __kmalloc+0xd2/0x141
Mar  6 22:31:50 wp1 kernel: [382123.064516]  RSP <ffff88021dd2fd88>
Mar  6 22:31:50 wp1 kernel: [382123.064516] CR2: 0000000000000002
Mar  6 22:31:50 wp1 kernel: [382123.064516] ---[ end trace cc6f13eaca45c0e5 ]---

About 10 minutes later, the machine became unresponsive.

We're running a Debian Squeeze guest within KVM, with DRBD replication to
another Debian Squeeze machine. DRBD is set up in primary-primary mode. It
doesn't happen very often, about once every two weeks, which makes it hard to
troubleshoot.

Do let me know if you need any more information!

-- 
Kind regards,
Tim Stoop


-- System Information:
Debian Release: 6.0
  APT prefers squeeze-updates
  APT policy: (500, 'squeeze-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages ocfs2-tools depends on:
ii  debconf [debconf-2.0]     1.5.36.1       Debian configuration management sy
ii  libc6                     2.11.2-10      Embedded GNU C Library: Shared lib
ii  libcomerr2                1.41.12-2      common error description library
ii  libglib2.0-0              2.24.2-1       The GLib library of C routines
ii  libncurses5               5.7+20100313-5 shared libraries for terminal hand
ii  libreadline6              6.1-3          GNU readline and history libraries
ii  libuuid1                  2.17.2-9       Universally Unique ID library
ii  psmisc                    22.11-1        utilities that use the proc file s

ocfs2-tools recommends no packages.

Versions of packages ocfs2-tools suggests:
pn  ocfs2-tools-cman              <none>     (no description available)
pn  ocfs2-tools-pacemaker         <none>     (no description available)
ii  ocfs2console                  1.4.4-3    tools for managing OCFS2 cluster f

-- debconf information:
  ocfs2-tools/heartbeat_threshold: 31
  ocfs2-tools/reconnect_delay: 2000
  ocfs2-tools/init: true
  ocfs2-tools/keepalive_delay: 2000
  ocfs2-tools/clustername: www
  ocfs2-tools/idle_timeout: 30000



--- End Message ---
--- Begin Message ---
Version: 2.6.32-41
tags 616726 - unreproducible
quit

Tim Stoop wrote:

> We're currently using the linux-image-2.6.32-5-amd64 package
> (2.6.32-41) and we haven't seen the problem since. So it looks like
> it's solved.

Thanks, both.  Marking accordingly.


--- End Message ---
