
Server system fail - why?



Hello all.

What we have:
2x Dell 2950 with Debian 5.0 x64
Kernel: 2.6.32-bpo.5-amd64 from backports
Software: DRBD + OCFS2

I am testing two nodes with DRBD + OCFS2 on them. Everything seemed fine, but last night both nodes rebooted and one froze. Nothing was running on them and there was no stress test of any kind; they were just idle.

Could someone point me to where I should look for the problem?
The last logs are below.


The last thing shown on the SSH console was:

Message from syslogd@mail01 at Aug 31 02:06:37 ...
 kernel:[43263.442871] ------------[ cut here ]------------

Message from syslogd@mail01 at Aug 31 02:06:37 ...
 kernel:[43263.442946] invalid opcode: 0000 [#1] SMP

Message from syslogd@mail01 at Aug 31 02:06:37 ...
 kernel:[43263.442973] last sysfs file: /sys/fs/o2cb/interface_revision

Message from syslogd@mail01 at Aug 31 02:06:37 ...
 kernel:[43263.443831] Stack:

Message from syslogd@mail01 at Aug 31 02:06:37 ...
 kernel:[43263.444002] Call Trace:

Message from syslogd@mail01 at Aug 31 02:06:37 ...
kernel:[43263.444244] Code: 83 c3 08 48 83 3b 00 eb ec 48 83 fd 10 0f 86 89 00 00 00 48 89 ef e8 b9 e8 ff ff 48 89 c7 48 8b 00 84 c0 78 13 66 a9 00 c0 75 04 <0f> 0b eb fe 5b 5d 41 5c e9 54 59 fd ff 48 8b 4c 24 18 4c 8b 4f



The last entries in /var/log/messages:
Aug 30 14:26:01 mail01 kernel: [ 1227.315451] ocfs2_dlm: Node 1 joins domain 9A96A6832198449A9C8329D2E0C4ED7B
Aug 30 14:26:01 mail01 kernel: [ 1227.315527] ocfs2_dlm: Nodes in domain ("9A96A6832198449A9C8329D2E0C4ED7B"): 0 1

*** HERE THE PROBLEM STARTED ***

Aug 31 02:06:37 mail01 kernel: [43263.442999] CPU 1
Aug 31 02:06:37 mail01 kernel: [43263.443021] Modules linked in: drbd ocfs2 jbd2 quota_tree sha1_generic hmac lru_cache cn xt_multiport ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables ext2 loop i5k_amb snd_pcm snd_timer dcdbas snd soundcore evdev i5000_edac serio_raw snd_page_alloc psmouse edac_core pcspkr rng_core processor button shpchp pci_hotplug ext3 jbd mbcache sg sr_mod cdrom sd_mod ses crc_t10dif enclosure ata_generic ata_piix ehci_hcd uhci_hcd megaraid_sas libata scsi_mod usbcore nls_base bnx2 thermal fan thermal_sys [last unloaded: drbd]
Aug 31 02:06:37 mail01 kernel: [43263.443362] Pid: 2011, comm: slapd Not tainted 2.6.32-bpo.5-amd64 #1 PowerEdge 2950
Aug 31 02:06:37 mail01 kernel: [43263.443406] RIP: 0010:[<ffffffff810e55eb>] [<ffffffff810e55eb>] kfree+0x55/0xcb
Aug 31 02:06:37 mail01 kernel: [43263.443456] RSP: 0018:ffff88012ca85db8 EFLAGS: 00010046
Aug 31 02:06:37 mail01 kernel: [43263.443482] RAX: 0200000000080000 RBX: 0000000000000000 RCX: 0000000068a4cfe9
Aug 31 02:06:37 mail01 kernel: [43263.443511] RDX: ffff88012fc13000 RSI: 0000000000000010 RDI: ffffea0003800000
Aug 31 02:06:37 mail01 kernel: [43263.443540] RBP: ffff880100000001 R08: 0000000072b1d310 R09: 00000000d002ea4e
Aug 31 02:06:37 mail01 kernel: [43263.443569] R10: 000000008a6becc7 R11: 0000000072d84f77 R12: ffffffff812673ba
Aug 31 02:06:37 mail01 kernel: [43263.443598] R13: 0000000000000001 R14: 0000000000000010 R15: 0000000000000000
Aug 31 02:06:37 mail01 kernel: [43263.443628] FS: 000000004194a950(0063) GS:ffff880005440000(0000) knlGS:0000000000000000
Aug 31 02:06:37 mail01 kernel: [43263.443672] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 31 02:06:37 mail01 kernel: [43263.443699] CR2: 00007f2b6dffd000 CR3: 000000012dfaa000 CR4: 00000000000006e0
Aug 31 02:06:37 mail01 kernel: [43263.443728] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 31 02:06:37 mail01 kernel: [43263.443757] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 31 02:06:37 mail01 kernel: [43263.443786] Process slapd (pid: 2011, threadinfo ffff88012ca84000, task ffff88012e421530)
Aug 31 02:06:37 mail01 kernel: [43263.443850] 0000000000000000 ffff88012fc13000 0000000000000002 ffffffff812673ba
Aug 31 02:06:37 mail01 kernel: [43263.443884] <0> ffff880100000001 0000000000000000 ffff88012fc13000 ffff88010efb5800
Aug 31 02:06:37 mail01 kernel: [43263.443935] <0> 0000000000000001 ffff880100000001 00000000000007d5 ffffffff81267dc4
Aug 31 02:06:37 mail01 kernel: [43263.444026] [<ffffffff812673ba>] ? nl_pid_hash_rehash+0xca/0xf1
Aug 31 02:06:37 mail01 kernel: [43263.444053] [<ffffffff81267dc4>] ? netlink_insert+0xbc/0x123
Aug 31 02:06:37 mail01 kernel: [43263.444081] [<ffffffff81267eca>] ? netlink_autobind+0x9f/0xbc
Aug 31 02:06:37 mail01 kernel: [43263.444108] [<ffffffff81268445>] ? netlink_bind+0x82/0x179
Aug 31 02:06:37 mail01 kernel: [43263.444136] [<ffffffff8123f2a9>] ? sys_bind+0x7a/0xb9
Aug 31 02:06:37 mail01 kernel: [43263.444162] [<ffffffff810eb2f3>] ? fd_install+0x2e/0x5a
Aug 31 02:06:37 mail01 kernel: [43263.444188] [<ffffffff8123e2a8>] ? sock_map_fd+0x57/0x64
Aug 31 02:06:37 mail01 kernel: [43263.444217] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
Aug 31 02:06:37 mail01 kernel: [43263.444460]  RSP <ffff88012ca85db8>
Aug 31 02:06:37 mail01 kernel: [43263.444821] ---[ end trace 381ebef00a1cbadb ]---

*** I REBOOTED THE SERVER ***

Aug 31 08:47:07 mail01 kernel: imklog 3.18.6, log source = /proc/kmsg started.
Aug 31 08:47:07 mail01 rsyslogd: [origin software="rsyslogd" swVersion="3.18.6" x-pid="1987" x-info="http://www.rsyslog.com"] restart
Aug 31 08:47:07 mail01 kernel: [    0.000000] Initializing cgroup subsys cpuset
Aug 31 08:47:07 mail01 kernel: [    0.000000] Initializing cgroup subsys cpu
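
To make the trace above easier to read, here is a minimal sketch that pulls the function names and offsets out of the oops lines. The names FRAME_RE and extract_frames are made up for this illustration, and the sample text is copied from the log above. It only lists the frames; it does not diagnose anything.

```python
import re

# Matches entries such as "[<ffffffff81267dc4>] ? netlink_insert+0xbc/0x123".
# The "? " marker is optional, so the RIP line is picked up as well.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{16})>\]\s+(?:\?\s+)?([A-Za-z0-9_.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

def extract_frames(log_text):
    """Return (address, symbol, offset, size) tuples in order of appearance."""
    frames = []
    for addr, sym, off, size in FRAME_RE.findall(log_text):
        frames.append((addr, sym, int(off, 16), int(size, 16)))
    return frames

# A few lines copied from the log above (syslog prefixes dropped for brevity).
SAMPLE = """\
RIP: 0010:[<ffffffff810e55eb>] [<ffffffff810e55eb>] kfree+0x55/0xcb
[<ffffffff812673ba>] ? nl_pid_hash_rehash+0xca/0xf1
[<ffffffff81267dc4>] ? netlink_insert+0xbc/0x123
"""

for addr, sym, off, size in extract_frames(SAMPLE):
    print(f"{sym} +{off:#x} of {size:#x} at 0x{addr}")
```

Run against the full /var/log/messages excerpt, this would list kfree first (the RIP) followed by the netlink and bind frames in the order they were logged.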



--
Best regards,
Proskurin Kirill

