
Bug#461407: marked as done (kernel-image: Kernel panic with clusterip)



Your message dated Sun, 19 Jul 2009 21:15:56 +0200
with message-id <20090719191556.GA21910@galadriel.inutil.org>
and subject line Re: kernel-image: Kernel panic with clusterip
has caused the Debian Bug report #461407,
regarding kernel-image: Kernel panic with clusterip
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
461407: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=461407
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: kernel-image
Version: 2.6-amd64
Severity: normal

Using the clusterip feature of the Linux kernel often provokes a kernel panic
when the module ipt_CLUSTERIP is removed.
This behaviour is not constant, but it happens frequently.
The steps to reproduce this bug are:
1. set up two Debian machines on the same subnet
2. choose a new (and unused) IP address on the same subnet.
3. on machines A and B type:
   ip addr add $NEW_IP dev $IFACE
4. on the machine A type:
   iptables -I INPUT -d $NEW_IP -i $IFACE -j CLUSTERIP --new --hashmode sourceip-sourceport --clustermac 01:02:03:04:05:06 --total-nodes 2 --local-node 1
   on the machine B type:
   iptables -I INPUT -d $NEW_IP -i $IFACE -j CLUSTERIP --new --hashmode sourceip-sourceport --clustermac 01:02:03:04:05:06 --total-nodes 2 --local-node 2
5. ping $NEW_IP from a third machine and check that it is working.
   If not, reread the previous steps and find the error.
6. on the machine A type:
   iptables -D INPUT -d $NEW_IP -i $IFACE -j CLUSTERIP --new --hashmode sourceip-sourceport --clustermac 01:02:03:04:05:06 --total-nodes 2 --local-node 1
   on the machine B type:
   echo "+1" > /proc/net/ipt_CLUSTERIP/$NEW_IP
7. At this point machine A should show a kernel panic.
   If not, type on machine B:
   echo "-1" > /proc/net/ipt_CLUSTERIP/$NEW_IP
   and on machine A:
   iptables -I INPUT -d $NEW_IP -i $IFACE -j CLUSTERIP --new --hashmode sourceip-sourceport --clustermac 01:02:03:04:05:06 --total-nodes 2 --local-node 1
   then return to step 6.
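
For convenience, one pass of steps 6 and 7 can be sketched as a dry-run
script. This is only an illustration, not a tested reproducer: it prints the
commands instead of running them, and it assumes that step 6 is meant to
delete the rule inserted in step 4. The helper names (clusterip_rule,
takeover) and the default values for NEW_IP and IFACE are hypothetical
placeholders.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from the steps above, never runs them.
# NEW_IP and IFACE defaults are hypothetical placeholders; override as needed.
NEW_IP="${NEW_IP:-192.0.2.10}"
IFACE="${IFACE:-eth0}"

# clusterip_rule ACTION NODE
# ACTION is -I (insert, step 4) or -D (delete, step 6).
# NODE is 1 for machine A or 2 for machine B.
clusterip_rule() {
    echo "iptables $1 INPUT -d $NEW_IP -i $IFACE -j CLUSTERIP --new" \
         "--hashmode sourceip-sourceport --clustermac 01:02:03:04:05:06" \
         "--total-nodes 2 --local-node $2"
}

# takeover SIGN NODE
# Prints the /proc write that adds (+) or removes (-) responsibility for NODE.
takeover() {
    echo "echo \"$1$2\" > /proc/net/ipt_CLUSTERIP/$NEW_IP"
}

# One pass of steps 6 and 7, labelled by the machine that would run each command.
echo "A: $(clusterip_rule -D 1)"
echo "B: $(takeover + 1)"
echo "B: $(takeover - 1)"
echo "A: $(clusterip_rule -I 1)"
```

Printing rather than executing keeps the sketch safe to run anywhere; the
printed lines would still have to be run as root on the machine named at the
start of each line.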

This is the log from the first kernel panic I got:
NMI Watchdog detected LOCKUP on CPU 0
CPU 0
Modules linked in: ipv6 button ac battery xt_tcpudp xt_state ip_conntrack nfnetlink ipt_CLUSTERIP xt_multiport iptable_filter ip_tables x_tables dm_snapshot dm_mirror dm_mod loop i2c_amd756 i2c_core psmouse floppy amd_rng serio_raw pcspkr shpchp pci_hotplug evdev ext3 jbd mbcache sd_mod ide_cd cdrom generic mptspi mptscsih mptbase scsi_transport_spi scsi_mod tg3 amd74xx ohci_hcd ide_core thermal processor fan
Pid: 1650, comm: df_inode Not tainted 2.6.18-5-amd64 #1
RIP: 0010:[<ffffffff8025716f>]  [<ffffffff8025716f>] cache_alloc_refill+0x14e/0x1da
RSP: 0018:ffff810073b4bcf8  EFLAGS: 00000017
RAX: ffff81000179b000 RBX: 000000000000000a RCX: 0000000000000008
RDX: ffff81000179b000 RSI: ffff810001773000 RDI: ffff810037b243c0
RBP: ffff810001773000 R08: ffff810037b03400 R09: ffff810037b06000
R10: ffffffff8024bd1e R11: 0000000000000000 R12: ffff810037b20dc0
R13: ffff810037b03400 R14: 0000000000000032 R15: ffff810037b243c0
FS:  00002b4ffe8176d0(0000) GS:ffffffff80521000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000005cae30 CR3: 0000000073adf000 CR4: 00000000000006e0
Process df_inode (pid: 1650, threadinfo ffff810073b4a000, task ffff81007df5a080)
Stack:  000000d07d9ef900 ffff810037b243c0 0000000000000286 00000000000000d0
 ffff810073b4be14 ffff81007e2ab9c0 ffff810037de0980 ffffffff802b5b71
 ffff81007c47a2c0 0000000000000400 00000000000000ff ffffffff8022e7b5
Call Trace:
 [<ffffffff802b5b71>] __kmalloc+0x8a/0x94
 [<ffffffff8022e7b5>] expand_files+0xc8/0x2b1
 [<ffffffff802205c1>] dup_fd+0x13b/0x287
 [<ffffffff80264024>] do_gettimeofday+0x50/0x94
 [<ffffffff80245186>] copy_files+0x47/0x63
 [<ffffffff8021d0ad>] copy_process+0x50f/0x1490
 [<ffffffff8022f102>] do_fork+0xcd/0x1d0
 [<ffffffff80257bd6>] system_call+0x7e/0x83
 [<ffffffff80257ee3>] ptregscall_common+0x67/0xac


Code: 4c 89 65 08 49 89 2c 24 45 85 f6 0f 8f 42 ff ff ff 41 8b 45
console shuts up ...
 <4>get_unused_fd: slot 0 not NULL!
NMI Watchdog detected LOCKUP on CPU 1
CPU 1
Modules linked in: ipv6 button ac battery xt_tcpudp xt_state ip_conntrack nfnetlink ipt_CLUSTERIP xt_multiport iptable_filter ip_tables x_tables dm_snapshot dm_mirror dm_mod loop i2c_amd756 i2c_core psmouse floppy amd_rng serio_raw pcspkr shpchp pci_hotplug evdev ext3 jbd mbcache sd_mod ide_cd cdrom generic mptspi mptscsih mptbase scsi_transport_spi scsi_mod tg3 amd74xx ohci_hcd ide_core thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.18-5-amd64 #1
RIP: 0010:[<ffffffff8025dff6>]  [<ffffffff8025dff6>] .text.lock.spinlock+0x2/0x8a
RSP: 0018:ffff81000164fe78  EFLAGS: 00000082
RAX: 0000000000000000 RBX: ffff810080012cc0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff810080012cc0 RDI: ffff810037b20e00
RBP: ffff810037b20dc0 R08: ffff81000164fec0 R09: ffff81000164fec0
R10: ffff81000164ff30 R11: 00000000ffffffff R12: 0000000000000000
R13: ffff810037b243c0 R14: 0000000000000282 R15: 0000000000000000
FS:  00002b2ccb19cc80(0000) GS:ffff8100800833c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b2ccaff2160 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffff81008008c000, task ffff810001636830)
Stack:  ffffffff802b629d 00000002ffffffff ffff810080012cc0 ffff81007ce79240
 0000000000000000 ffff810037b243c0 ffffffff8020ae3d 0000000000000000
 ffff81007ce79240 ffff81007c47a8a0 ffff81007c47a880 0000000000000000
Call Trace:
 <IRQ> [<ffffffff802b629d>] __drain_alien_cache+0x2c/0x66
 [<ffffffff8020ae3d>] kfree+0x12c/0x1bc
 [<ffffffff802c2e67>] free_fdtable_rcu+0x75/0xe5
 [<ffffffff8028db45>] __rcu_process_callbacks+0x122/0x1a8
 [<ffffffff8028dbee>] rcu_process_callbacks+0x23/0x43
 [<ffffffff80283db8>] tasklet_action+0x62/0xac
 [<ffffffff80210381>] __do_softirq+0x5e/0xd5
 [<ffffffff80258dac>] call_softirq+0x1c/0x28
 [<ffffffff80263749>] do_softirq+0x2c/0x7d
 [<ffffffff802617fd>] default_idle+0x0/0x50
 [<ffffffff8025874a>] apic_timer_interrupt+0x66/0x6c
 <EOI> [<ffffffff8026ea49>] physflat_send_IPI_mask+0x0/0x6a
 [<ffffffff80261826>] default_idle+0x29/0x50
 [<ffffffff8024508b>] cpu_idle+0x95/0xb8
 [<ffffffff8026c0f1>] start_secondary+0x43e/0x44d


Code: 83 3f 00 7e f9 e9 6d fe ff ff e8 ff d7 ff ff e9 7d fe ff ff
console shuts up ...
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!


hash=1 ct_hash=1 <4>warning: many lost ticks.
Your time source seems to be instable or some driver is hogging interupts
rip vprintk+0x29e/0x2ea
responsible


And this is the second one:
Unable to handle kernel NULL pointer dereference at 0000000000000007 RIP:
 [<ffffffff8819c011>] :ipt_CLUSTERIP:__clusterip_config_find+0x11/0x22
PGD 7d1fe067 PUD 7d1ff067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: ipv6 button ac battery xt_tcpudp xt_state ip_conntrack nfnetlink ipt_CLUSTERIP xt_multiport iptable_filter ip_tables x_tables dm_snapshot dm_mirror dm_mod loop evdev psmouse shpchp serio_raw i2c_amd756 pcspkr i2c_core floppy pci_hotplug amd_rng ext3 jbd mbcache ide_cd cdrom generic sd_mod amd74xx ide_core mptspi mptscsih ohci_hcd mptbase scsi_transport_spi scsi_mod tg3 thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.18-5-amd64 #1
RIP: 0010:[<ffffffff8819c011>]  [<ffffffff8819c011>] :ipt_CLUSTERIP:__clusterip_config_find+0x11/0x22
RSP: 0018:ffffffff804c0db8  EFLAGS: 00010293
RAX: 0000000000000007 RBX: 0000000018016e9e RCX: ffff810037f2c000
RDX: 0000000000000007 RSI: ffffffff804c0ea0 RDI: 0000000018016e9e
RBP: ffff81007dcc0a10 R08: ffffffff8022d7d5 R09: ffffffff8819e080
R10: ffff810037f2c1a8 R11: 00000000ffffffff R12: ffff81007dcc0a18
R13: ffff810037f2c000 R14: ffffffff804c0ea0 R15: ffffffff80513190
FS:  00002b56d29bc6d0(0000) GS:ffffffff80521000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000007 CR3: 000000007d1fd000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff80530000, task ffffffff804494c0)
Stack:  ffffffff8819c62c 0000000000000082 ffffffff804c0e50 ffff810037f2c000
 0000000000000000 0000000000000001 ffffffff80231d56 0000000000000080
 0000000000000001 ffffffff804c0ea0 ffffffff80513190 ffffffff8022d7d5
Call Trace:
 <IRQ> [<ffffffff8819c62c>] :ipt_CLUSTERIP:arp_mangle+0x63/0xd3
 [<ffffffff80231d56>] nf_iterate+0x41/0x7d
 [<ffffffff8022d7d5>] dev_queue_xmit+0x0/0x25c
 [<ffffffff802523bf>] nf_hook_slow+0x58/0xc4
 [<ffffffff8022d7d5>] dev_queue_xmit+0x0/0x25c
 [<ffffffff803c0c83>] arp_xmit+0x3d/0x4f
 [<ffffffff803c1fba>] arp_solicit+0x129/0x183
 [<ffffffff80399ca1>] neigh_timer_handler+0x2ab/0x2fc
 [<ffffffff803999f6>] neigh_timer_handler+0x0/0x2fc
 [<ffffffff80287107>] run_timer_softirq+0x133/0x1b1
 [<ffffffff80210381>] __do_softirq+0x5e/0xd5
 [<ffffffff80258dac>] call_softirq+0x1c/0x28
 [<ffffffff80263749>] do_softirq+0x2c/0x7d
 [<ffffffff802617fd>] default_idle+0x0/0x50
 [<ffffffff8025874a>] apic_timer_interrupt+0x66/0x6c
 <EOI> [<ffffffff80261826>] default_idle+0x29/0x50
 [<ffffffff8024508b>] cpu_idle+0x95/0xb8
 [<ffffffff8053a799>] start_kernel+0x216/0x21b
 [<ffffffff8053a288>] _sinittext+0x288/0x28c


Code: 48 8b 10 0f 18 0a 48 3d 40 e1 19 88 75 ea 31 c0 c3 48 89 f7
RIP  [<ffffffff8819c011>] :ipt_CLUSTERIP:__clusterip_config_find+0x11/0x22
 RSP <ffffffff804c0db8>
CR2: 0000000000000007
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
 
-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-5-amd64
Locale: LANG=it_IT.UTF-8, LC_CTYPE=it_IT.UTF-8 (charmap=UTF-8)



--- End Message ---
--- Begin Message ---
On Sat, Dec 27, 2008 at 02:10:44PM +0100, Moritz Muehlenhoff wrote:
> On Fri, Jan 18, 2008 at 12:50:08PM +0100, Michele Codutti wrote:
> > Package: kernel-image
> > Version: 2.6-amd64
> > Severity: normal
> > 
> > Using the clusterip feature of the Linux kernel often provokes a kernel panic
> > when the module ipt_CLUSTERIP is removed.
> > This behaviour is not constant, but it happens frequently.
> 
> Does this error still occur with more recent kernel versions?
> 
> If you're running Etch, could you try to reproduce this bug
> with the 2.6.24 based kernel added in 4.0r4?
> http://packages.qa.debian.org/l/linux-2.6.24.html

No further feedback, closing the bug.

If anyone encounters the problem again, please reopen this bug.

Cheers,
        Moritz


--- End Message ---
