Bug#467508: kernel oops with screen and networked application
Package: linux-image-2.6.18-6-486
Version: 2.6.18.dfsg.1-18etch1
Severity: critical
Justification: breaks the whole system
Prerequisites: Running a screen session as root simultaneously with a
screen session under a non-privileged user that contains a networked
application with a fair number of open connections causes a kernel oops
on all 2.6.18-6, 2.6.22-x and 2.6.23-x kernels, but not on 2.6.24-1.
As far as I know this affects the x86 architecture. Changing from the
i686 flavor to the i486 flavor has no influence.
Test condition: Run btlaunchmanycurses within a screen session of a
non-privileged user, alongside a screen session of the root user, with
enough connections handled by btlaunchmanycurses.
dump:
Feb 24 15:47:00 purple kernel: BUG: unable to handle kernel paging
request at virtual address 8000086c
Feb 24 15:47:00 purple kernel: printing eip:
Feb 24 15:47:00 purple kernel: c0130a9e
Feb 24 15:47:00 purple kernel: *pde = 00000000
Feb 24 15:47:00 purple kernel: Oops: 0000 [#1]
Feb 24 15:47:01 purple kernel: SMP
Feb 24 15:47:01 purple kernel: Modules linked in: ipv6 button ac battery
dm_snapshot dm_mirror dm_mod loop snd_intel8x0 snd_ac97_codec
snd_ac97_bus snd_pcm snd_timer snd snd_page_alloc tsdev parport_pc
parport analog i810_audio ac97_codec floppy gameport serio_raw
i2c_sis96x rtc soundcore shpchp pci_hotplug pcspkr psmouse i2c_core
sis_agp agpgart evdev ext3 jbd mbcache ide_disk ehci_hcd ohci_hcd sis900
mii usbcore sis5513 generic ide_core thermal processor fan
Feb 24 15:47:01 purple kernel: CPU: 0
Feb 24 15:47:01 purple kernel: EIP: 0060:[<c0130a9e>] Not tainted
VLI
Feb 24 15:47:01 purple kernel: EFLAGS: 00210282 (2.6.18-6-686 #1)
Feb 24 15:47:01 purple kernel: EIP is at futex_wake+0x88/0xb3
Feb 24 15:47:01 purple kernel: eax: c11f2b98 ebx: c0364cfc ecx:
c1364d18 edx: 8000086c
Feb 24 15:47:01 purple kernel: esi: 8000086c edi: 00000000 ebp:
c0364d00 esp: d46fbea4
Feb 24 15:47:01 purple kernel: ds: 007b es: 007b ss: 0068
Feb 24 15:47:01 purple kernel: Process btlaunchmanycur (pid: 5937,
ti=d46fa000 task=dbf9faa0 task.ti=d46fa000)
Feb 24 15:47:01 purple kernel: Stack: 00000001 0835e000 db4c9040
00000cf8 0835ecf8 00000000 00000000 ffffffda
Feb 24 15:47:01 purple kernel: c013168e 00000000 dbf9faa0
c012d92d 00000001 0835ecf8 c0290220 c66f4300
Feb 24 15:47:01 purple kernel: d46fbef4 00000010 c0220a1f
c0220a39 50000002 9ab0f74d 00000000 00000000
Feb 24 15:47:01 purple kernel: Call Trace:
Feb 24 15:47:01 purple kernel: [<c013168e>] do_futex+0x20d/0xab7
Feb 24 15:47:01 purple kernel: [<c012d92d>] autoremove_wake_function
+0x0/0x2d
Feb 24 15:47:01 purple kernel: [<c0220a1f>] sys_connect+0x7d/0xa9
Feb 24 15:47:01 purple kernel: [<c0220a39>] sys_connect+0x97/0xa9
Feb 24 15:47:01 purple kernel: [<c017c0fe>] inotify_d_instantiate
+0x36/0x59
Feb 24 15:47:01 purple kernel: [<c016d342>] d_rehash+0x52/0x62
Feb 24 15:47:01 purple kernel: [<c0221655>] sys_recv+0x19/0x1d
Feb 24 15:47:01 purple kernel: [<c0132014>] sys_futex+0xdc/0xf1
Feb 24 15:47:01 purple kernel: [<c0102c11>] sysenter_past_esp+0x56/0x79
Feb 24 15:47:01 purple kernel: Code: 3b 44 24 08 75 23 8b 41 08 3b 44 24
0c 75 1a 83 7a 2c 00 74 07 bf ea ff ff ff eb 15 89 d0 47 e8 1b fd ff ff
3b 3c 24 7d 08 89 f2 <8b> 36 39 ea 75 c0 b0 01 86 03 89 e0 25 00 e0 ff
ff 8b 00 8b 80
Feb 24 15:47:01 purple kernel: EIP: [<c0130a9e>] futex_wake+0x88/0xb3
SS:ESP 0068:d46fbea4
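To make comparing this trace with oopses from other kernels easier, the symbol+offset entries above can be extracted mechanically. The following is only a minimal sketch: the function name parse_frames and the sample text are illustrative and not part of any kernel tool, and it assumes the entries keep the "[<address>] symbol+0xoffset/0xsize" shape shown in the dump (it tolerates the mail line wrap before "+0x").

```python
import re

# Matches entries such as "[<c013168e>] do_futex+0x20d/0xab7".
# \s* before the "+" tolerates entries that were wrapped by the mailer.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\s*\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(text):
    """Return (address, symbol, offset, size) tuples found in oops text."""
    frames = []
    for addr, sym, off, size in FRAME_RE.findall(text):
        frames.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return frames


# A few entries copied from the dump above (hypothetical sample input).
SAMPLE = """\
EIP: [<c0130a9e>] futex_wake+0x88/0xb3
[<c013168e>] do_futex+0x20d/0xab7
[<c012d92d>] autoremove_wake_function
+0x0/0x2d
[<c0132014>] sys_futex+0xdc/0xf1
"""

if __name__ == "__main__":
    for addr, sym, off, size in parse_frames(SAMPLE):
        # The start of the function is the entry address minus the offset.
        start = addr - off
        print(f"{sym}: start=0x{start:08x} offset=0x{off:x} size=0x{size:x}")
```

Feeding the saved syslog excerpt to parse_frames gives one tuple per entry, which can then be diffed against a trace from a 2.6.22-x or 2.6.23-x kernel.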
postcheck:
1. Checked memory with memtest86+.
2. Checked on different machines with the same architecture but
different CPUs.
3. Scanned for rootkits.
4. Exchanged the NIC.
5. Exchanged the switch.
6. Checked the health of the hard disks.
7. Compiled a vanilla 2.6.24-2 kernel.
None of these made a difference except point 7. It seems that running
a vanilla 2.6.24-2 kernel greatly reduces the chance of running into an
oops: it took several days with the 2.6.24-2 kernel, while under the
current 2.6.18-6 kernel (latest security fix of February) it takes
minutes to get the kernel oops.
Updating libc6 from 2.3.6.ds1-13etch5 to 2.7-6 from testing may reduce
the frequency a little. I cannot confirm whether the problem is related
to libc6.
-- System Information:
Debian Release: 4.0
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-6-486
Locale: LANG=nl_NL@euro, LC_CTYPE=nl_NL@euro (charmap=ISO-8859-15)