
Bug#261893: kernel-image-2.6.6-1-generic: Kernel bug at mm/slab.c:1530



* Jan-Jaap van der Heijden wrote:
With a stock 2.6.6-1-generic kernel it crashes as soon as it
initializes the SCSI controller.

Can you please try the 2.6.7 packages, and report if they fix your
problem (I suppose so)?

http://people.debian.org/~nobse/kernel-image-2.6.7-alpha/


2.6.7 doesn't lock up anymore when it hits the PCI bus. The CIA issue is solved.

But the oops is still there. Here's the trace:
===========================================================
ksymoops 2.4.9 on alpha 2.6.7-1-generic.  Options used
    -V (default)
    -k /proc/ksyms (default)
    -l /proc/modules (default)
    -o /lib/modules/2.6.7-1-generic/ (default)
    -m /boot/System.map-2.6.7-1-generic (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (regular_file): read_ksyms stat /proc/ksyms failed
ksymoops: No such file or directory
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
4096K Bcache detected; load hit latency 28 cycles, load miss latency 106 cycles
Kernel bug at mm/slab.c:1552
modprobe(200): Kernel Bug 1
pc = [<fffffc00003568b4>] ra = [<fffffffc0023adf4>] ps = 0000 Not tainted
Using defaults from ksymoops -t elf64-alpha -a alpha
v0 = 0000000000000000  t0 = 0000000000000000  t1 = fffffc002fffc228
t2 = fffffc0000572fb0  t3 = fffffc002fc75330  t4 = fffffc0000600248
t5 = 0000000000000400  t6 = fffffc002f24a000  t7 = fffffc002f2dc000
a0 = 0000000000000000  a1 = fffffffc002458ed  a2 = ffffffffbaadf00d
a3 = 0000000000000000  a4 = fffffc002f2dfe28  a5 = fffffc002fc75228
t8 = 0000000000000000  t9 = fffffc000035639c  t10= 0000000000000030
t11= 0000000000000040  pv = fffffc0000356874  at = 0000000000000000
gp = fffffc00005f0300  sp = fffffc002f2dfe88
Trace:fffffc000034a0e0 fffffc0000314d14
Code: a0480068 243f1000 2021ff00 44410002 e4400004 00000081 <00000610> 004d0697


PC;  fffffc00003568b4 <kmem_cache_destroy+40/1f0>   <=====

Trace; fffffc000034a0e0 <sys_init_module+1f0/3a4>
Trace; fffffc0000314d14 <entSys+a4/c0>

Code;  fffffc000035689c <kmem_cache_destroy+28/1f0>
0000000000000000 <_PC>:
Code;  fffffc000035689c <kmem_cache_destroy+28/1f0>
  0:   68 00 48 a0       ldl  t1,104(t7)
Code;  fffffc00003568a0 <kmem_cache_destroy+2c/1f0>
  4:   00 10 3f 24       ldah t0,4096
Code;  fffffc00003568a4 <kmem_cache_destroy+30/1f0>
  8:   00 ff 21 20       lda  t0,-256(t0)
Code;  fffffc00003568a8 <kmem_cache_destroy+34/1f0>
  c:   02 00 41 44       and  t1,t0,t1
Code;  fffffc00003568ac <kmem_cache_destroy+38/1f0>
 10:   04 00 40 e4       beq  t1,24 <_PC+0x24>
Code;  fffffc00003568b0 <kmem_cache_destroy+3c/1f0>
 14:   81 00 00 00       bugchk
Code;  fffffc00003568b4 <kmem_cache_destroy+40/1f0>   <=====
 18:   10 06 00 00       call_pal     0x610   <=====
Code;  fffffc00003568b8 <kmem_cache_destroy+44/1f0>
 1c:   97 06 4d 00       call_pal     0x4d0697

SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled

1 warning and 1 error issued.  Results may not be reliable.

===========================================================
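
The instructions just before the bugchk can be decoded by hand. Below is a rough sketch in Python, assuming the usual Alpha semantics for ldah/lda (16-bit sign-extended displacement, shifted left by 16 for ldah, base register zero when none is printed). The helper names are made up for illustration. It computes the mask that is ANDed with the 32-bit word loaded from 104(t7), and shows how the words on the "Code:" line map to the little-endian bytes in the disassembly.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def ldah(base, disp):
    """Assumed Alpha ldah: base + (sign-extended 16-bit disp << 16)."""
    return (base + sign_extend(disp, 16) * 65536) & MASK64


def lda(base, disp):
    """Assumed Alpha lda: base + sign-extended 16-bit disp."""
    return (base + sign_extend(disp, 16)) & MASK64


def word_to_bytes(word):
    """Show a 32-bit instruction word as little-endian hex bytes."""
    return " ".join(f"{b:02x}" for b in word.to_bytes(4, "little"))


# ldah t0,4096 followed by lda t0,-256(t0), as in the trace above
t0 = ldah(0, 4096)
mask = lda(t0, -256)

print(f"mask = {mask:#010x}")
print(word_to_bytes(0xA0480068))  # first word on the Code: line
```

So the code tests the loaded word against 0x0fffff00 and skips the bugchk when the result is zero. My guess is that this is an "am I in interrupt context" style check on a per-thread counter, but I have not verified that against the source. Note also that a0 is zero in the register dump, which could mean a NULL pointer was passed in, although the register may have been overwritten by then.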

The same oops also occurs with 2.6.8 from http://people.debian.org/~nobse/kernel-image-2.6.8-alpha/

I wonder whether the "SGI XFS with ..." line means it's XFS that's the problem here.
In that case I might be hitting a variation of the bug that plagued Suse 9.1: http://portal.suse.com/sdb/en/2004/04/91_xfsfix.html

I browsed the code of their "hotfix" and what it does is "Disable BUGs in the SLAB allocator init. This makes XFS in the 9.1 install kernel work.". Ugh.

1) I'll have a look at the Suse fix that went into their next kernel rpm, to see what the real fix is and whether it works for this case.

2) Somewhere in the early 2.6.x releases, XFS switched from using its own allocator to using the slab allocator. If one of those earlier releases works (with the core_cia patch), it would confirm my XFS/slab allocator suspicion. It would also explain why 2.4.x doesn't have this problem.

Jan-Jaap




