
Bug#463808: kernel crash (Cause : 0000041c)



Hi Thomas and Martin

On Tue, 05/02/2008 at 20.17 +0100, Thomas Bogendoerfer wrote:
> On Tue, Feb 05, 2008 at 11:04:28AM -0700, Martin Michlmayr wrote:
> > CCing Thomas Bogendoerfer
> 
> this looks like a userspace program accessing some mmap-ped data,
> which is triggering a dbe. The first question is which application is it ?
> My first thought was an X server, but I don't even know, if there is
> one for the O2. What program are running, when the machine crashes ?

The crash happened in all scenarios: using X11 (via framebuffer) and
mozilla, or X11 and sylpheed, or a tty while compiling the kernel, or a
tty while managing email via courier-imapd-ssl.

Based on this[0] comment, I think that the problem is in libc6.
The machine was working very well, with all software from etch.

A few days ago I decided to use the latest kernel from unstable (this is
2.6.24-2), but that kernel cannot be compiled on etch because it requires
gcc-4.2. So I updated gcc, g++, binutils, libstdc++, libgcc1 and libc6
from unstable.

After this update nothing worked.
When I switched back to libc6/gcc/g++/libstdc++/libgcc1/binutils from
etch, everything went well.

Currently the machine is running etch. The current mapping of those
libraries is:

giuseppe@sgi:~$ ldd /usr/bin/sylpheed  | grep -E '(libc\.|libstdc\+\+\.|libgcc)' 
	libc.so.6 => /lib/libc.so.6 (0x2c2a4000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x2c5c8000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x2c718000)

When trying to get back to my original configuration, I switched the
kernel to 2.6.23.1 (leaving the libraries from unstable) and tried to
recompile the kernel source using make-kpkg or dpkg-buildpackage. What
happened was that many programs, such as "diff" and "patch", started
segfaulting (signals 11 and 10).

> 
> > > Got dbe at 0x2ac2bffc
> > > [...]
> 
> > > Index: 20 pgmask=4kb va=0007fd06000 asid=6b
> > >         [pa=0000c67b000 c=3 d=1 v=1 g=0] [pa=000032bf000 c=3 d=0 v=0 g=0]
> > > Index: 21 pgmask=4kb va=0002acf0000 asid=6b
> > >         [pa=0000d1c1000 c=3 d=0 v=1 g=0] [pa=0000c188000 c=3 d=0 v=0 g=0]
> > > Index: 25 pgmask=4kb va=0002acf0000 asid=6b
> > >         [pa=0000d1c1000 c=3 d=0 v=1 g=0] [pa=0000c188000 c=3 d=0 v=0 g=0]
> > > Index: 41 pgmask=4kb va=0002aac8000 asid=6b
> > >         [pa=0000368e000 c=3 d=0 v=0 g=0] [pa=000506c0000 c=3 d=1 v=1 g=0]
> 
> these are the only TLBs which are user space. What looks strange to me
> is Index 21 and Index 25. Both are mapping the same page. Looking at
> the R4600/4700 manual (didn't have a R5k manual handy) indicates no
> problem with duplicate TLB entries. I can't check, if pyhsical addresses
> are correct, because I don't know how memory is mapped. I need to see the
> CRIME MC messages from bootup. Even better is a complete boot log.

This is a boot log from the kernel shipped with Debian unstable.

Feb  4 02:25:44 sgi kernel: Linux version 2.6.24-1-r5k-ip32 (Debian 2.6.24-2) (waldi@debian.org) (gcc version 4.1.3 20080114 (prerelease) (Debian 4.1.2-19)) #1 Fri Feb 1 07:29:41 UTC 2008
Feb  4 02:25:44 sgi kernel: ARCH: SGI-IP32
Feb  4 02:25:44 sgi kernel: PROMLIB: ARC firmware Version 1 Revision 10
Feb  4 02:25:44 sgi kernel: CRIME id a rev 1 at 0x0000000014000000
Feb  4 02:25:44 sgi kernel: CRIME MC: bank 0 base 0x0000000000000000 size 128MiB
Feb  4 02:25:44 sgi kernel: CRIME MC: bank 1 base 0x0000000008000000 size 128MiB
Feb  4 02:25:44 sgi kernel: CRIME MC: bank 2 base 0x0000000050000000 size 32MiB
Feb  4 02:25:44 sgi kernel: CRIME MC: bank 3 base 0x0000000052000000 size 32MiB
Feb  4 02:25:44 sgi kernel: CRIME MC: bank 4 base 0x0000000054000000 size 32MiB
Feb  4 02:25:44 sgi kernel: CRIME MC: bank 5 base 0x0000000056000000 size 32MiB
Feb  4 02:25:44 sgi kernel: CPU revision is: 00002321 (R5000)
Feb  4 02:25:44 sgi kernel: FPU revision is: 00002310
Feb  4 02:25:44 sgi kernel: Determined physical RAM map:
Feb  4 02:25:44 sgi kernel:  memory: 0000000010000000 @ 0000000000000000 (usable)
Feb  4 02:25:44 sgi kernel:  memory: 0000000008000000 @ 0000000050000000 (usable)
Feb  4 02:25:44 sgi kernel: Entering add_active_range(0, 0, 65536) 0 entries of 256 used
Feb  4 02:25:44 sgi kernel: Entering add_active_range(0, 327680, 360448) 1 entries of 256 used
Feb  4 02:25:44 sgi kernel: Initrd not found or empty - disabling initrd
Feb  4 02:25:44 sgi kernel: Zone PFN ranges:
Feb  4 02:25:44 sgi kernel:   Normal          0 ->   360448
Feb  4 02:25:44 sgi kernel: Movable zone start PFN for each node
Feb  4 02:25:44 sgi kernel: early_node_map[2] active PFN ranges
Feb  4 02:25:44 sgi kernel:     0:        0 ->    65536
Feb  4 02:25:44 sgi kernel:     0:   327680 ->   360448
Feb  4 02:25:44 sgi kernel: On node 0 totalpages: 98304
Feb  4 02:25:44 sgi kernel:   Normal zone: 4928 pages used for memmap
Feb  4 02:25:44 sgi kernel:   Normal zone: 0 pages reserved
Feb  4 02:25:44 sgi kernel:   Normal zone: 93376 pages, LIFO batch:31
Feb  4 02:25:44 sgi kernel:   Movable zone: 0 pages used for memmap
Feb  4 02:25:44 sgi kernel: Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 93376
Feb  4 02:25:44 sgi kernel: Kernel command line: root=/dev/sda1 ro video=gbefb:1280x1024-16@60 console=tty0 console=ttyS1,115200
Feb  4 02:25:44 sgi kernel: Primary instruction cache 32kB, VIPT, 2-way, linesize 32 bytes.
Feb  4 02:25:44 sgi kernel: Primary data cache 32kB, 2-way, VIPT, cache aliases, linesize 32 bytes
Feb  4 02:25:44 sgi kernel: R5000 SCACHE size 1024kB, linesize 32 bytes.
Feb  4 02:25:44 sgi kernel: Synthesized clear page handler (15 instructions).
Feb  4 02:25:44 sgi kernel: Synthesized copy page handler (24 instructions).
Feb  4 02:25:44 sgi kernel: Synthesized TLB refill handler (38 instructions).
Feb  4 02:25:44 sgi kernel: Synthesized TLB load handler fastpath (51 instructions).
Feb  4 02:25:44 sgi kernel: Synthesized TLB store handler fastpath (51 instructions).
Feb  4 02:25:44 sgi kernel: Synthesized TLB modify handler fastpath (50 instructions).
Feb  4 02:25:44 sgi kernel: PID hash table entries: 2048 (order: 11, 16384 bytes)
Feb  4 02:25:44 sgi kernel: Calibrating system timer... 200 MHz CPU detected
Feb  4 02:25:44 sgi kernel: CRIME memory error at 0x3fffffe0 ST 0x0400a828<INV,RE,REID=0x28,NONFATAL>
Feb  4 02:25:44 sgi kernel: Console: colour dummy device 80x25
Feb  4 02:25:44 sgi kernel: console [tty0] enabled
Feb  4 02:25:44 sgi kernel: Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
Feb  4 02:25:44 sgi kernel: Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
Feb  4 02:25:44 sgi kernel: Memory: 367492k/393216k available (3196k kernel code, 25216k reserved, 1019k data, 200k init, 0k highmem)
Feb  4 02:25:44 sgi kernel: Calibrating delay loop... 199.16 BogoMIPS (lpj=398336)
Feb  4 02:25:44 sgi kernel: Security Framework initialized
Feb  4 02:25:44 sgi kernel: SELinux:  Disabled at boot.
Feb  4 02:25:44 sgi kernel: Capability LSM initialized
Feb  4 02:25:44 sgi kernel: Mount-cache hash table entries: 256
Feb  4 02:25:44 sgi kernel: Initializing cgroup subsys ns
Feb  4 02:25:44 sgi kernel: Initializing cgroup subsys cpuacct
Feb  4 02:25:44 sgi kernel: Checking for the multiply/shift bug... no.
Feb  4 02:25:44 sgi kernel: Checking for the daddi bug... no.
Feb  4 02:25:44 sgi kernel: Checking for the daddiu bug... no.
Feb  4 02:25:44 sgi kernel: net_namespace: 120 bytes
Feb  4 02:25:44 sgi kernel: NET: Registered protocol family 16
Feb  4 02:25:44 sgi kernel: MACE PCI rev 1
Feb  4 02:25:44 sgi kernel: SCSI subsystem initialized
Feb  4 02:25:44 sgi kernel: PCI: Bridge: 0000:00:03.0
Feb  4 02:25:44 sgi kernel:   IO window: 1000-1fff
Feb  4 02:25:44 sgi kernel:   MEM window: 80000000-800fffff
Feb  4 02:25:44 sgi kernel:   PREFETCH window: 80100000-801fffff
Feb  4 02:25:44 sgi kernel: PCI: Enabling device 0000:00:03.0 (0000 -> 0003)
Feb  4 02:25:44 sgi kernel: PCI: Setting latency timer of device 0000:00:03.0 to 64
Feb  4 02:25:44 sgi kernel: Time: MIPS clocksource has been installed.
Feb  4 02:25:44 sgi kernel: NET: Registered protocol family 2
Feb  4 02:25:44 sgi kernel: IP route cache hash table entries: 4096 (order: 3, 32768 bytes)
Feb  4 02:25:44 sgi kernel: TCP established hash table entries: 16384 (order: 6, 262144 bytes)
Feb  4 02:25:44 sgi kernel: TCP bind hash table entries: 16384 (order: 5, 131072 bytes)
Feb  4 02:25:44 sgi kernel: TCP: Hash tables configured (established 16384 bind 16384)
Feb  4 02:25:44 sgi kernel: TCP reno registered
Feb  4 02:25:44 sgi kernel: audit: initializing netlink socket (disabled)
Feb  4 02:25:44 sgi kernel: audit(1202088249.428:1): initialized
Feb  4 02:25:44 sgi kernel: VFS: Disk quotas dquot_6.5.1
Feb  4 02:25:44 sgi kernel: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
Feb  4 02:25:44 sgi kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
Feb  4 02:25:44 sgi kernel: io scheduler noop registered
Feb  4 02:25:44 sgi kernel: io scheduler anticipatory registered
Feb  4 02:25:44 sgi kernel: io scheduler deadline registered
Feb  4 02:25:44 sgi kernel: io scheduler cfq registered (default)
Feb  4 02:25:44 sgi kernel: Console: switching to colour frame buffer device 160x64
Feb  4 02:25:44 sgi kernel: fb0: SGI GBE rev 1 @ 0x16000000 using 4096kB memory
Feb  4 02:25:44 sgi kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
Feb  4 02:25:44 sgi kernel: serial8250.0: ttyS0 at MMIO 0x1f390000 (irq = 60) is a 16550A
Feb  4 02:25:44 sgi kernel: serial8250.0: ttyS1 at MMIO 0x1f398000 (irq = 66) is a 16550A
Feb  4 02:25:44 sgi kernel: console [ttyS1] enabled
Feb  4 02:25:44 sgi kernel: RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
Feb  4 02:25:44 sgi kernel: eth0: SGI MACE Ethernet rev. 1
Feb  4 02:25:44 sgi kernel: PCI: Enabling device 0000:00:01.0 (0046 -> 0047)
Feb  4 02:25:44 sgi kernel: ahc_pci:0:1:0: Using left over BIOS settings
Feb  4 02:25:44 sgi kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
Feb  4 02:25:44 sgi kernel:         <Adaptec aic7880 Ultra SCSI adapter>
Feb  4 02:25:44 sgi kernel:         aic7880: Wide Channel A, SCSI Id=0, 16/253 SCBs
Feb  4 02:25:44 sgi kernel: 
Feb  4 02:25:44 sgi kernel: PCI: Enabling device 0000:00:02.0 (0046 -> 0047)
Feb  4 02:25:44 sgi kernel: ahc_pci:0:2:0: Using left over BIOS settings
Feb  4 02:25:44 sgi kernel: scsi 0:0:1:0: Direct-Access     ModusLnk                       PQ: 0 ANSI: 3
Feb  4 02:25:44 sgi kernel: scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
Feb  4 02:25:44 sgi kernel:  target0:0:1: Beginning Domain Validation
Feb  4 02:25:44 sgi kernel:  target0:0:1: wide asynchronous
Feb  4 02:25:44 sgi kernel:  target0:0:1: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 8)
Feb  4 02:25:44 sgi kernel:  target0:0:1: Domain Validation skipping write tests
Feb  4 02:25:44 sgi kernel:  target0:0:1: Ending Domain Validation
Feb  4 02:25:44 sgi kernel: scsi 0:0:2:0: Direct-Access     FUJITSU  MAG3182LC        5210 PQ: 0 ANSI: 2
Feb  4 02:25:44 sgi kernel: scsi0:A:2:0: Tagged Queuing enabled.  Depth 32
Feb  4 02:25:44 sgi kernel:  target0:0:2: Beginning Domain Validation
Feb  4 02:25:44 sgi kernel:  target0:0:2: wide asynchronous
Feb  4 02:25:44 sgi kernel:  target0:0:2: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 8)
Feb  4 02:25:44 sgi kernel:  target0:0:2: Domain Validation skipping write tests
Feb  4 02:25:44 sgi kernel:  target0:0:2: Ending Domain Validation
Feb  4 02:25:44 sgi kernel: scsi 0:0:4:0: CD-ROM            TOSHIBA  CD-ROM XM-5701TA 0167 PQ: 0 ANSI: 2
Feb  4 02:25:44 sgi kernel:  target0:0:4: Beginning Domain Validation
Feb  4 02:25:44 sgi kernel:  target0:0:4: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 8)
Feb  4 02:25:44 sgi kernel:  target0:0:4: Domain Validation skipping write tests
Feb  4 02:25:44 sgi kernel:  target0:0:4: Ending Domain Validation
Feb  4 02:25:44 sgi kernel: scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
Feb  4 02:25:44 sgi kernel:         <Adaptec aic7880 Ultra SCSI adapter>
Feb  4 02:25:44 sgi kernel:         aic7880: Wide Channel A, SCSI Id=0, 16/253 SCBs
Feb  4 02:25:44 sgi kernel: 
Feb  4 02:25:44 sgi kernel: Driver 'sd' needs updating - please use bus_type methods
Feb  4 02:25:44 sgi kernel: (scsi0:A:1:0): data overrun detected in Data-in phase.  Tag == 0x2.
Feb  4 02:25:44 sgi kernel: (scsi0:A:1:0): Have seen Data Phase.  Length = 0.  NumSGs = 1.
Feb  4 02:25:44 sgi kernel: sg[0] - Addr 0x017e0c040 : Length 32
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] 143638992 512-byte hardware sectors (73543 MB)
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write Protect is off
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Mode Sense: b3 00 00 08
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] 143638992 512-byte hardware sectors (73543 MB)
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write Protect is off
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Mode Sense: b3 00 00 08
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb  4 02:25:44 sgi kernel:  sda: sda1 sda9 sda11
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Attached SCSI disk
Feb  4 02:25:44 sgi kernel:  target0:0:2: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 8)
Feb  4 02:25:44 sgi kernel: (scsi0:A:2:0): data overrun detected in Data-in phase.  Tag == 0x2.
Feb  4 02:25:44 sgi kernel: (scsi0:A:2:0): Have seen Data Phase.  Length = 0.  NumSGs = 1.
Feb  4 02:25:44 sgi kernel: sg[0] - Addr 0x017e0c040 : Length 32
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] 35694860 512-byte hardware sectors (18276 MB)
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write Protect is off
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Mode Sense: a7 00 10 08
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] 35694860 512-byte hardware sectors (18276 MB)
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write Protect is off
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Mode Sense: a7 00 10 08
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
Feb  4 02:25:44 sgi kernel:  sdb: sdb1 sdb2 sdb3 sdb9 sdb11
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Attached SCSI disk
Feb  4 02:25:44 sgi kernel: Driver 'sr' needs updating - please use bus_type methods
Feb  4 02:25:44 sgi kernel: sr0: scsi-1 drive
Feb  4 02:25:44 sgi kernel: Uniform CD-ROM driver Revision: 3.20
Feb  4 02:25:44 sgi kernel: sr 0:0:4:0: Attached scsi CD-ROM sr0
Feb  4 02:25:44 sgi kernel: mice: PS/2 mouse device common for all mice
Feb  4 02:25:44 sgi kernel: input: AT Raw Set 2 keyboard as /class/input/input0
Feb  4 02:25:44 sgi kernel: TCP bic registered
Feb  4 02:25:44 sgi kernel: Initializing XFRM netlink socket
Feb  4 02:25:44 sgi kernel: NET: Registered protocol family 1
Feb  4 02:25:44 sgi kernel: NET: Registered protocol family 17
Feb  4 02:25:44 sgi kernel: NET: Registered protocol family 15
Feb  4 02:25:44 sgi kernel: RPC: Registered udp transport module.
Feb  4 02:25:44 sgi kernel: RPC: Registered tcp transport module.
Feb  4 02:25:44 sgi kernel: registered taskstats version 1
Feb  4 02:25:44 sgi kernel: scsi: waiting for bus probes to complete ...
Feb  4 02:25:44 sgi kernel: input: PS/2 Logitech Mouse as /class/input/input1
Feb  4 02:25:44 sgi kernel: EXT3-fs: INFO: recovery required on readonly filesystem.
Feb  4 02:25:44 sgi kernel: EXT3-fs: write access will be enabled during recovery.
Feb  4 02:25:44 sgi kernel: kjournald starting.  Commit interval 5 seconds
Feb  4 02:25:44 sgi kernel: EXT3-fs: sda1: orphan cleanup on readonly fs
Feb  4 02:25:44 sgi kernel: ext3_orphan_cleanup: deleting unreferenced inode 6276169
Feb  4 02:25:44 sgi kernel: EXT3-fs: sda1: 1 orphan inode deleted
Feb  4 02:25:44 sgi kernel: EXT3-fs: recovery complete.
Feb  4 02:25:44 sgi kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb  4 02:25:44 sgi kernel: VFS: Mounted root (ext3 filesystem) readonly.
Feb  4 02:25:44 sgi kernel: Freeing unused kernel memory: 200k freed
Feb  4 02:25:44 sgi kernel: sd 0:0:1:0: Attached scsi generic sg0 type 0
Feb  4 02:25:44 sgi kernel: sd 0:0:2:0: Attached scsi generic sg1 type 0
Feb  4 02:25:44 sgi kernel: sr 0:0:4:0: Attached scsi generic sg2 type 5
Feb  4 02:25:44 sgi kernel: Adding 233464k swap on /dev/sdb3.  Priority:-1 extents:1 across:233464k
Feb  4 02:25:44 sgi kernel: EXT3 FS on sda1, internal journal
Feb  4 02:25:44 sgi kernel: device-mapper: uevent: version 1.0.3
Feb  4 02:25:44 sgi kernel: device-mapper: ioctl: 4.12.0-ioctl (2007-10-02) initialised: dm-devel@redhat.com
Feb  4 02:25:44 sgi kernel: process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT
Feb  4 02:25:49 sgi kernel: NET: Registered protocol family 10
Feb  4 02:25:49 sgi kernel: lo: Disabled Privacy Extensions
Feb  4 02:25:51 sgi kernel: lp: driver loaded but no devices found
Feb  4 02:25:59 sgi kernel: eth0: no IPv6 routers present
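
As a rough cross-check of the physical addresses asked about above, the
pa= values from the quoted TLB dump can be compared against the RAM map
in this log. This is only a minimal Python sketch, assuming 4 KiB pages
(as the pgmask=4kb entries suggest); the names RAM_RANGES, TLB_PAS,
pfn_ranges and in_ram are made up for illustration and are not part of
any kernel tool.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, matching pgmask=4kb in the TLB dump

# (base, size) pairs copied from "Determined physical RAM map" in the boot log
RAM_RANGES = [
    (0x00000000, 0x10000000),
    (0x50000000, 0x08000000),
]

# pa= values copied from the user-space TLB entries quoted earlier
TLB_PAS = [
    0x0C67B000, 0x032BF000,  # Index 20
    0x0D1C1000, 0x0C188000,  # Index 21 and Index 25 (identical entries)
    0x0368E000, 0x506C0000,  # Index 41
]


def pfn_ranges(ranges, page_size=PAGE_SIZE):
    """Convert (base, size) byte ranges into (start_pfn, end_pfn) pairs."""
    return [(base // page_size, (base + size) // page_size)
            for base, size in ranges]


def in_ram(pa, ranges=RAM_RANGES):
    """Return True if a physical address lies inside one of the RAM ranges."""
    return any(base <= pa < base + size for base, size in ranges)


if __name__ == "__main__":
    for start, end in pfn_ranges(RAM_RANGES):
        print(f"PFN range: {start} -> {end}")
    for pa in TLB_PAS:
        print(f"pa={pa:#011x} inside usable RAM: {in_ram(pa)}")
```

The PFN ranges computed this way should agree with the add_active_range
lines in the log (0 -> 65536 and 327680 -> 360448). The check only says
whether an address lies inside usable RAM; it cannot tell whether the
mapping itself is the right one.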

Bye,
Giuseppe



