
Bug#368353: marked as done ("BUG: soft lockup detected on CPU#1!" in __d_lookup during cp -l)



Your message dated Thu, 11 Jan 2007 14:14:35 +0100
with message-id <20070111131435.GL26700@baikonur.stro.at>
and subject line Bug#368353: "BUG: soft lockup detected on CPU#1!" in __d_lookup during cp -l
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
Package: linux-image-2.6.15-1-amd64-k8-smp
Version: 2.6.15-8
Severity: important

I am seeing occasional overnight system hangs.  At least one has been caused
by the following kernel problem (syslog entry follows)...

May 21 01:30:39 black kernel:  <3>BUG: soft lockup detected on CPU#1!
May 21 01:30:39 black kernel: CPU 1:
May 21 01:30:39 black kernel: Modules linked in: nfsd exportfs lp button ac battery ipv6 nfs lockd nfs_acl sunrpc sk98lin nls_iso8859_1 nls_cp437 vfat fat dm_mod sr_mod sbp2 ide_generic ide_disk analog eth1394 snd_mpu401 snd_mpu401_uart gameport snd_intel8x0 snd_ac97_codec psmouse snd_ac97_bus parport_pc parport pcspkr snd_rawmidi snd_seq_device serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc floppy i2c_nforce2 joydev evdev i2c_core ext3 jbd mbcache ide_cd cdrom sd_mod sata_nv libata scsi_mod skge ohci1394 ieee1394 forcedeth generic amd74xx ide_core ohci_hcd ehci_hcd thermal processor fan
May 21 01:30:39 black kernel: Pid: 10739, comm: cp Not tainted 2.6.15-1-amd64-k8-smp #2
May 21 01:30:39 black kernel: RIP: 0010:[__d_lookup+221/254] <ffffffff80187ed8>{__d_lookup+221}
May 21 01:30:39 black kernel: RSP: 0018:ffff810036e0bc78  EFLAGS: 00000286
May 21 01:30:39 black kernel: RAX: ffff81000b53cd08 RBX: ffff81004e05c270 RCX: 0000000000000012
May 21 01:30:39 black kernel: RDX: 0000000000028a0e RSI: 01870610903e8a0e RDI: ffff810024cd5cf8
May 21 01:30:39 black kernel: RBP: ffff810036e0bc38 R08: ffff810037112061 R09: 0000000000000001
May 21 01:30:39 black kernel: R10: 0000000000000004 R11: ffffffff801b91c0 R12: ffffffff88143c92
May 21 01:30:39 black kernel: R13: 0000000000000006 R14: ffff81001bb05720 R15: 0000000000000000
May 21 01:30:39 black kernel: FS:  00002aaaab36b6d0(0000) GS:ffffffff803e3880(0000) knlGS:00000000556d56b0
May 21 01:30:39 black kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 21 01:30:39 black kernel: CR2: 0000000001f76028 CR3: 0000000014434000 CR4: 00000000000006e0
May 21 01:30:39 black kernel:
May 21 01:30:39 black kernel: Call Trace:<ffffffff80187ea9>{__d_lookup+174} <ffffffff8017e5b3>{do_lookup+42}
May 21 01:30:39 black kernel:        <ffffffff8017f07f>{__link_path_walk+2413} <ffffffff8017f5b0>{link_path_walk+89}
May 21 01:30:39 black kernel:        <ffffffff80181bca>{sys_link+221} <ffffffff8017fb10>{path_lookup+430}
May 21 01:30:39 black kernel:        <ffffffff8017fc30>{__user_walk+45} <ffffffff80179be1>{vfs_lstat+21}
May 21 01:30:39 black kernel:        <ffffffff80181bca>{sys_link+221} <ffffffff8017a0de>{sys_newlstat+17}
May 21 01:30:39 black kernel:        <ffffffff8010d792>{system_call+126}

The problem seems to have occurred during the file copy phase of an "rsnapshot"
run (which happens every night).  During this phase, a recursive "cp -l"
command is used on a very large directory tree (to create a copy tree of
files hardlinked to the original tree).
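
For illustration only, the hard-linking phase described above can be
approximated with the Python sketch below.  It is not how cp or rsnapshot
is implemented, and the function name hardlink_tree is hypothetical.  The
point is that every file costs at least one stat-style call and one
link-style call, each of which has to resolve a path name, so a very large
tree produces a very large number of directory entry lookups.

```python
import os
import stat


def hardlink_tree(src: str, dst: str) -> int:
    """Mirror the directory tree at src under dst, hard-linking regular files.

    Hypothetical helper for illustration; returns the number of links created.
    Symlinks and special files are skipped in this sketch.
    """
    linked = 0
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            # lstat() examines the entry itself rather than following symlinks.
            info = os.lstat(source)
            if stat.S_ISREG(info.st_mode):
                # A hard link adds a second name for the same inode;
                # no file data is copied.
                os.link(source, os.path.join(target_dir, name))
                linked += 1
    return linked
```

Calling hardlink_tree("/path/to/original", "/path/to/copy") (both paths are
placeholders) would build the copy tree; source and destination must be on
the same filesystem for hard links to work.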

From other logs, it appears that there may have been a simultaneous "ls -R"
command being run in another process, although this may not be relevant.

Note that this is a dual-core AMD64 system running an SMP kernel so maybe this 
could be a locking problem.  I have not been able to find any similar report
in the Debian bugs system or reported in LKML.

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (990, 'testing')
Architecture: amd64 (x86_64)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.15-1-amd64-k8-smp
Locale: LANG=en_IE@euro, LC_CTYPE=en_IE@euro (charmap=ISO-8859-15) (ignored: LC_ALL set to en_IE@euro)

Versions of packages linux-image-2.6.15-1-amd64-k8-smp depends on:
ii  e2fsprogs     1.38+1.39-WIP-2006.04.09-1 ext2 file system utilities and lib
ii  initramfs-too 0.60                       tools for generating an initramfs
ii  module-init-t 3.2.2-2                    tools for managing Linux kernel mo

linux-image-2.6.15-1-amd64-k8-smp recommends no packages.

-- debconf information:
  linux-image-2.6.15-1-amd64-k8-smp/postinst/old-dir-initrd-link-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/postinst/depmod-error-initrd-2.6.15-1-amd64-k8-smp: false
  linux-image-2.6.15-1-amd64-k8-smp/prerm/removing-running-kernel-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/preinst/abort-overwrite-2.6.15-1-amd64-k8-smp:
  linux-image-2.6.15-1-amd64-k8-smp/postinst/create-kimage-link-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/preinst/lilo-has-ramdisk:
  linux-image-2.6.15-1-amd64-k8-smp/postinst/bootloader-test-error-2.6.15-1-amd64-k8-smp:
  linux-image-2.6.15-1-amd64-k8-smp/postinst/bootloader-error-2.6.15-1-amd64-k8-smp:
  linux-image-2.6.15-1-amd64-k8-smp/postinst/kimage-is-a-directory:
  linux-image-2.6.15-1-amd64-k8-smp/preinst/abort-install-2.6.15-1-amd64-k8-smp:
  linux-image-2.6.15-1-amd64-k8-smp/postinst/old-initrd-link-2.6.15-1-amd64-k8-smp: true
* linux-image-2.6.15-1-amd64-k8-smp/preinst/already-running-this-2.6.15-1-amd64-k8-smp:
  linux-image-2.6.15-1-amd64-k8-smp/preinst/elilo-initrd-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/prerm/would-invalidate-boot-loader-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/postinst/old-system-map-link-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/preinst/lilo-initrd-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/preinst/initrd-2.6.15-1-amd64-k8-smp:
  linux-image-2.6.15-1-amd64-k8-smp/preinst/bootloader-initrd-2.6.15-1-amd64-k8-smp: true
  linux-image-2.6.15-1-amd64-k8-smp/postinst/depmod-error-2.6.15-1-amd64-k8-smp: false
  linux-image-2.6.15-1-amd64-k8-smp/preinst/failed-to-move-modules-2.6.15-1-amd64-k8-smp:
  linux-image-2.6.15-1-amd64-k8-smp/preinst/overwriting-modules-2.6.15-1-amd64-k8-smp: true


--- End Message ---
--- Begin Message ---
On Thu, Jan 11, 2007 at 11:34:14AM +0000, Graham Cobb wrote:
> On Thursday 11 January 2007 10:00, maximilian attems wrote:
> > On Sun, 21 May 2006, Graham Cobb wrote:
> > > Package: linux-image-2.6.15-1-amd64-k8-smp
> > > Version: 2.6.15-8
> > > Severity: important
> > >
> > > I am seeing occasional overnight system hangs.  At least one has been
> > > caused by the following kernel problem (syslog entry follows)...
> >
> > is that fixed by 2.6.18 linux image for etch?
> 
> I certainly have not seen the problem since I have been running 2.6.18.
> 
> Graham

thanks a lot for the quick good news!!

closing

-- 
maks

--- End Message ---
