
Bug#317798: kernel-image-2.6.11-1-686-smp: nfs locking can cause a process to hang forever



Package: kernel-image-2.6.11-1-686-smp
Version: 2.6.11-7
Severity: important


Setup:

nfs client machine running kernel-image-2.6.11-1-686-smp (also
verified with kernel-image-2.6.11-1-686).

nfs server running Fedora Core 2, 2.6.9-1.6_FC2smp.  (yes, this is old)

> dpkg -s evince
Package: evince
Version: 0.3.0-2

My home directory is on NFS, served by the machine above.  It turns
out the server has had a kernel oops in the NFS locking code, so it is
in a bad state.  I run "strace -f evince" (the problem occurs
identically without strace, but with no diagnostics).  The trace ends
with this:

[pid  3949] open("/home/marc/.recently-used", O_RDWR) = 16
[pid  3949] fstat64(16, {st_mode=S_IFREG|0600, st_size=5002, ...}) = 0
[pid  3949] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb5f8c000
[pid  3949] _llseek(16, 0, [0], SEEK_SET) = 0
[pid  3949] fcntl64(16, F_SETLK, {type=F_WRLCK, whence=SEEK_CUR, start=0, len=0}

The process now hangs until I reboot.  A process should never hang
this hard, no matter how broken the server is.
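
The lock request at the end of the trace can probably be reproduced
without evince.  The following is a minimal sketch, not something
verified against the affected server: it assumes the hang depends only
on the non-blocking write-lock request (F_SETLK with F_WRLCK over the
whole file) and not on anything else evince does.  The function name
try_write_lock is made up for illustration; Python's fcntl.lockf with
LOCK_EX|LOCK_NB issues an F_SETLK request of the same kind.

```python
import fcntl
import os
import sys


def try_write_lock(path):
    """Try a non-blocking exclusive lock on the whole file.

    Mirrors the sequence in the trace: open O_RDWR, seek to offset 0,
    then request a write lock with start=0, len=0, whence=SEEK_CUR.
    Returns True if the lock was granted, False if it was refused.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        os.lseek(fd, 0, os.SEEK_SET)
        try:
            # LOCK_EX | LOCK_NB maps to fcntl(F_SETLK) with F_WRLCK.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0, os.SEEK_CUR)
        except OSError:
            # A healthy lock manager answers promptly, even when refusing.
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical script name): python trylock.py ~/.recently-used
    granted = try_write_lock(sys.argv[1])
    print("lock granted" if granted else "lock refused")
```

On a local filesystem or a healthy NFS mount this returns immediately
either way.  If the assumption above holds, running it on a file in
the affected NFS home directory would be expected to block inside the
lockf call instead of returning.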

According to sysrq, the evince process is hung in the nfs locking
code:

evince        D C03802C0     0  3943   3870          3944       (NOTLB)
f75b1b3c 00000082 f7257020 c03802c0 00000000 00000040 f75f1004 00000000 
       f706a400 00000282 df8d1480 00000000 def39140 000f426e f7257170 df8d1480 
       f75b1b78 df8d150c f75b1b58 f8c63c6a df8d1480 00000282 f70e6200 f75b0000 
Call Trace:
 [<f8c63c6a>] __rpc_execute+0x13a/0x380 [sunrpc]
 [<c012b520>] autoremove_wake_function+0x0/0x60
 [<c012b520>] autoremove_wake_function+0x0/0x60
 [<f8c641e6>] rpc_new_task+0x36/0xb0 [sunrpc]
 [<f8c5f730>] rpc_call_sync+0x70/0xb0 [sunrpc]
 [<f8c4cd8e>] nlmclnt_call+0xae/0x200 [lockd]
 [<f8c4d235>] nlmclnt_lock+0x55/0x110 [lockd]
 [<c0164a4a>] locks_copy_lock+0x8a/0x90
 [<f8c4caa0>] nlmclnt_proc+0x240/0x330 [lockd]
 [<f8cc3c83>] do_setlk+0x83/0x1a0 [nfs]
 [<c0167077>] fcntl_setlk+0x2a7/0x300
 [<c0143968>] vma_link+0x48/0xc0
 [<c01445df>] do_mmap_pgoff+0x45f/0x780
 [<c0162757>] do_fcntl+0xd7/0x190
 [<c0162948>] sys_fcntl64+0xa8/0xc0
 [<c0102f33>] syscall_call+0x7/0xb

At worst, the lock request should time out and return an error to the
caller.


                Marc

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.11-1-686
Locale: LANG=en_US.ISO8859-1, LC_CTYPE=en_US.ISO8859-1 (charmap=ISO-8859-1)

Versions of packages kernel-image-2.6.11-1-686-smp depends on:
ii  coreutils [fileutils]         5.2.1-2    The GNU core utilities
ii  initrd-tools                  0.1.77     tools to create initrd image for p
ii  module-init-tools             3.2-pre1-2 tools for managing Linux kernel mo

-- no debconf information


