Bug#317798: kernel-image-2.6.11-1-686-smp: nfs locking can cause a process to hang forever
Package: kernel-image-2.6.11-1-686-smp
Version: 2.6.11-7
Severity: important
Setup:
NFS client machine running kernel-image-2.6.11-1-686-smp (also
verified with kernel-image-2.6.11-1-686).
NFS server running Fedora Core 2, kernel 2.6.9-1.6_FC2smp (yes, this is old).
> dpkg -s evince
Package: evince
Version: 0.3.0-2
My home directory is on NFS, served by the above server. It turns out this
machine has had a kernel oops in the NFS locking code, so it's in a bad state.
I run "strace -f evince" (the problem occurs identically without
strace, but with no diagnostics). The trace ends with this:
[pid 3949] open("/home/marc/.recently-used", O_RDWR) = 16
[pid 3949] fstat64(16, {st_mode=S_IFREG|0600, st_size=5002, ...}) = 0
[pid 3949] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb5f8c000
[pid 3949] _llseek(16, 0, [0], SEEK_SET) = 0
[pid 3949] fcntl64(16, F_SETLK, {type=F_WRLCK, whence=SEEK_CUR, start=0, len=0}
The process now hangs in that call and stays hung until I reboot. A process
should never hang this hard, no matter how broken the server is.
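For anyone who wants to exercise the same call without evince, here is a minimal sketch of the last syscall in the trace above: a non-blocking write lock over the whole file. The function name try_write_lock is hypothetical, and the sketch assumes Python's fcntl.lockf(), which is implemented on top of fcntl() record locks, so LOCK_EX | LOCK_NB should correspond to F_SETLK with F_WRLCK.

```python
import fcntl
import os


def try_write_lock(path):
    """Attempt a non-blocking whole-file write lock, as in the strace output.

    Returns True if the lock was granted, False if a conflicting lock exists.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # Mirror the _llseek(fd, 0, SEEK_SET) that precedes the lock call.
        os.lseek(fd, 0, os.SEEK_SET)
        try:
            # len=0 means "to end of file"; start=0 is relative to SEEK_CUR.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 0, 0, os.SEEK_CUR)
        except (BlockingIOError, PermissionError):
            # EAGAIN or EACCES: someone else holds a conflicting lock.
            return False
        # Release the lock again so the probe leaves no state behind.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
```

On a healthy filesystem, try_write_lock("/home/marc/.recently-used") should return immediately. On a client in the state described here, I would expect it to block in the kernel the same way evince does.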
According to sysrq, the evince process is hung in the nfs locking
code:
evince D C03802C0 0 3943 3870 3944 (NOTLB)
f75b1b3c 00000082 f7257020 c03802c0 00000000 00000040 f75f1004 00000000
f706a400 00000282 df8d1480 00000000 def39140 000f426e f7257170 df8d1480
f75b1b78 df8d150c f75b1b58 f8c63c6a df8d1480 00000282 f70e6200 f75b0000
Call Trace:
[<f8c63c6a>] __rpc_execute+0x13a/0x380 [sunrpc]
[<c012b520>] autoremove_wake_function+0x0/0x60
[<c012b520>] autoremove_wake_function+0x0/0x60
[<f8c641e6>] rpc_new_task+0x36/0xb0 [sunrpc]
[<f8c5f730>] rpc_call_sync+0x70/0xb0 [sunrpc]
[<f8c4cd8e>] nlmclnt_call+0xae/0x200 [lockd]
[<f8c4d235>] nlmclnt_lock+0x55/0x110 [lockd]
[<c0164a4a>] locks_copy_lock+0x8a/0x90
[<f8c4caa0>] nlmclnt_proc+0x240/0x330 [lockd]
[<f8cc3c83>] do_setlk+0x83/0x1a0 [nfs]
[<c0167077>] fcntl_setlk+0x2a7/0x300
[<c0143968>] vma_link+0x48/0xc0
[<c01445df>] do_mmap_pgoff+0x45f/0x780
[<c0162757>] do_fcntl+0xd7/0x190
[<c0162948>] sys_fcntl64+0xa8/0xc0
[<c0102f33>] syscall_call+0x7/0xb
At worst, the lock request should time out and return an error.
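Until the kernel side behaves, an application could approximate a timeout by probing the lock from a throwaway child process, so that only the child gets stuck. This is a user-space workaround sketch, not a fix for the behaviour reported here. The name probe_lock, the status strings, and the 5-second default are all assumptions made for illustration.

```python
import fcntl
import os
import time


def probe_lock(path, timeout=5.0):
    """Try a non-blocking write lock in a child process.

    Returns "ok", "denied", "error", or "timeout". The lock itself is
    dropped when the child exits, so this only tests whether locking works.
    """
    pid = os.fork()
    if pid == 0:
        # Child: this is the process that may hang in the kernel.
        status = 2  # generic error unless proven otherwise
        try:
            fd = os.open(path, os.O_RDWR)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                status = 0
            except (BlockingIOError, PermissionError):
                status = 1  # a conflicting lock is held elsewhere
        except OSError:
            status = 2
        finally:
            os._exit(status)

    # Parent: poll for the child instead of blocking on it.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done, wait_status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            if os.WIFEXITED(wait_status):
                code = os.WEXITSTATUS(wait_status)
                return {0: "ok", 1: "denied"}.get(code, "error")
            return "error"
        time.sleep(0.05)
    # The child is left behind; if it is in uninterruptible sleep it
    # cannot be killed, but the caller is free to carry on.
    return "timeout"
```

A caller could skip locking (or refuse to touch the file) when probe_lock() returns "timeout". The stuck child would still linger until reboot, so this limits the damage rather than removing it.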
Marc
-- System Information:
Debian Release: 3.1
APT prefers testing
APT policy: (990, 'testing'), (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.11-1-686
Locale: LANG=en_US.ISO8859-1, LC_CTYPE=en_US.ISO8859-1 (charmap=ISO-8859-1)
Versions of packages kernel-image-2.6.11-1-686-smp depends on:
ii coreutils [fileutils] 5.2.1-2 The GNU core utilities
ii initrd-tools 0.1.77 tools to create initrd image for p
ii module-init-tools 3.2-pre1-2 tools for managing Linux kernel mo
-- no debconf information