[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#639438: heavy NFS client usage triggers kernel BUG



Package: linux-2.6
Version: 2.6.32-35
Severity: normal

Hi,

heavy NFS client usage seems to trigger the kernel BUG below.  Fedora
seems to have encountered the same on s390x, and there's even a code
snippet identified, but the report timed out:
https://bugzilla.redhat.com/show_bug.cgi?id=508916

Kind regards
Philipp Kern

-- Package-specific info:
** Version:
Linux version 2.6.32-5-s390x (Debian 2.6.32-35) (dannf@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Tue Jun 14 09:22:54 UTC 2011

** Command line:
root=/dev/disk/by-path/ccw-0.0.5204-part2 BOOT_IMAGE=0

** Tainted: D (128)
 * Kernel has oopsed before.

** Kernel log:
[619921.521773] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[619921.521779] ffsb          D 0000000000362ea4     0  3090   3060 0x00040000
[619921.521787] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[619921.521792]        0000000000000000 0000000000000001 0000000000000001 00000000c9bcfc60 
[619921.521799]        0000000000000001 0000000000000000 00000000fac9c830 0000000000000000 
[619921.521806]        0000000000000003 0000000083897300 00000000ffe40e70 0000000000368fd0 
[619921.521813]        0000000000368c20 0000000000362090 00000000c9bcfc60 00000000c9bcfd68 
[619921.521821] Call Trace:
[619921.521823] ([<0000000000362090>] schedule+0xb0c/0xc30)
[619921.521836]  [<0000000000362ea4>] __mutex_lock_slowpath+0xd4/0x14c
[619921.521844]  [<00000000003633a2>] mutex_lock+0x4e/0x6c
[619921.521849]  [<000000000012f54a>] generic_file_llseek+0x36/0x74
[619921.521856]  [<000000000012de98>] SyS_llseek+0x68/0xd4
[619921.521864]  [<00000000000299e6>] sysc_noemu+0x10/0x16
[619921.521872]  [<0000000077fca9ce>] 0x77fca9ce
[622081.545953] INFO: task ffsb:3694 blocked for more than 120 seconds.
[622081.545961] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[622081.545967] ffsb          D 0000000000362ea4     0  3694   3671 0x00040000
[622081.545975] 0000000083897368 00000000003685d8 000000000003e60a 00000000018dbc20 
[622081.545981]        00000000018dbc20 00000000ffcd0770 00000000ffcd0770 00000000018dbc60 
[622081.545988]        00000000003685d8 0000000000000000 00000000fac9f380 00000000feebd238 
[622081.545995]        0000000000000003 0000000083897300 00000000ffcd0738 00000000fac9f7a8 
[622081.546002]        0000000000368c20 0000000000362090 00000000018dbc60 00000000018dbd68 
[622081.546009] Call Trace:
[622081.546012] ([<0000000000362090>] schedule+0xb0c/0xc30)
[622081.546025]  [<0000000000362ea4>] __mutex_lock_slowpath+0xd4/0x14c
[622081.546032]  [<00000000003633a2>] mutex_lock+0x4e/0x6c
[622081.546038]  [<000000000012f54a>] generic_file_llseek+0x36/0x74
[622081.546045]  [<000000000012de98>] SyS_llseek+0x68/0xd4
[622081.546052]  [<00000000000299e6>] sysc_noemu+0x10/0x16
[622081.546061]  [<0000000077fca9ce>] 0x77fca9ce
[622081.546083] INFO: task ffsb:3706 blocked for more than 120 seconds.
[622081.546086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[622081.546091] ffsb          D 0000000000362ea4     0  3706   3671 0x00040000
[622081.546098] 0000000083887368 00000000003685d8 000000000003e60a 000000003954bc20 
[622081.546104]        000000003954bc20 00000000ffcd0038 00000000ffcd0038 000000003954bc60 
[622081.546111]        00000000003685d8 0000000000000000 00000000fa9f40f8 00000000feebd238 
[622081.546118]        0000000000000002 0000000083887300 00000000ffcd0000 00000000fa9f4520 
[622081.546125]        0000000000368c20 0000000000362090 000000003954bc60 000000003954bd68 
[622081.546132] Call Trace:
[622081.546134] ([<0000000000362090>] schedule+0xb0c/0xc30)
[622081.546140]  [<0000000000362ea4>] __mutex_lock_slowpath+0xd4/0x14c
[622081.546146]  [<00000000003633a2>] mutex_lock+0x4e/0x6c
[622081.546152]  [<000000000012f54a>] generic_file_llseek+0x36/0x74
[622081.546157]  [<000000000012de98>] SyS_llseek+0x68/0xd4
[622081.546163]  [<00000000000299e6>] sysc_noemu+0x10/0x16
[622081.546169]  [<0000000077fca9ce>] 0x77fca9ce
[622081.546178] INFO: task ffsb:3729 blocked for more than 120 seconds.
[622081.546182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[622081.546187] ffsb          D 0000000000362ea4     0  3729   3671 0x00040000
[622081.546194] 0000000083897368 00000000003685d8 000000000003e60a 000000004e53bc20 
[622081.546200]        000000004e53bc20 0000000063970770 0000000063970770 000000004e53bc60 
[622081.546207]        00000000003685d8 0000000000000000 0000000016d9f380 00000000fef810e0 
[622081.546214]        0000000016d9f720 0000000083897300 0000000063970738 0000000000000001 
[622081.546220]        0000000000368c20 0000000000362090 000000004e53bc60 000000004e53bd68 
[622081.546228] Call Trace:
[622081.546230] ([<0000000000362090>] schedule+0xb0c/0xc30)
[622081.546236]  [<0000000000362ea4>] __mutex_lock_slowpath+0xd4/0x14c
[622081.546242]  [<00000000003633a2>] mutex_lock+0x4e/0x6c
[622081.546247]  [<000000000012f54a>] generic_file_llseek+0x36/0x74
[622081.546252]  [<000000000012de98>] SyS_llseek+0x68/0xd4
[622081.546332]  [<00000000000299e6>] sysc_noemu+0x10/0x16
[622081.546337]  [<0000000077fca9ce>] 0x77fca9ce
[696130.782415] ------------[ cut here ]------------
[696130.782425] kernel BUG at /build/buildd-linux-2.6_2.6.32-35-s390-hTJpTP/linux-2.6-2.6.32/debian/build/source_s390_none/fs/nfs/write.c:271!
[696130.782434] illegal operation: 0001 [#1] SMP 
[696130.782439] Modules linked in: fuse nfs lockd fscache nfs_acl auth_rpcgss sunrpc sd_mod crc_t10dif ext2 sg loop qeth_l3 qeth ccwgroup ext4 mbcache jbd2 crc16 zfcp scsi_transport_fc scsi_tgt scsi_mod dasd_eckd_mod dasd_mod [last unloaded: scsi_wait_scan]
[696130.782469] CPU: 0 Not tainted 2.6.32-5-s390x #1
[696130.782473] Process blktrace (pid: 12537, task: 00000000fd16b9c0, ksp: 00000000c9ff7940)
[696130.782479] Krnl PSW : 0704300180000000 000003e000e047be (nfs_do_writepage+0x166/0x1b0 [nfs])
[696130.782508]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
[696130.782514] Krnl GPRS: 0000000000000000 0000000000000003 00000000000002f1 00000000fba09c00
[696130.782520]            00000000000002f0 0000000000523800 000003e0410f0f70 0000000000000001
[696130.782525]            00000000c9ff7bd8 0000000000000000 000003e0410f0f70 000000006a65fc00
[696130.782531]            000003e000deb000 000003e000e24308 000003e000e0473e 00000000c9ff7980
[696130.782548] Krnl Code: 000003e000e047b2: a7f40001		brc	15,3e000e047b4
[696130.782555]            000003e000e047b6: a7f40000		brc	15,3e000e047b6
[696130.782561]            000003e000e047ba: a7f40001		brc	15,3e000e047bc
[696130.782568]           >000003e000e047be: a7f40000		brc	15,3e000e047be
[696130.782575]            000003e000e047c2: b9040028		lgr	%r2,%r8
[696130.782581]            000003e000e047c6: b904003b		lgr	%r3,%r11
[696130.782588]            000003e000e047ca: c0e5ffffda3b	brasl	%r14,3e000dffc40
[696130.782594]            000003e000e047d0: 1222		ltr	%r2,%r2
[696130.782599] Call Trace:
[696130.782601] ([<000003e000e0473e>] nfs_do_writepage+0xe6/0x1b0 [nfs])
[696130.782612]  [<000003e000e04f2e>] nfs_writepages_callback+0x1e/0x3c [nfs]
[696130.782624]  [<00000000000ea27a>] write_cache_pages+0x28e/0x3e8
[696130.782634]  [<000003e000e04eb2>] nfs_writepages+0x13a/0x198 [nfs]
[696130.782645]  [<00000000000e20f0>] __filemap_fdatawrite_range+0x5c/0x68
[696130.782654]  [<00000000000e327a>] filemap_write_and_wait+0x36/0x64
[696130.782659]  [<000003e000df856e>] nfs_setattr+0xe2/0x154 [nfs]
[696130.782670]  [<0000000000149510>] notify_change+0x1a8/0x320
[696130.782678]  [<000000000012d500>] do_truncate+0x8c/0xb0
[696130.782685]  [<000000000012d64c>] SyS_ftruncate+0x128/0x150
[696130.782690]  [<00000000000299e6>] sysc_noemu+0x10/0x16
[696130.782699]  [<0000000077f2ca08>] 0x77f2ca08
[696130.782703] Last Breaking-Event-Address:
[696130.782706]  [<000003e000e047ba>] nfs_do_writepage+0x162/0x1b0 [nfs]
[696130.782717]  
[696130.782721] ---[ end trace d185bc0ed90606e2 ]---

** Model information
processor 0: version = 00,  identification = 07A456,  machine = 2098
processor 1: version = 00,  identification = 07A456,  machine = 2098

** Loaded modules:
Module                  Size  Used by
fuse                   73673  1 
nfs                   350785  1 
lockd                  80027  1 nfs
fscache                56819  1 nfs
nfs_acl                 2975  1 nfs
auth_rpcgss            48711  1 nfs
sunrpc                231007  10 nfs,lockd,nfs_acl,auth_rpcgss
sd_mod                 39115  1 
crc_t10dif              1428  1 sd_mod
ext2                   80395  0 
sg                     33421  0 
loop                   15959  0 
qeth_l3                62702  1 
qeth                   77977  1 qeth_l3
ccwgroup                7786  1 qeth
ext4                  425279  2 
mbcache                 8226  2 ext2,ext4
jbd2                  108244  1 ext4
crc16                   1471  1 ext4
zfcp                  116815  2 [permanent]
scsi_transport_fc      50772  1 zfcp
scsi_tgt               12066  1 scsi_transport_fc
scsi_mod              164906  5 sd_mod,sg,zfcp,scsi_transport_fc,scsi_tgt
dasd_eckd_mod          69050  5 
dasd_mod               63781  4 dasd_eckd_mod

** Network interface configuration:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
	address 141.3.52.117
	netmask 255.255.255.0
	network 141.3.52.0
	broadcast 141.3.52.255
	gateway 141.3.52.254
	dns-nameservers 141.3.10.90 141.3.10.91
	dns-search ipd.uni-karlsruhe.de

** Network status:
*** IP interfaces and addresses:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1492 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether e4:1f:13:4e:23:c4 brd ff:ff:ff:ff:ff:ff
    inet 141.3.52.117/24 brd 141.3.52.255 scope global eth0
    inet6 fe80::e41f:1300:94e:23c4/64 scope link 
       valid_lft forever preferred_lft forever

*** Device statistics:
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:   15052     166    0    0    0     0          0         0    15052     166    0    0    0     0       0          0
  eth0:226187394999 294351209    0    0    0     0          0    228356 388687209775 331136914    0    0    0     0       0          0

*** Protocol statistics:
Ip:
    294348938 total packets received
    2586 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    294346352 incoming packets delivered
    331137074 requests sent out
Icmp:
    22 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 1
        echo requests: 21
    25 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 4
        echo replies: 21
IcmpMsg:
        InType3: 1
        InType8: 21
        OutType0: 21
        OutType3: 4
Tcp:
    98 active connections openings
    18 passive connection openings
    30 failed connection attempts
    3 connection resets received
    1 connections established
    294122298 segments received
    331132891 segments send out
    3462 segments retransmited
    0 bad segments received.
    21 resets sent
Udp:
    695 packets received
    4 packets to unknown port received.
    0 packet receive errors
    696 packets sent
UdpLite:
TcpExt:
    41 TCP sockets finished time wait in fast timer
    17922 delayed acks sent
    466 delayed acks further delayed because of locked socket
    Quick ack mode was activated 6884 times
    49 packets directly queued to recvmsg prequeue.
    598 bytes directly received in process context from prequeue
    152678211 packet headers predicted
    4 packets header predicted and directly queued to user
    8441855 acknowledgments not containing data payload received
    134992813 predicted acknowledgments
    110 times recovered from packet loss by selective acknowledgements
    2 bad SACK blocks received
    Detected reordering 1 times using FACK
    150 TCP data loss events
    3429 fast retransmits
    1 forward retransmits
    4 retransmits in slow start
    7 other TCP timeouts
    1 SACK retransmits failed
    6884 DSACKs sent for old packets
    110 DSACKs received
    2 connections reset due to unexpected data
    1 connections reset due to early user close
    5 connections aborted due to timeout
    TCPSACKDiscard: 5
    TCPDSACKIgnoredOld: 108
    TCPDSACKIgnoredNoUndo: 2
    TCPSackShiftFallback: 21113
IpExt:
    InBcastPkts: 223333
    InOctets: -1
    OutOctets: -1
    InBcastOctets: 37742112

*** Device features:
eth0: 0x380
lo: 0x13865

** PCI devices:

-- System Information:
Debian Release: 6.0.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: s390 (s390x)

Kernel: Linux 2.6.32-5-s390x (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-image-2.6.32-5-s390x depends on:
ii  debconf [debconf-2.0]         1.5.36.1   Debian configuration management sy
ii  initramfs-tools [linux-initra 0.98.8     tools for generating an initramfs
ii  linux-base                    2.6.32-35  Linux image base package
ii  module-init-tools             3.12-1     tools for managing Linux kernel mo

Versions of packages linux-image-2.6.32-5-s390x recommends:
pn  firmware-linux-free           <none>     (no description available)

Versions of packages linux-image-2.6.32-5-s390x suggests:
pn  linux-doc-2.6.32              <none>     (no description available)
ii  s390-tools                    1.8.3-3    A set of fundamental utilities for

Versions of packages linux-image-2.6.32-5-s390x is related to:
pn  firmware-bnx2                 <none>     (no description available)
pn  firmware-bnx2x                <none>     (no description available)
pn  firmware-ipw2x00              <none>     (no description available)
pn  firmware-ivtv                 <none>     (no description available)
pn  firmware-iwlwifi              <none>     (no description available)
pn  firmware-linux                <none>     (no description available)
pn  firmware-linux-nonfree        <none>     (no description available)
pn  firmware-qlogic               <none>     (no description available)
pn  firmware-ralink               <none>     (no description available)
pn  xen-hypervisor                <none>     (no description available)

-- debconf information:
  linux-image-2.6.32-5-s390x/prerm/removing-running-kernel-2.6.32-5-s390x: true
  linux-image-2.6.32-5-s390x/postinst/missing-firmware-2.6.32-5-s390x:
  linux-image-2.6.32-5-s390x/postinst/ignoring-do-bootloader-2.6.32-5-s390x:
  linux-image-2.6.32-5-s390x/postinst/depmod-error-initrd-2.6.32-5-s390x: false



Reply to: