
Bug#516376: linux-image-2.6.26-1-amd64: victim of #518431... now got this one upon restart of nfs-kernel-server



Package: linux-image-2.6.26-1-amd64
Followup-For: Bug #516376


Reporting from another box (to make sure that it reaches the
destination), so I've removed all automatically included information.

I am hitting http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=518431,
for which a fix is pending upload into Debian.

Usually, running

/etc/init.d/nfs-kernel-server restart

on the server helped. Today I saw that 10 NFS clients had already stalled
and decided to resort to restarting nfs-kernel-server, but upon doing that I got

[1884967.206652] ------------[ cut here ]------------
[1884967.206652] kernel BUG at include/linux/module.h:386!
[1884967.216519] invalid opcode: 0000 [1] SMP
[1884967.216519] CPU 0
[1884967.216519] Modules linked in: ipmi_devintf ipmi_si nfsd auth_rpcgss fuse dm_crypt crypto_blkcipher qla2xxx netconsole configfs i2c_dev adm1026 w83781d w83627hf hwmon_vid loop r8169 bonding sr_mod st edd eeprom ipmi_watchdog ipmi_msghandler nfs lockd nfs_acl sunrpc k8temp dm_mirror dm_log dm_snapshot raid1 md_mod ide_pci_generic firmware_class scsi_transport_fc scsi_tgt ata_generic e1000 tg3 fan thermal processor thermal_sys amd74xx ohci_hcd sd_mod ide_generic ide_cd_mod cdrom i2c_amd8111 shpchp pci_hotplug i2c_amd756 i2c_core serio_raw evdev joydev psmouse pcspkr floppy parport_pc parport sg ide_disk ide_core exportfs xfs battery ac button ipv6 dm_mod reiserfs arcmsr sata_promise libata scsi_mod dock [last unloaded: qla2xxx]
[1884967.291542] Pid: 21073, comm: nfsd Tainted: G   M      2.6.26-1-amd64 #1
[1884967.291542] RIP: 0010:[<ffffffffa032bce2>]  [<ffffffffa032bce2>] :sunrpc:svc_recv+0x421/0x743
[1884967.291542] RSP: 0000:ffff810158a5fe90  EFLAGS: 00010246
[1884967.291542] RAX: 0000000000000000 RBX: ffffffffa0343b80 RCX: 0000000000000000
[1884967.291542] RDX: 0000000000001000 RSI: ffff810158a5fdb0 RDI: ffffffffa0343b80
[1884967.291542] RBP: ffff81016a648000 R08: ffff81012ae18200 R09: 0000000000000000
[1884967.291542] R10: ffffffff805257c0 R11: ffffffff803fe4cf R12: ffff81017ec6b400
[1884967.291542] R13: 0000000000000082 R14: ffff81007e1aca00 R15: ffff810115b95780
[1884967.291542] FS:  00007f91aac2f6e0(0000) GS:ffffffff8053c000(0000) knlGS:0000000000000000
[1884967.291542] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[1884967.291542] CR2: 00000000025dcee4 CR3: 000000011362b000 CR4: 00000000000006e0
[1884967.291542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1884967.291542] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[1884967.291542] Process nfsd (pid: 21073, threadinfo ffff810158a5e000, task ffff81015fd187d0)
[1884967.291542] Stack:  ffffffffa044e67c 00000000000dbba0 0000000000000000 ffff81015fd187d0
[1884967.291542]  ffffffff8022c202 0000000000000000 0000000000000000 0000000000000286
[1884967.291542]  ffffffff804f9b20 ffff81007dcc2f80 ffff81016a648000 ffffffffa044e67c
[1884967.291542] Call Trace:
[1884967.291542]  [<ffffffffa044e67c>] ? :nfsd:nfsd+0x0/0x2a4
[1884967.291542]  [<ffffffff8022c202>] ? default_wake_function+0x0/0xe
[1884967.291542]  [<ffffffffa044e67c>] ? :nfsd:nfsd+0x0/0x2a4
[1884967.291542]  [<ffffffffa044e767>] ? :nfsd:nfsd+0xeb/0x2a4
[1884967.291542]  [<ffffffff80230196>] ? schedule_tail+0x27/0x5c
[1884967.291542]  [<ffffffff8020cf28>] ? child_rip+0xa/0x12
[1884967.291542]  [<ffffffffa044e67c>] ? :nfsd:nfsd+0x0/0x2a4
[1884967.291542]  [<ffffffffa044e67c>] ? :nfsd:nfsd+0x0/0x2a4
[1884967.291542]  [<ffffffffa044e67c>] ? :nfsd:nfsd+0x0/0x2a4
[1884967.291542]  [<ffffffff8020cf1e>] ? child_rip+0x0/0x12
[1884967.291542]
[1884967.291542]
[1884967.291542] Code: 08 4c 89 e7 ff 50 08 48 85 c0 49 89 c6 0f 84 27 01 00 00 48 8b 00 48 8b 58 08 48 85 db 74 26 48 89 df e8 fd 65 f2 df 85 c0 75 04 <0f> 0b eb fe 65 8b 04 25 24 00 00 00 89 c0 48 c1 e0 07 48 ff 84
[1884967.291542] RIP  [<ffffffffa032bce2>] :sunrpc:svc_recv+0x421/0x743
[1884967.291542]  RSP <ffff810158a5fe90>
[1884967.543910] ---[ end trace 9f735c7771e1b7a9 ]---

which seems to be the same trace that this bug report shows (if I read the
wrapped lines correctly ;))
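
Since wrapped lines make two traces hard to compare by eye, a small script can
pull out the few fields that identify an oops. The following is only a sketch:
the function name summarize_oops and the regular expressions are illustrative
assumptions written against the 2.6.26-style format shown above, not part of
any kernel tooling, and other kernel versions format these lines differently.

```python
import re

# Patterns assume the 2.6.26-style oops layout quoted in this report.
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
COMM_RE = re.compile(r"Pid: (\d+), comm: (\S+)")
# Matches the final "RIP  [<addr>] :module:symbol+0xoff/0xsize" line;
# the ":module:" part is optional for symbols built into the kernel.
RIP_RE = re.compile(
    r"RIP\s+\[<[0-9a-f]+>\]\s+(?::(\w+):)?(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
)


def summarize_oops(text):
    """Return the fields most useful for comparing two oops reports."""
    bug = BUG_RE.search(text)
    comm = COMM_RE.search(text)
    rip = RIP_RE.search(text)
    return {
        "bug_at": (bug.group(1), int(bug.group(2))) if bug else None,
        "comm": comm.group(2) if comm else None,
        "module": rip.group(1) if rip else None,
        "symbol": rip.group(2) if rip else None,
        "offset": rip.group(3) if rip else None,
    }


# Three lines copied from the trace above.
SAMPLE = """\
[1884967.206652] kernel BUG at include/linux/module.h:386!
[1884967.291542] Pid: 21073, comm: nfsd Tainted: G   M      2.6.26-1-amd64 #1
[1884967.291542] RIP  [<ffffffffa032bce2>] :sunrpc:svc_recv+0x421/0x743
"""

if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Running the same function over the trace from the original report and
comparing the two dictionaries would show whether the BUG location and the
faulting symbol agree.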

After this happened, nothing (i.e. restarting neither nfs-kernel-server nor
nfs-common) helped to restore the clients' connectivity to the server: they
are all stalled :-/ I also can't remove the nfsd kernel module, although all
NFS processes seemed to be stopped at that point...

Meanwhile I am building a kernel patched as suggested in #518431 and
will see how it goes from there.


