
Bug#522726: marked as done (kernel problem after a simple 'rm' command: RESERVE_SPACE(805) failed in function encode_lookup)



Your message dated Tue, 06 Apr 2010 01:04:41 +0100
with message-id <1270512281.24287.68.camel@localhost>
and subject line Re: kernel problem after a simple 'rm' command: RESERVE_SPACE(805)  failed in function encode_lookup
has caused the Debian Bug report #522726,
regarding kernel problem after a simple 'rm' command: RESERVE_SPACE(805) failed  in function encode_lookup
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
522726: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=522726
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: nfs-kernel-server
Version: 1:1.0.10-6+etch.1
Severity: important

My otherwise very stable server crashed as a result of an 'rm' command
run in an NFS-mounted home directory. The argument to 'rm' was a single
file name (containing newlines), and no file with that name existed.

The NFS client and the NFS server were the same machine.

Surprisingly, this caused a big problem inside the kernel - the stack
trace shows a large number of NFS calls.

Here is what I did and what I got in response:
alevchuk@biocluster:~/.html/cellwall$ rm 'source_fasta_tair-v20080412-seq---_downloaded-2009-04-04
> source_fasta_tair-v20080412-pep---_downloaded-2009-04-04
> source_fasta_tair-v20080412-cds---_downloaded-2009-04-04
> source_fasta_tair-v20080412-cdna--_downloaded-2009-04-04
> source_fasta_tair-v20080229-igenic_downloaded-2009-04-04
> source_fasta_tair-v20080228-intron_downloaded-2009-04-04
> source_fasta_tigr-v6-0-all-seq----_downloaded-2009-04-04
> source_fasta_tigr-v6-0-all-pep----_downloaded-2009-04-04
> source_fasta_jgi-poptr-v1-1_prot--_downloaded-2009-04-04
> source_fasta_jgi-phypa-v1-1_trans-_downloaded-2009-04-04
> source_fasta_jgi-phypa-v1-1_prot--_downloaded-2009-04-04
> source_fasta_uniprot-v14-9-_tremb-_downloaded-2009-04-04
> source_fasta_uniprot-v14-9-_sprot-_downloaded-2009-04-04
> source_fasta_jgi-poptr-v1-1_trans-_downloaded-2009-04-04'
Segmentation fault

Message from syslogd@biocluster at Sat Apr  4 23:06:56 2009 ...
biocluster kernel: ------------[ cut here ]------------

Message from syslogd@biocluster at Sat Apr  4 23:06:56 2009 ...
biocluster kernel: invalid opcode: 0000 [1] SMP

Message from syslogd@biocluster at Sat Apr  4 23:06:56 2009 ...
biocluster kernel: invalid opcode: 0000 [1] SMP

Message from syslogd@biocluster at Sat Apr  4 23:06:56 2009 ...
biocluster kernel: ------------[ cut here ]------------


Here is what /var/log/messages showed immediately after:

Apr  4 22:39:40 biocluster -- MARK --
Apr  4 22:59:40 biocluster -- MARK --
Apr  4 23:06:56 biocluster kernel: RESERVE_SPACE(805) failed in
function encode_lookup
Apr  4 23:06:56 biocluster kernel: CPU 15
Apr  4 23:06:56 biocluster kernel: Modules linked in: tcp_diag
inet_diag nfsd exportfs button ac battery autofs4 ib_ipoib ipv6 nfs
lockd nfs_acl sunrpc quota_v1 ext2 ext3 jbd mbcache dm_snapshot
dm_mirror dm_mod qla2xxx mppVhba mppUpper sg rdma_ucm rdma_cm ib_cm
iw_cm ib_sa ib_addr ib_umad ib_ipath ib_uverbs mlx4_ib ib_mad ib_core
loop psmouse serio_raw i2c_i801 i2c_core shpchp pci_hotplug pcspkr
mlx4_core igb evdev xfs ide_cd cdrom ata_generic sd_mod ata_piix
libata piix generic ide_core ehci_hcd uhci_hcd firmware_class
scsi_transport_fc mptsas mptscsih mptbase e1000 scsi_transport_sas
scsi_mod thermal processor fan
Apr  4 23:06:56 biocluster kernel: Pid: 12459, comm: rm Not tainted
2.6.22-3-amd64 #1
Apr  4 23:06:56 biocluster kernel: RIP: 0010:[<ffffffff88406bed>]
[<ffffffff88406bed>] :nfs:encode_lookup+0x34/0x5c
Apr  4 23:06:56 biocluster kernel: RSP: 0018:ffff81053e8b38d8  EFLAGS: 00010292
Apr  4 23:06:56 biocluster kernel: RAX: 0000000000000037 RBX:
000000000000031d RCX: ffffffff804afd28
Apr  4 23:06:56 biocluster kernel: RDX: ffffffff804afd28 RSI:
0000000000000092 RDI: ffffffff804afd20
Apr  4 23:06:56 biocluster kernel: RBP: 0000000000000325 R08:
ffffffff804afd28 R09: 0000000000000000
Apr  4 23:06:56 biocluster kernel: R10: 0000000000000046 R11:
ffff8100010ceb40 R12: ffff81070967edb0
Apr  4 23:06:56 biocluster kernel: R13: ffff810e2c4343a8 R14:
ffffffff88408091 R15: ffff81070967edb0
Apr  4 23:06:56 biocluster kernel: FS:  00002b5b8bc496e0(0000)
GS:ffff810f0463a6c0(0000) knlGS:0000000000000000
Apr  4 23:06:56 biocluster kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Apr  4 23:06:56 biocluster kernel: CR2: 0000000000403940 CR3:
0000000b7e1ee000 CR4: 00000000000006e0
Apr  4 23:06:56 biocluster kernel: Process rm (pid: 12459, threadinfo
ffff81053e8b2000, task ffff810c73dad020)
Apr  4 23:06:56 biocluster kernel: Stack:  ffff810e2c4343a8
ffff81053e8b3a38 ffff81063849b884 ffffffff884080f3
Apr  4 23:06:56 biocluster kernel:  ffff81063849b8ac ffff810e2c4343b0
ffff81063849ba38 ffff810e2c4343b0
Apr  4 23:06:56 biocluster kernel:  0000000400000000 0000000000000000
0000000000000000 ffff81063849b884
Apr  4 23:06:56 biocluster kernel: Call Trace:
Apr  4 23:06:56 biocluster kernel:  [<ffffffff884080f3>]
:nfs:nfs4_xdr_enc_lookup+0x62/0x85
Apr  4 23:06:56 biocluster kernel:  [<ffffffff883a7424>]
:sunrpc:call_transmit+0x1c1/0x22d
Apr  4 23:06:56 biocluster kernel:  [<ffffffff883ac8c3>]
:sunrpc:__rpc_execute+0x7d/0x234
Apr  4 23:06:56 biocluster kernel:  [<ffffffff883a7b84>]
:sunrpc:rpc_call_sync+0x75/0x9c
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80295cc9>] touch_atime+0xbe/0x101
Apr  4 23:06:56 biocluster kernel:  [<ffffffff884025bc>]
:nfs:nfs4_proc_lookup+0xe5/0x25c
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80263fef>]
get_page_from_freelist+0x363/0x4de
Apr  4 23:06:56 biocluster kernel:  [<ffffffff883ee8c6>]
:nfs:nfs_lookup+0xf6/0x262
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80289dcf>] do_lookup+0x63/0x1ae
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80292764>] dput+0x1c/0x10b
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80236355>]
current_fs_time+0x3b/0x40
Apr  4 23:06:56 biocluster kernel:  [<ffffffff883ada8c>]
:sunrpc:rpcauth_lookup_credcache+0x12e/0x24a
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80295cc9>] touch_atime+0xbe/0x101
Apr  4 23:06:56 biocluster kernel:  [<ffffffff883ef179>]
:nfs:nfs_access_get_cached+0x24/0x11e
Apr  4 23:06:56 biocluster kernel:  [<ffffffff883f0892>]
:nfs:nfs_atomic_lookup+0x4b/0x18a
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80289e30>] do_lookup+0xc4/0x1ae
Apr  4 23:06:56 biocluster kernel:  [<ffffffff8028bf87>]
__link_path_walk+0x8ec/0xd9d
Apr  4 23:06:56 biocluster kernel:  [<ffffffff8026979d>]
zone_statistics+0x3f/0x60
Apr  4 23:06:56 biocluster kernel:  [<ffffffff8028c490>]
link_path_walk+0x58/0xe0
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80271045>]
do_mmap_pgoff+0x60a/0x776
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80340296>] tty_ioctl+0x0/0xc52
Apr  4 23:06:56 biocluster kernel:  [<ffffffff8028c7fd>]
do_path_lookup+0x1a0/0x1c3
Apr  4 23:06:56 biocluster kernel:  [<ffffffff8028b316>] getname+0x14c/0x190
Apr  4 23:06:56 biocluster kernel:  [<ffffffff8028d02d>]
__user_walk_fd+0x37/0x53
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80285fc5>] vfs_lstat_fd+0x18/0x47
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80271045>]
do_mmap_pgoff+0x60a/0x776
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80340296>] tty_ioctl+0x0/0xc52
Apr  4 23:06:56 biocluster kernel:  [<ffffffff8028e5b0>] do_ioctl+0xa4/0xb6
Apr  4 23:06:56 biocluster kernel:  [<ffffffff802861b9>] sys_newlstat+0x19/0x31
Apr  4 23:06:56 biocluster kernel:  [<ffffffff803f396d>] error_exit+0x0/0x84
Apr  4 23:06:56 biocluster kernel:  [<ffffffff80209d8e>] system_call+0x7e/0x83
Apr  4 23:06:56 biocluster kernel:
Apr  4 23:06:56 biocluster kernel:
Apr  4 23:06:56 biocluster kernel: Code: 0f 0b eb fe c7 00 00 00 00 0f
89 d8 48 8d 7a 08 0f c8 89 42
Apr  4 23:06:56 biocluster kernel:  RSP <ffff81053e8b38d8>
Apr  4 23:19:41 biocluster -- MARK --
Apr  4 23:39:41 biocluster -- MARK --


Thanks!

Alex

-- System Information:
Debian Release: 4.0
  APT prefers oldstable
  APT policy: (500, 'oldstable')
Architecture: amd64 (x86_64)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.22-3-amd64
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages nfs-kernel-server depends on:
ii  lib 2.7-6                                GNU C Library: Shared libraries
ii  lib 1.39+1.40-WIP-2006.11.14+dfsg-2etch1 common error description library
ii  lib 0.10-4                               A mechanism-switch gssapi library
ii  lib 1.4.4-7etch6                         MIT Kerberos runtime libraries
ii  lib 0.18-0                               An nfs idmapping library
ii  lib 0.14-2etch3                          allows secure rpc communication us
ii  lib 7.6.dbs-13                           Wietse Venema's TCP wrappers libra
ii  lsb 3.1-23.2etch1                        Linux Standard Base 3.1 init scrip
ii  nfs 1:1.0.10-6+etch.1                    NFS support files common to client
ii  ucf 2.0020                               Update Configuration File: preserv

nfs-kernel-server recommends no packages.

-- no debconf information


-- 
------------------------------------------------------------
Aleksandr Levchuk
Biology Systems and Database Administrator
University of California, Riverside
Cell Phone: (951) 368-0004
------------------------------------------------------------



--- End Message ---
--- Begin Message ---
Version: 2.6.23-1

On Mon, 2010-04-05 at 16:13 -0700, Aleksandr Levchuk wrote:
> Hi Ben,
> 
> No, I haven't had a chance to check whether the bug exists in a newer version.
> We changed our NFS server from Linux to OpenSolaris.
> 
> But it was a major problem. It re-occurred every time a user would
> attempt a filesystem operation where the filename was very long (e.g.
> 500 characters). Any fs write operation (rm, create new file) would
> cause the kernel panic.
> 
> The crash happened several times a year. In all cases it was when
> someone would accidentally pass data instead of a filename to a piece
> of code that expects filenames.

It appears that this bug was fixed in Linux 2.6.23 by this change:

commit 54af3bb543c071769141387a42deaaab5074da55
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Fri Sep 28 12:27:41 2007 -0400

    NFS: Fix an Oops in encode_lookup()
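
For anyone still running a kernel older than that fix, a userspace guard
along the following lines might avoid handing an overlong name to the NFS
client in the first place. This is only a minimal sketch, assuming the
trigger is a single path component longer than the filesystem's name
limit; the helper names overlong_components and safe_remove are
hypothetical and are not part of any package mentioned in this report.

```python
import os


def overlong_components(path, directory=".", name_max=None):
    """Return the components of `path` that are longer than NAME_MAX.

    `name_max` may be passed explicitly; otherwise it is queried from the
    filesystem that holds `directory` (commonly 255 bytes on Linux).
    """
    if name_max is None:
        name_max = os.pathconf(directory, "PC_NAME_MAX")
    parts = [p for p in path.split(os.sep) if p]
    # Compare encoded byte lengths, since the limit is in bytes.
    return [p for p in parts if len(os.fsencode(p)) > name_max]


def safe_remove(path, directory=".", name_max=None):
    """Remove `path`, refusing names with an overlong component."""
    bad = overlong_components(path, directory, name_max)
    if bad:
        raise ValueError(
            "file name component too long: %d bytes" % len(os.fsencode(bad[0]))
        )
    os.remove(path)
```

A script that receives file names from untrusted input could call
safe_remove(name) instead of os.remove(name); data pasted in place of a
file name would then raise ValueError before any lookup is attempted.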

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.



--- End Message ---
