
Bug#592463: marked as done (xen-linux-system-2.6.26-2-xen-amd64: domU kernel freeze on xfs location)



Your message dated Sun, 2 Dec 2012 16:37:13 -0800
with message-id <20121203003713.GA18273@elie.Belkin>
and subject line [Mailer-Daemon@mx1.workshopit.co.uk: Mail delivery failed: returning message to sender]
has caused the Debian Bug report #592463,
regarding xen-linux-system-2.6.26-2-xen-amd64: domU kernel freeze on xfs location
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
592463: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592463
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: xen-linux-system-2.6.26-2-xen-amd64
Severity: important


Hello,

Yesterday one of our PV domU hosts froze while accessing a file location on XFS. The following log was received:

    [20108008.237603] BUG: soft lockup - CPU#0 stuck for 61s! [smbd:16411]
    [20108008.237603] Modules linked in: appletalk ppdev parport_pc lp parport ipv6 xfs evdev ext3 jbd mbcache thermal_sys
    [20108008.237603] CPU 0:
    [20108008.237603] Modules linked in: appletalk ppdev parport_pc lp parport ipv6 xfs evdev ext3 jbd mbcache thermal_sys
    [20108008.237603] Pid: 16411, comm: smbd Not tainted 2.6.26-2-xen-amd64 #1
    [20108008.237603] RIP: e030:[<ffffffffa0072d36>]  [<ffffffffa0072d36>] :xfs:xfs_iext_get_ext 0xa/0x5a
    [20108008.237603] RSP: e02b:ffff8800534dfa30  EFLAGS: 00000202
    [20108008.237603] RAX: 000000000000008d RBX: ffff8800534dfbe8 RCX: 000000000000008d
    [20108008.237603] RDX: ffff8800534dfc30 RSI: 000000000000008c RDI: ffff88006436bb60
    [20108008.237603] RBP: ffff88008d43c8d0 R08: 000000000000008d R09: 0000000000000100
    [20108008.237603] R10: ffff8800fdba73c0 R11: 0000000000000000 R12: ffff8800534dfbc8
    [20108008.237603] R13: ffff88006436bb60 R14: ffff8800534dfc30 R15: ffff8800534dfc2c
    [20108008.237603] FS:  00007f4198b83700(0000) GS:ffffffff8053a000(0000) knlGS:0000000000000000
    [20108008.237603] CS:  e033 DS: 0000 ES: 0000
    [20108008.237603] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [20108008.237603] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [20108008.237603]
    [20108008.237603] Call Trace:
    [20108008.237603]  [<ffffffffa00550d7>] ? :xfs:xfs_bmap_search_multi_extents 0x78/0xda
    [20108008.237603]  [<ffffffffa0055194>] ? :xfs:xfs_bmap_search_extents 0x5b/0xe6
    [20108008.237603]  [<ffffffffa005b1df>] ? :xfs:xfs_bmapi 0x26e/0xf76
    [20108008.237603]  [<ffffffff80436b47>] ? error_exit 0x0/0x69
    [20108008.237603]  [<ffffffff80436b47>] ? error_exit 0x0/0x69
    [20108008.237603]  [<ffffffffa0096441>] ? :xfs:xfs_zero_eof 0xc0/0x16a
    [20108008.237603]  [<ffffffffa0096b0e>] ? :xfs:xfs_write 0x344/0x722
    [20108008.237603]  [<ffffffff8028a1ef>] ? do_sync_write 0xc9/0x10c
    [20108008.237603]  [<ffffffff8020e7bc>] ? get_nsec_offset 0x9/0x2c
    [20108008.237603]  [<ffffffff802992dc>] ? __posix_lock_file 0x3c1/0x3f6
    [20108008.237603]  [<ffffffff8023f6c1>] ? autoremove_wake_function 0x0/0x2e
    [20108008.237603]  [<ffffffff8028a999>] ? vfs_write 0xad/0x156
    [20108008.237603]  [<ffffffff8028b024>] ? sys_pwrite64 0x50/0x70
    [20108008.237603]  [<ffffffff802964a2>] ? sys_fcntl 0x2eb/0x2f7
    [20108008.237603]  [<ffffffff8020b528>] ? system_call 0x68/0x6d
    [20108008.237603]  [<ffffffff8020b4c0>] ? system_call 0x0/0x6d
    [20108008.237603]

This domU could not be rebooted; it had to be destroyed with xm destroy and then recreated before the filesystem became accessible again. The machine had been running successfully for over 6 months before this issue occurred, and we run a number of other 2.6.26-2-xen machines that have not had this issue (fingers crossed!). If there is an explanation for this, a way to avoid it, or a fix in a newer kernel, we would appreciate hearing about it.

Thanks


-- System Information:
Debian Release: 5.0.5
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-xen-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash



--- End Message ---
--- Begin Message ---
--- Begin Message ---
This message was created automatically by mail delivery software.

A message that you sent could not be delivered to one or more of its
recipients. This is a permanent error. The following address(es) failed:

  mark@campbell-lange.net
    SMTP error from remote mail server after RCPT TO:<mark@campbell-lange.net>:
    host 93.191.37.68 [93.191.37.68]: 550 unknown user

------ This is a copy of the message, including all the headers. ------

Return-path: <jrnieder@gmail.com>
Received: from mail-da0-f42.google.com ([209.85.210.42])
	by mx1.workshopit.co.uk with esmtp (Exim 4.72)
	(envelope-from <jrnieder@gmail.com>)
	id 1TfJv1-0003Vs-HE
	for mark@campbell-lange.net; Mon, 03 Dec 2012 00:29:55 +0000
Received: by mail-da0-f42.google.com with SMTP id z17so888393dal.15
        for <mark@campbell-lange.net>; Sun, 02 Dec 2012 16:32:47 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=date:from:to:cc:subject:message-id:references:mime-version
         :content-type:content-disposition:in-reply-to:user-agent;
        bh=ozT3COOiOPcm8qcLv+wtFes18JM4o4ZySnV2qU00/tk=;
        b=TECGv1xT4FPpK1kkuXe+wp2nr5kfg2koONaLTHyu7SlKFji3xQecy8jIAYBsaFxfUu
         NgoEfoENdpZWW1Xt5S5KYteXcBxfSa7pHo2KYcJ6MT0jr/pN2JG2LCtTp5s6oqxruteM
         anz58dj1JHLVS4fHfz9Sot+s/G/Kd33V52d9CKSUskUsVIzSoJJQCh3KNnPiY+1mQE70
         Id1t2+jMBvV1grDJ2Sbae5aotLmLRVgTn3eGqEw5Z4lldE83rNaiDizIOxWPJYombvx5
         dxA3iPcZO3/zJhqfcA4/PlcEcB+Qk+vQIFU7ltxk0arVpYe0xTeunYKhDJiKFd/nwEz8
         hing==
Received: by 10.68.190.38 with SMTP id gn6mr24719731pbc.6.1354494767427;
        Sun, 02 Dec 2012 16:32:47 -0800 (PST)
Received: from elie.Belkin (c-67-180-61-129.hsd1.ca.comcast.net. [67.180.61.129])
        by mx.google.com with ESMTPS id ip8sm7022453pbc.36.2012.12.02.16.32.45
        (version=SSLv3 cipher=OTHER);
        Sun, 02 Dec 2012 16:32:46 -0800 (PST)
Date: Sun, 2 Dec 2012 16:32:37 -0800
From: Jonathan Nieder <jrnieder@gmail.com>
To: Mark Adams <mark@campbell-lange.net>
Cc: 592463@bugs.debian.org
Subject: Re: [lenny] domU kernel freeze on xfs location
Message-ID: <20121203003237.GA18321@elie.Belkin>
References: <20100810092543.5843.66685.reportbug@wwfile.westonwilliamson.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20100810092543.5843.66685.reportbug@wwfile.westonwilliamson.com>
User-Agent: Mutt/1.5.21+51 (9e756d1adb76) (2011-07-01)
X-WorkshopIT-MailScanner-ID: 1TfJv1-0003Vs-HE
X-WorkshopIT-MailScanner: Found to be clean
X-WorkshopIT-MailScanner-From: jrnieder@gmail.com
X-Spam-Status: No

Hi,

In 2010, sysadmin@campbell-lange.net wrote:

> Yesterday one of our PV domU hosts froze while accessing a file location
> on XFS. The following log was received:
>
>     [20108008.237603] BUG: soft lockup - CPU#0 stuck for 61s! [smbd:16411]
[...]
>     [20108008.237603] Pid: 16411, comm: smbd Not tainted 2.6.26-2-xen-amd64 #1
[...]
>     [20108008.237603] Call Trace:
>     [20108008.237603]  [<ffffffffa00550d7>] ? :xfs:xfs_bmap_search_multi_extents 0x78/0xda
>     [20108008.237603]  [<ffffffffa0055194>] ? :xfs:xfs_bmap_search_extents 0x5b/0xe6
>     [20108008.237603]  [<ffffffffa005b1df>] ? :xfs:xfs_bmapi 0x26e/0xf76
>     [20108008.237603]  [<ffffffff80436b47>] ? error_exit 0x0/0x69
>     [20108008.237603]  [<ffffffff80436b47>] ? error_exit 0x0/0x69
>     [20108008.237603]  [<ffffffffa0096441>] ? :xfs:xfs_zero_eof 0xc0/0x16a
>     [20108008.237603]  [<ffffffffa0096b0e>] ? :xfs:xfs_write 0x344/0x722
>     [20108008.237603]  [<ffffffff8028a1ef>] ? do_sync_write 0xc9/0x10c
>     [20108008.237603]  [<ffffffff8020e7bc>] ? get_nsec_offset 0x9/0x2c
>     [20108008.237603]  [<ffffffff802992dc>] ? __posix_lock_file 0x3c1/0x3f6
[...]
> This domU could not be rebooted; it had to be destroyed with xm destroy
> and then recreated before the filesystem became accessible again.

Thanks for reporting it, and sorry for the slow response.

Did this only happen once, or was it reproducible?  If it happened
again, on which kernel?

Sincerely,
Jonathan

--- End Message ---

--- End Message ---
