Bug#542250: repeatable crashes while copying 500G from NFS mount to local logical volume
X-Loop: owner@bugs.debian.org
Resent-Date: Wed, 19 Aug 2009 03:24:04 +0000
Resent-Message-ID: <handler.542250.B542250.12506521281831@bugs.debian.org>
Resent-Sender: owner@bugs.debian.org
X-Debian-PR-Message: followup 542250
X-Debian-PR-Package: linux-image-2.6.26-2-xen-amd64
X-Debian-PR-Keywords:
X-Debian-PR-Source: linux-2.6
Received: via spool by 542250-submit@bugs.debian.org id=B542250.12506521281831
(code B ref 542250); Wed, 19 Aug 2009 03:24:04 +0000
Received: (at 542250) by bugs.debian.org; 19 Aug 2009 03:22:08 +0000
X-Spam-Checker-Version: SpamAssassin 3.2.3-bugs.debian.org_2005_01_02
(2007-08-08) on rietz.debian.org
X-Spam-Level:
X-Spam-Bayes: score:0.0000 Tokens: new, 40; hammy, 148; neutral, 66; spammy,
3. spammytokens:1.000-9--H*m:136, 0.993-1--assertions, 0.992-+--H*MI:136
hammytokens:0.000-+--H*c:protocol, 0.000-+--H*c:micalg, 0.000-+--H*c:signed,
0.000-+--H*c:pgp-signature, 0.000-+--H*f:sk:2009081
X-Spam-Status: No, score=-6.0 required=4.0 tests=AWL,BAYES_00,FOURLA,
FVGT_m_MULTI_ODD,HAS_BUG_NUMBER,MURPHY_DRUGS_REL8 autolearn=ham
version=3.2.3-bugs.debian.org_2005_01_02
Received: from shadbolt.e.decadent.org.uk ([88.96.1.126])
by rietz.debian.org with esmtp (Exim 4.63)
(envelope-from <ben@decadent.org.uk>)
id 1MdbkW-0000TE-92
for 542250@bugs.debian.org; Wed, 19 Aug 2009 03:22:08 +0000
Received: from deadeye.i.decadent.org.uk ([192.168.4.185] helo=localhost)
by shadbolt.decadent.org.uk with esmtp (Exim 4.69)
(envelope-from <ben@decadent.org.uk>)
id 1MdbkT-0005PD-FA
for 542250@bugs.debian.org; Wed, 19 Aug 2009 04:22:06 +0100
Received: from womble by localhost with local (Exim 4.69)
(envelope-from <ben@decadent.org.uk>)
id 1MdbkS-00012B-Ij
for 542250@bugs.debian.org; Wed, 19 Aug 2009 04:22:04 +0100
From: Ben Hutchings <ben@decadent.org.uk>
To: 542250@bugs.debian.org
In-Reply-To: <1250646426.16001.81.camel@localhost>
References: <20090818163840.6472.70854.reportbug@desktopvm.lvknet>
 <1250646426.16001.81.camel@localhost>
Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="=-gpJGQvNhH5HmU60cBDtr"
Date: Wed, 19 Aug 2009 04:22:04 +0100
Message-Id: <1250652124.16001.136.camel@localhost>
Mime-Version: 1.0
X-Mailer: Evolution 2.26.3
X-SA-Exim-Connect-IP: 192.168.4.185
X-SA-Exim-Mail-From: ben@decadent.org.uk
X-SA-Exim-Version: 4.2.1 (built Wed, 25 Jun 2008 17:14:11 +0000)
X-SA-Exim-Scanned: Yes (on shadbolt.decadent.org.uk)
--=-gpJGQvNhH5HmU60cBDtr
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable
On Wed, 2009-08-19 at 02:47 +0100, Ben Hutchings wrote:
[...]
> The kernel was spinning in process context, was interrupted by the SATA
> device, and its interrupt handler also started spinning.
>
> The BUG line is here:
>
>     /* announce we're spinning */
>     spinning = &__get_cpu_var(spinning);
>     if (spinning->lock) {
>         BUG_ON(spinning->lock == lock);
>         if(raw_irqs_disabled()) {
>             BUG_ON(__get_cpu_var(spinning_bh).lock == lock);
>             spinning = &__get_cpu_var(spinning_irq);
>         } else {
> ->          BUG_ON(!in_softirq());
>             spinning = &__get_cpu_var(spinning_bh);
>         }
>         BUG_ON(spinning->lock);
>     }
>     spinning->ticket = token;
>     smp_wmb();
>     spinning->lock = lock;
>
> This asserts that if we spin on a lock after interrupting another spin,
> and interrupts are enabled, we must be in a softirq.
>
> This seems bogus to me - in general, interrupts are enabled during
> interrupt handlers once their specific IRQ has been masked.
>
> I'll have a look at whether & how this code has changed upstream and in
> other forward-ported branches.
In the SLE 11.0 branch (downloaded from SUSE KOTD:
<ftp://ftp.suse.com/pub/projects/kernel/kotd/>) xen_spin_wait() allows
for arbitrarily nested spinlocks and has no such assertions.
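As a rough illustration of what "arbitrarily nested" can look like, here is a
minimal user-space sketch in which each spin pushes a record onto a per-CPU
stack and pops it again when done. This is not the SLE 11.0 code: the names
spin_record, cpu_spinning, spin_enter(), spin_exit() and spin_depth() are
hypothetical, and a plain static variable stands in for a per-CPU variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one in-progress spin; not taken from any kernel tree. */
struct spin_record {
    const void *lock;           /* lock this context is waiting for */
    unsigned int ticket;        /* ticket it holds for that lock */
    struct spin_record *prev;   /* spin that was interrupted, or NULL */
};

/* Stand-in for a per-CPU pointer to the innermost spin on this CPU. */
static struct spin_record *cpu_spinning;

/* Announce a new spin; any spin already in progress becomes its parent. */
static void spin_enter(struct spin_record *rec, const void *lock,
                       unsigned int ticket)
{
    rec->lock = lock;
    rec->ticket = ticket;
    rec->prev = cpu_spinning;
    cpu_spinning = rec;
}

/* Retire the innermost spin and restore the one it interrupted. */
static void spin_exit(struct spin_record *rec)
{
    assert(cpu_spinning == rec);
    cpu_spinning = rec->prev;
}

/* Number of spins currently nested on this CPU. */
static unsigned int spin_depth(void)
{
    unsigned int depth = 0;

    for (const struct spin_record *r = cpu_spinning; r != NULL; r = r->prev)
        depth++;
    return depth;
}
```

Because each record lives on the stack of the context that is spinning, there
is no fixed number of slots and so no need to assert which context we are in.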
The XCI tree
(<http://xenbits.xen.org/git-http/xenclient/linux-2.6.27.git> with patch
queue <http://xenbits.xen.org/git-http/xenclient/linux-2.6.27-pq.git>)
matches SLE 11.0.
This rather suggests that the assertions are wrong and we need to change
them.
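To make the failing case concrete, here is a small user-space model of the
slot selection in the code quoted above, for the case where spinning->lock is
already set (that is, we interrupted another spin). It is only a sketch: the
type cpu_context, the enum slot and the function pick_nested_slot() are made
up for illustration, and the two flags stand in for raw_irqs_disabled() and
in_softirq().

```c
#include <stdbool.h>

/* Hypothetical snapshot of the CPU's context when a nested spin starts. */
struct cpu_context {
    bool irqs_disabled;   /* models raw_irqs_disabled() */
    bool in_softirq;      /* models in_softirq() */
};

/* Which slot the quoted code would use, or SLOT_BUG if an assertion fires. */
enum slot { SLOT_IRQ, SLOT_BH, SLOT_BUG };

/* Mirrors the if/else in the quoted code for an already-spinning CPU. */
static enum slot pick_nested_slot(struct cpu_context ctx)
{
    if (ctx.irqs_disabled)
        return SLOT_IRQ;          /* spinning_irq */
    if (!ctx.in_softirq)
        return SLOT_BUG;          /* BUG_ON(!in_softirq()) fires */
    return SLOT_BH;               /* spinning_bh */
}
```

An interrupt handler running with interrupts re-enabled corresponds to
irqs_disabled = false and in_softirq = false, which lands on SLOT_BUG. That is
the situation in the reported crash.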
Ben.
-- 
Ben Hutchings
The generation of random numbers is too important to be left to chance.
- Robert Coveyou
--=-gpJGQvNhH5HmU60cBDtr
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iD8DBQBKi2/R79ZNCRIGYgcRAsOlAJ4k4t0kax1BU5DqzSctduNUr5Nv4ACfbqRy
3XgtgQCjO0TCKyuWEchVWAA=
=LaKK
-----END PGP SIGNATURE-----
--=-gpJGQvNhH5HmU60cBDtr--