
Bug#636797: linux-image-2.6.32-5-amd64: avoid divide-by-zero ("divide error: 0000") in scheduler



On Sun, 2011-08-07 at 17:28 -0400, Daniel Kahn Gillmor wrote:
> Hi Ben--
> 
> Thanks for the quick followup!
> 
> On 08/07/2011 12:36 PM, Ben Hutchings wrote:
> > On Fri, 2011-08-05 at 18:36 -0400, Daniel Kahn Gillmor wrote:
> >> We've applied the attached patch (a simple workaround to ensure no
> >> division-by-zero) to the debian packages for several weeks in production
> >> (over a month on some machines) and haven't seen a recurrence of the
> >> problem.
> >
> > This doesn't really fix the bug - division by zero is just a symptom of
> > a more fundamental problem which has yet to be identified.
> 
> yep, that's why i called it a workaround :)
> 
> > As a result,
> > it hasn't been accepted upstream and won't be accepted in Debian.
> > 
> > That said, I would consider applying a variant that WARNs before 'fixing
> > up' the zero divisor, as a *temporary* measure to aid in understanding
> > the bug (more like
> > <https://bugzilla.kernel.org/show_bug.cgi?id=16991#c13>).
> 
> That sounds reasonable to me.  Are you up for preparing such a patch or
> do you need me to do it?

I'm quite busy so if you could try to do it that would be helpful.
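
For anyone picking this up, here is a rough userspace sketch of the
"warn, then fix up the zero divisor" idea discussed above. It is not the
attached patch and not the code from the bugzilla comment; the names
fixup_divisor, scaled_load and zero_divisor_warnings are hypothetical and
exist only for illustration. In the kernel itself the warning would
presumably be issued with something like WARN_ON_ONCE() rather than
fprintf(), so that a backtrace is logged the first time the condition is hit.

```c
#include <stdio.h>

/* Hypothetical counter: how many times a zero divisor was seen. */
unsigned long zero_divisor_warnings;

/*
 * Hypothetical helper: return a divisor that is safe to divide by.
 * A zero divisor is reported once (the first time only) and replaced
 * by 1, so the caller keeps running instead of faulting.
 */
unsigned long fixup_divisor(unsigned long divisor, const char *where)
{
    if (divisor == 0) {
        if (zero_divisor_warnings++ == 0)
            fprintf(stderr, "WARNING: zero divisor in %s, using 1\n", where);
        return 1;
    }
    return divisor;
}

/* Illustrative caller standing in for the scheduler's division. */
unsigned long scaled_load(unsigned long load, unsigned long power)
{
    return load / fixup_divisor(power, "scaled_load");
}
```

The point of warning before the fix-up is that the workaround no longer
hides the underlying problem: the first occurrence still leaves a trace
in the log that can be used to track down where the zero comes from.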

> > I notice your 'oops' messages show 'Tainted: G W' which indicates there
> > was an earlier kernel warning.  What was the previous warning?
> 
> hmm, we've seen this on multiple machines, and they didn't all have a
> prior warning.  in the referenced machine, though, it was 5 months
> previously, a netdev watchdog timeout.  It doesn't seem related to me,
> but i'm happy to include the dump here in case anyone else can extract
> meaning from it:
[...]

Agreed.

Ben.


