
Re: Bug#837606: general: system freeze



On Wed, Sep 14, 2016 at 01:15:47PM +0300, Lars Wirzenius wrote:
> On Wed, Sep 14, 2016 at 11:23:56AM +0200, Abou Al Montacir wrote:
> > Do you think people will not be frustrated if they experience a bug and
> > we prevent them from raising bugs? Hiding reality is always bad. Look at the
> > original reporter's last message. He seems quite disappointed by the project's
> > reaction. He probably feels that we don't care about our users. I personally
> > sometimes feel the same.
> 
> We do care about our users. However, due to the realities of volunteer
> projects, we need users to help us help them. Reporting a bug that
> "system freezes" isn't a problem that has an obvious solution: even
> assuming that we understand what "system freezes" actually means,
> there's not nearly enough information to figure out what causes it.

Even quite experienced people may have a hard time investigating a "system
freeze".

It just happens that I had two today; the system had been working reliably
before, with no unexplained crashes[1], at least in kernelly stuff.  Then all
of a sudden music gets stuck on a small buffer, the screen freezes, no response
to anything, not even SysRq; ping worked for a short time then stopped.
Half an hour later, a repeat.  I've attached a serial console but it's
apparently a heisenbug -- it hasn't reproduced since.

But here's a little detail: two days ago I upgraded the kernel from 4.7.3 to
4.8-rc6 (good luck having a user mention that!).  I still don't know
anything more about a possible cause, though.

I'm not a super troubleshooter but at least I know where to stick a serial
console.  So, how would you go about getting me to produce more information
for you?

When a system can't even write its logs, for a regular user that's it.  Even
if you managed to have the user try netconsole, it fails to work if there's
any bridging or VMs involved, on certain network cards, or if
(lspci;lsusb;...;pom)|md5sum - ends in a 0.  USB dies first in a crash; for
anything reliable you want serial on the sending side.  But how often do
you find a serial port on new amd64, especially laptops?
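
For reference, the receiving end of netconsole is the simple part: the
crashing kernel just sends its log lines as UDP datagrams to another machine.
Below is a minimal sketch of such a listener in Python, assuming the
conventional netconsole target port 6666; the function names and the log file
name are made up for illustration, and none of this helps with the
sending-side failures described above.

```python
import socket


def make_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; 6666 is assumed as the usual netconsole target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_once(sock, log_path, bufsize=65535):
    """Receive one datagram, append it to log_path, and return its text."""
    data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
    text = data.decode("utf-8", errors="replace")
    # Reopen and append for every message so that whatever arrived before
    # the sender died is already on disk.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{sender_ip}: {text.rstrip()}\n")
    return text


def run_forever(log_path="netconsole.log"):
    """Hypothetical entry point: log every datagram until interrupted."""
    listener = make_listener()
    while True:
        receive_once(listener, log_path)
```

Calling run_forever() on a second machine before the next freeze would leave
whatever the kernel managed to send in netconsole.log.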

> The thing is, a desktop system is a very complicated system. There's
> thousands of programs interacting, plus a lot of hardware, and a
> "general freeze" may be caused by pretty much any of them.

A user might call a freeze something that's a transient problem (like a
bout of swapping), or even a buggy full-screen program.  With that aside,
it's typically a kernel or X problem.

So let's assume the user, possibly with some help, did some investigation.
But how do you then know which package to file the report against?


Meow!



[1]. There _is_ an unsolved crasher in nouveau on my HW:
https://bugs.freedesktop.org/show_bug.cgi?id=79518
but at least it's isolated to a component that can be replaced (by the
proprietary nvidia driver).
-- 
Second "wet cat laying down on a powered-on box-less SoC on the desk" close
shave in a week.  Protect your ARMs, folks!

