
Bug#807040: general: System hangs and then restarts (kernel panic)



Hi Sven,



On Mon, Dec 21, 2015 at 1:55 PM, Sven Joachim <svenjoac@gmx.de> wrote:
On 2015-12-11 08:20 +0100, Adam Borowski wrote:

> On Thu, Dec 10, 2015 at 07:17:43PM -0800, Nigra Truo wrote:
>> That does not work either, unfortunately. I installed the proprietary
>> driver and now X crashes. At least the whole machine does not crash, but I
>> can open a Desktop, KDE or Gnome, then open an app, maximize the window and
>> I get a prompt crash.
>
> Oif.  I'm afraid I can't help you further here -- I'm a mere user when it
> comes to X drivers.  I could help with installing a newer version or an
> alternate driver, but for more, we need actual driver guys :(

Problem is, there is no such person in Debian for the nouveau driver.

Who will fix this, then, when nobody is in charge?
This problem is a huge disaster. I have had about 25 restarts today already and can't work like this. I will need to downgrade to get a stable system to work on, and troubleshoot this on the side on a test hard disk, as a test system, which is how a system this unstable needs to be treated.

 

>> The inability to get logs in the Kernel Panic is a huge problem, I can't
>> believe that this is still not solved, that there is no automatic
>> mechanism, to at least see what caused the panic or, for that matter,
>> logging that ANY panic has occurred. Right now, the most serious of errors
>> does not have any accounting whatsoever.
 
>
> When the kernel panics, most of its facilities are considered dead.  Doing
> something as complex as a filesystem write would require temporarily
> ignoring the panic, with a huge risk of data corruption.  A generally pretty
> bad idea.

I don't know how Windows does it, but it does remember that there was a bluescreen. If Windows can, so can Linux.
Right now, I can't stop my system from restarting, and I don't know what the hell is restarting it. Is it the watchdog service? I deactivated that in the kernel, and it still restarts. When I push some keys, it shows the kernel panic and the message "rebooting in 30 seconds". When I do a Google search, there is absolutely no reference or documentation about what that is, what is causing it and WHO is doing it. Why the reboot?
In Debian Wheezy, there was absolutely no restarting when the kernel panicked, and that is the way it needs to be.
Now, with the new kernel, it just restarts, and could easily go into a reboot loop, endlessly restarting. It also does not show anything in the logs. There is absolutely nothing on the system that shows the system had a panic or that it even crashed, which is totally horrendous, considering that you could have data loss and never know about it. I don't understand why this is solved in such a pathetic, half-baked way in the kernel.

I wrote a little tool some years ago, when I was astonished at this, that provides a really "complex" (I'm being purposefully sarcastic here) piece of functionality: it can tell whether the system crashed, by removing a file when shutting down gracefully. If the file is still there when starting up, the system crashed in a panic.
Now, 5 years later, I'm amazed and shocked that I seem to be the only person on earth who can actually tell that there has been a panic on my system.
Due to this annoying auto restart, transparency has been lost: a server could restart 40 times and have massive data loss, and you would never know it.
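
The idea fits in a few lines. What follows is only a minimal sketch in Python, not the original tool: the function names and the marker path are made up for illustration. The marker also survives a power loss or a hard reset, so strictly speaking it detects any unclean shutdown, not only a panic.

```python
import os
from pathlib import Path

# Hypothetical default location; any path on persistent storage would do.
DEFAULT_MARKER = Path("/var/lib/crashmarker/running")


def check_and_arm(marker: Path = DEFAULT_MARKER) -> bool:
    """Call once at boot.

    Returns True if the marker from the previous boot is still present,
    meaning the last session did not shut down gracefully.
    The marker is then (re)created for the current session.
    """
    crashed = marker.exists()
    marker.parent.mkdir(parents=True, exist_ok=True)
    with open(marker, "w") as f:
        f.write("running\n")
        f.flush()
        # Push the file contents to disk so the marker has a chance
        # of surviving a sudden panic.
        os.fsync(f.fileno())
    return crashed


def disarm(marker: Path = DEFAULT_MARKER) -> None:
    """Call during a graceful shutdown to remove the marker."""
    marker.unlink(missing_ok=True)  # missing_ok needs Python 3.8+
```

A boot-time script would call check_and_arm() and log a warning when it returns True; a shutdown hook would call disarm().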

As you can see, I'm not happy about this whole mess. I have now spent up to 20 hours troubleshooting this problem and am no closer to solving it, except for knowing that I will have to downgrade if I want to use my laptop for work again.


>
> Thus, you'd need to pass the remaining piece of the log somehow.  Ways to do
> so include:
> * a serial console.  My main desktop box happens to include a real serial
>   port, but that's sadly a rarity for modern machines these days.  There are
>   USB connectors which you could use to pass the logs from your laptop to
>   another machine.
> * kdump.  This keeps a whole secondary kernel in memory which takes over
>   during a crash and can do a post-mortem on the primary kernel which just
>   panicked.
> * some way over the network.  User-mode syslog won't work but there are
>   kernel-based ones, google says netdump.

The best choice is actually netconsole, see
https://mraw.org/blog/2010/11/08/Debugging_using_netconsole/ and
Documentation/networking/netconsole.txt in the kernel source
       Sven

Thanks for the link, I will check that out. I already tried keeping an SSH session open with tail -f on the kernel log, but the connection gets cut off right when the system crashes.
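
As far as I understand it, netconsole sends kernel messages as plain UDP packets, so the second machine only needs something listening on the target port. A minimal sketch of such a receiver in Python follows. The function names are made up for illustration, and port 6666 is assumed to be the target port configured on the crashing machine; netcat or a syslog daemon would do the same job.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Open a UDP socket on which to receive netconsole packets.

    Port 6666 is an assumption; it must match the target port
    configured on the sending machine.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive(sock, max_packets, timeout=None):
    """Collect up to max_packets messages as (sender_ip, text) tuples.

    Stops early if no packet arrives within `timeout` seconds.
    With timeout=None the call blocks until a packet arrives.
    """
    sock.settimeout(timeout)
    messages = []
    for _ in range(max_packets):
        try:
            data, (sender_ip, _sender_port) = sock.recvfrom(65535)
        except socket.timeout:
            break
        # Kernel messages are not guaranteed to be valid UTF-8.
        messages.append((sender_ip, data.decode("utf-8", errors="replace")))
    return messages
```

Calling receive(open_listener(), 1) in a loop and appending each message to a file on the receiving machine should preserve the last lines before the panic, since nothing has to be written on the crashing system itself.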

Markus



--
Por sperto kaj lerno ne sufiĉas eterno.
