
(solved) Re: Debian 10 freezes upon shutdown, reboot and logout




Short answer:

# apt-get install nvidia-detect

and then run nvidia-detect and install the driver package it recommends.

That is it.
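
For anyone who wants to script those two steps, here is a minimal sketch. It assumes that nvidia-detect prints the package name on the line right after "It is recommended to install the"; that is an assumption about the output format, so check the output by eye first. The function name recommended_driver is made up for this example.

```shell
# Hypothetical helper: read nvidia-detect output on stdin and print the
# package named on the line after "It is recommended to install the".
recommended_driver() {
    awk '/It is recommended to install the/ {
        getline                       # next line holds the package name
        gsub(/^[ \t]+|[ \t]+$/, "")   # strip surrounding whitespace
        print
        exit
    }'
}

# Usage, as root (not executed by this snippet):
#   apt-get install nvidia-detect
#   pkg=$(nvidia-detect | recommended_driver)
#   [ -n "$pkg" ] && apt-get install "$pkg"
```

If the function prints nothing, nvidia-detect did not recommend a package in that form and the driver has to be picked by hand.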

----

Longer version:

The nouveau driver is broken (on my system). It hangs the system not only during the operations named in the subject of this email, but also on init 1, init 3, a simple Alt+F1 to get a tty, and so on. Sometimes it hangs while I am doing nothing special, just using the computer (say, editing an image in GIMP).

To test this without changing the system and risking a "blank screen", I went into the BIOS and disabled the NVIDIA graphics card, leaving only the integrated Intel card running.

The system booted flawlessly without the nouveau driver and worked just fine. Then I installed the proprietary NVIDIA driver on top of it, which removed nouveau automatically in the process.

A few boots, reboots, shutdowns and logouts after, just to be sure, and voilà! The system is perfect.

Cheers,

Beco



On Mon, 30 Sep 2019 at 00:31, Beco <rcb@beco.cc> wrote:
Some updates on testing I'm doing:

$ init 1
also hangs

Booting from GRUB with init 3 and acpi=off let me reboot without hanging exactly once, but I have not been able to reproduce that behaviour.

Messages that appeared on screen while init 3 was running, or in kern.log:

---
TTM Buffer eviction failed
nouveau DRM failed idle channel 0
---


The messages below appear while the shutdown (or reboot) is in progress, so I have no terminal at hand, just the notifications scrolling by.
(I took a picture with my phone and typed them in by hand. Forgive any typos or abbreviations.)

After init 1, reboot or shutdown:

INFO: rcu_sched detected stalls on CPUs/tasks:
rcu: $3-...0: (0 ticks this GP) idle=51a/1/0x40000
rcu: $(detected by 6, t=5252 jiffies, g...)
NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
INFO: rcu_sched detected expedited stalls on CPUs/tasks:
rcu: blocking rcu_node structures:
watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [systemd:1]
...
repeats in an endless loop
...

INFO: task irq/87-ELAN0611:556 blocked for more than 120 seconds
    Tainted: G    W    OEL    4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u1
echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message
...
repeats for other tasks (haveged, systemd-logind, wpa_supplicant, dhclient, QQm1Thread, GlobalQueue, gdbus, ...)
...



i2c_transfer+0x51...
elan_i2c_get_report+0x1c...
? __switch_to+0x8c/0x440...
elan_isr+0x4b...
? __schedule+0x2aa...
? __wake_up_common_lock+...
? irq_finalize_oneshot....
irq_thread_fn+0x1f...
kthread+0x112
ret_from_fork+...
watchdog: BUG: soft lockup...
Modules linked in: ufs qnx4 hfsplus hfs minix vfat msdos fat jfs xfs dm_mod pc sst_ssp snd_hda_ext_core snd_soc_acpi_intel_match x86_pkg_temp_thermal btbcm irqbypass videobuf2_vmalloc btintel ideapad_l... pcspkr xor btrfs ecb zstd_decompre... xtables autofs4 media pcc_cpufreq...
CPU: 5 PID: 1 Comm: systemd Tainted: G
Hardware name: LENOVO 81G3/LNVNB161216, BIOS 6JCN23WW 01/23/2018
RIP: 0010:smp_call_function_many+0x1f8/0x250
Code: c7 e8 6c 3f 5f 00...
RAX: 0000000000000003 RBX: ffff9e.... RCX: fff...
RDX: 0000000000000001 RSI: 0000.... RDI: fff9e...
RBP: .... R08: .... R09:....
R10:...  R11:... R12:...
R13:... R14:.... R15:...
FS:... GS:... knlGS:...
CS: ... DS: ... ES: ... CR0:....
CR2: .... CR3:... CR4:....
Call Trace:
? tcp_v6_pre_connect...
? add_nops...
on_each_cpu...
text_poke_bp
__jump_label_transform...
arch_jump_label...
__jump_label_update...
__static_key_slow_dec_cpu...
__cgroup_bpf_detach...
__cgroup_bpf_prog_detach...
__x64_sys_bpf...
do_syscall_64+0x53/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:....
...
repeat, different memory dump
...
Task dump for CPU 3:
Xorg     R running task
Call Trace:
? nvif_object_fini...
? nouveau_vmm_fini
? nouveau_cli_fini
? nouveau_drm_postcl...
? drm_file_free.part....
? drm_release+...
? __fput
? task_work_run...
? do_exit
? handle_mm_fault
? do_group_exit
? __x64_sys_exit_group...
do_syscall_64+0x53/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9


--------------


I would rather read this in the logs than from a phone picture, but I am not sure yet which logs to look at (or maybe turn on).

dmesg shows only the boot messages, not what happens during the hang at shutdown.
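
A small sketch of how the saved kernel messages could be searched for the signatures quoted above. The helper name hang_lines is made up for this example, and the pattern list only covers the messages seen in this thread. Reading the previous boot with journalctl assumes the journal is stored persistently, which may not be the case on every system.

```shell
# Hypothetical helper: print only the log lines that match the failure
# signatures quoted in this thread. Reads the files given as arguments,
# or stdin when no file is given; extra grep options are passed through.
hang_lines() {
    grep -E 'nouveau|soft lockup|hard LOCKUP|rcu_sched detected|blocked for more than' "$@"
}

# Usage (not executed by this snippet):
#   hang_lines /var/log/kern.log          # log file written by rsyslog
#   journalctl -k -b -1 | hang_lines      # kernel messages of the previous boot
```

If the machine locks up hard, the last messages may never reach the disk, so an empty result does not prove that nothing was printed.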

Thanks for any help or tip.

Regards,
Beco






On Sun, 29 Sep 2019 at 21:27, Beco <rcb@beco.cc> wrote:

Hi,

I'm currently having issues with a LENOVO ideapad320.

Stretch ran smoothly, but this weekend I upgraded to Buster and now I have trouble shutting down the system.
Rebooting also freezes.

The watchdog reports that CPU #something has been stuck for 22 seconds.

I usually don't log out, since this is my personal laptop and I only use KDE. But I decided to give i3wm a try, and that is how I discovered that not only shutdown and reboot hang the system, but logout does too.

Every time, I have to finish the shutdown with SysRq commands to sync, unmount and power off.

I'm not sure which logs I need to look at, but kern.log shows the nouveau driver having problems. It is not much to go on, but it may be a hint.
I also tried adding ACPI=force to the GRUB command line, just as a test. Nothing changed.
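
For reference, a sketch of how such a parameter is usually made permanent on Debian. Kernel parameters are case-sensitive, so the option is spelled acpi=force in lowercase. The existing value "quiet" is an assumption about the default configuration; keep whatever is already in the file.

```shell
# /etc/default/grub (fragment). "quiet" is assumed to be the existing value.
GRUB_CMDLINE_LINUX_DEFAULT="quiet acpi=force"
```

After editing the file, run update-grub as root and reboot; /proc/cmdline shows what the kernel actually received.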

A small detail: my /var partition is separate from the root. I mention this because on the old system every reboot gave me an "unable to umount var" message. But it was ok. When I installed Stretch in 2018 I researched the problem, and it was only a warning related to journald.conf that could be solved by using "volatile" storage. Anyway, I even considered moving /var into the root partition just to test, but after reading more about the problem I decided not to pursue that.

It must be something else... Maybe I could try removing nouveau just to test. But since removing and re-adding nouveau is really hard, and the video is working great, I want to check with you guys first for fresh ideas.

Has anyone had trouble shutting down Buster? What other options do I have for finding the problem and a possible solution? Are there any other logs that may be of interest?
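
One way to keep nouveau from loading without removing any package is a modprobe blacklist file. A minimal sketch follows; the helper name write_nouveau_blacklist and the file name in the usage comment are only illustrative.

```shell
# Hypothetical helper: write a modprobe configuration that keeps the
# nouveau module from being loaded automatically. $1 is the target file.
write_nouveau_blacklist() {
    cat > "$1" <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF
}

# Usage, as root (not executed by this snippet):
#   write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
#   update-initramfs -u
#   reboot
```

Deleting the file and running update-initramfs -u again brings nouveau back, so the test is easy to undo.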

Thanks.

Bèco,
-- Linux user since it was called "just a hobby" (by L. Torvalds)



--
Dr Beco
A.I. researcher

"I know you think you understand what you thought I said but I'm not sure you realize that what you heard is not what I meant" -- Alan Greenspan

Creation date: pgp.mit.edu ID as of 2014-11-09

