
Bug#964839: Debugging results



Hi,

Thanks for the notes. I did not install the 5.8.7 kernel because it did not arrive automatically via apt (and I am not sure exactly which packages I should install). So I went straight to bisecting:

Bisecting did not help at all. The first few kernels built from the commits that bisect suggested did not boot at all (loading the initramfs hangs). Since I already had a Linux kernel source tree, I tried compiling the versions that I had tested previously (just to see whether they would boot and show the same problem): 5.6.14 and 5.7.6. That worked.

So I went with the 5.7.6 kernel and added printk calls to understand what happens:

The mechanism that sends the notifications on PM_SUSPEND allows many modules to register handlers. Adding prints there showed that the mechanism was not "completely" broken: handlers from other modules were invoked. However, the mechanism also allows a handler to "break" the notification chain, which prevents all subsequent handlers from being executed. This is done by returning "NOTIFY_STOP" from the handler. Searching the diff between 5.6.14 and 5.7.6 for that constant showed a suspicious line at the end of hci_suspend_notifier in net/bluetooth/hci_core.c:

return ret ? notifier_from_errno(-EBUSY) : NOTIFY_STOP;

This line was introduced with 9952d90ea2885d7cbf80cd233f694f09a9c0eaec (which is in 5.7.6 and not in 5.6.14).
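To illustrate why that return value matters, here is a minimal userspace sketch of a notifier chain. It is not the kernel's implementation: the NOTIFY_* values mirror include/linux/notifier.h, but EVENT_SUSPEND, the handler names, call_chain() and handlers_run() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes as defined in include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_STOP      (NOTIFY_OK | NOTIFY_STOP_MASK)

/* Hypothetical event value, used only in this sketch. */
#define EVENT_SUSPEND 1UL

typedef int (*handler_fn)(unsigned long event);

/* Counts how many handlers were actually invoked. */
static int calls_seen;

/* Models a handler that ends the chain, like the suspicious line. */
static int stopping_handler(unsigned long event)
{
    (void)event;
    calls_seen++;
    return NOTIFY_STOP;
}

/* Models the same handler after the fix: it lets the chain continue. */
static int passing_handler(unsigned long event)
{
    (void)event;
    calls_seen++;
    return NOTIFY_DONE;
}

/* Models a handler from another module, registered later in the chain. */
static int later_handler(unsigned long event)
{
    (void)event;
    calls_seen++;
    return NOTIFY_OK;
}

/* Simplified chain walk: stop as soon as a handler sets the stop bit. */
static int call_chain(handler_fn *chain, size_t n, unsigned long event)
{
    int ret = NOTIFY_DONE;

    for (size_t i = 0; i < n; i++) {
        ret = chain[i](event);
        if (ret & NOTIFY_STOP_MASK)
            break;
    }
    return ret;
}

/* Runs a two-entry chain and reports how many handlers were called. */
static int handlers_run(handler_fn first)
{
    handler_fn chain[] = { first, later_handler };

    assert(first != NULL);
    calls_seen = 0;
    call_chain(chain, sizeof(chain) / sizeof(chain[0]), EVENT_SUSPEND);
    return calls_seen;
}
```

In this model, handlers_run(stopping_handler) reaches only the first handler, while handlers_run(passing_handler) reaches both. That matches what the printk output suggested: handlers ahead of the bluetooth one run, those behind it never do.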

The handler was modified and improved upon between 5.7.6 and 5.8.7, but the problem is still present in 5.8.7, where the handler always returns NOTIFY_STOP. In master the bug seems to be resolved, as the handler there (correctly) returns NOTIFY_DONE. The fix was made in 24b065727ceba53cc5bec0e725672417154df24f. Currently that commit is contained only in the tags v5.9-rc1, v5.9-rc2 and v5.9-rc3. To verify that this is indeed the problem, I took the 5.7.6 kernel and changed:

return ret ? notifier_from_errno(-EBUSY) : NOTIFY_STOP;

to


return ret ? notifier_from_errno(-EBUSY) : NOTIFY_DONE;

which gave me a working 5.7.6 kernel.

Will you try to backport a fix, or do I just need to wait until 5.9 for a kernel that might handle suspend properly again?

--
Kind regards,
Felix Dörre

