
Bug#1057843: (no subject)



If it helps anyone, this is what I did on systems that had automatically rebooted into the problematic kernel.

First, I uninstalled the 6.1.0-14 kernel and rebooted back into 6.1.0-13.

Then I used `last` to identify the time between the problematic reboot into 6.1.0-14 and the deliberate reboot (back into 6.1.0-13).

This showed me that the 'problem time' was between 2023-12-10 06:00:00 and 2023-12-10 19:00:00 (UTC).

From there, I ran the following command to show me all files that were modified more or less within that window. I filtered out a bunch of things I don't care about, such as /var/log and other volatile parts of the filesystem.

find / -type f -newermt "2023-12-10 05:59:00" \! -newermt "2023-12-10 19:00:00" | egrep -v "/proc|/run|/sys|/var/log"
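A variant of the same search, as a rough sketch only: it wraps the find call in a small function (the name `modified_between` is made up for this example) and uses -prune instead of a grep filter, so the volatile directories are never descended into at all. It assumes a find that supports -newermt, such as the GNU one shipped with Debian. As far as I can tell, find reads these timestamps in the local time zone, so on a machine that is not set to UTC you may need to prefix the call with TZ=UTC.

```shell
# List regular files under ROOT (default /) whose modification time is
# after START and not after END. /proc, /sys, /run and /var/log are pruned.
# Hypothetical helper name; needs a find that supports -newermt.
modified_between() {
    mb_start=$1
    mb_end=$2
    mb_root=${3:-/}
    find "$mb_root" \
        \( -path /proc -o -path /sys -o -path /run -o -path /var/log \) -prune \
        -o -type f -newermt "$mb_start" ! -newermt "$mb_end" -print
}
```

Called like this for the window above:

    modified_between "2023-12-10 05:59:00" "2023-12-10 19:00:00"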

At least this gave me a somewhat small subset of files to manually check, which made it feel less daunting. Naturally it depends on your filesystem what files might've changed.

I was fortunate that none of the client applications I use seem to use O_DIRECT, so I found no corrupted files (so far).

Note that use of O_DIRECT is not a system-wide setting (e.g. it is not something you set in /etc/fstab for the ext4 filesystem); it's something that each application can choose to use when working with files. For example, I have changed MySQL's innodb_flush_method to O_DIRECT in the past (for performance), but it's not the default.

Out of the box, things like PostgreSQL use fsync (not direct I/O) by default.

I used some tools like 'git fsck' in git repositories that had changed during my 'problematic' time, and there were no issues - hopefully git does not use O_DIRECT.
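If you have many repositories, the git check can be looped. This is only a sketch under a few assumptions: the function name `fsck_repos_touched_between` is invented here, it only looks at ordinary (non-bare) repositories with a .git directory, and it treats a repository as "touched" if any file inside .git was modified in the window. It relies on find supporting -newermt and -quit.

```shell
# Run `git fsck --full` in every repository under ROOT (default .) whose
# .git directory contains a file modified after START and not after END.
# Hypothetical helper name; bare repositories are not detected.
fsck_repos_touched_between() {
    fr_start=$1
    fr_end=$2
    fr_root=${3:-.}
    find "$fr_root" -type d -name .git -prune -print |
    while IFS= read -r fr_gitdir; do
        fr_repo=${fr_gitdir%/.git}
        # One matching file is enough, so stop at the first hit.
        fr_hit=$(find "$fr_gitdir" -type f -newermt "$fr_start" \
                     ! -newermt "$fr_end" -print -quit)
        if [ -n "$fr_hit" ]; then
            printf 'checking %s\n' "$fr_repo"
            git -C "$fr_repo" fsck --full </dev/null ||
                printf 'PROBLEM in %s\n' "$fr_repo"
        fi
    done
}
```

For example:

    fsck_repos_touched_between "2023-12-10 05:59:00" "2023-12-10 19:00:00" /home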

I have not been able to find any definitive list of programs that use O_DIRECT out of the box. Maybe someone else will come up with such a list (if there even *are* programs that do so).
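In the absence of such a list, one way to get a partial picture is to look at which files are open with O_DIRECT on a running system right now. The sketch below reads the "flags:" line from /proc/PID/fdinfo/FD, which is printed in octal, and tests the O_DIRECT bit. Big caveats: the bit value 040000 is the one for x86/amd64 (other architectures use different values), the function names are invented for this example, and it only sees files that are open at the moment you run it, so it tells you nothing certain about what happened during the problem window. Run it as root to see other users' processes.

```shell
# Succeed if the octal flags value from /proc/PID/fdinfo/FD has the
# O_DIRECT bit set. 040000 is the x86/amd64 value (an assumption here).
has_o_direct() {
    [ $(( 0$1 & 040000 )) -ne 0 ]
}

# Print "PID COMMAND FILE" for every currently open fd that uses O_DIRECT.
# Hypothetical helper; processes may exit mid-scan, so errors are ignored.
list_o_direct_fds() {
    for od_info in /proc/[0-9]*/fdinfo/*; do
        od_flags=$(awk '/^flags:/ {print $2}' "$od_info" 2>/dev/null)
        [ -n "$od_flags" ] || continue
        if has_o_direct "$od_flags"; then
            od_pid=${od_info#/proc/}
            od_pid=${od_pid%%/*}
            od_fd=${od_info##*/}
            printf '%s %s %s\n' "$od_pid" \
                "$(cat "/proc/$od_pid/comm" 2>/dev/null)" \
                "$(readlink "/proc/$od_pid/fd/$od_fd" 2>/dev/null)"
        fi
    done
}
```

Running `list_o_direct_fds` with no arguments prints one line per matching file descriptor, or nothing if no process currently has a file open that way.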

As others have said here, things like fsck won't likely help you, unfortunately. The nature of the bug is not one that corrupts the journaling/filesystem structure, it is more about the *contents* of the file, which fsck can't comment on.


Finally, I wanted to note: if you did `apt purge linux-image-6.1.0-14-amd64`, you might need to `apt install linux-image-amd64` (the meta package) before you can successfully pick up the new linux-image-6.1.0-15-amd64 automatically as a dependency (say, when doing apt update; apt dist-upgrade). At least, I needed to, as I think the purge automatically removed the meta package, leaving me with no *automatic* upgrade to the new kernel via those commands.

Good luck!

