
Re: [LONG] Re: "Exec format error" bugs



2010/2/8 Guillem Jover <guillem@debian.org>:
> I'd rather fix the problem that's causing those files to be 0 length.
> That should generally never happen, I'm assuming they might just need an
> fsync on the directory, which we are not doing at all in general, there
> might be some fsyncs on files missing too. What's the difference in
> Ubuntu that causes all these reporters to suffer such error, I had never
> seen that one before, and it's not been reported in our BTS either. Are
> all those reporters using ext4 or ubifs? Anything else different from
> Debian you might be aware of?
>

This is a problem with ext4 combined with some missing fsync calls.
I'm able to reproduce it on a VM with an ext4 fs and the following test:

# apt-get install hello; sleep 20; echo b > /proc/sysrq-trigger
[simulates a system crash]
After reboot both installation and removal scripts are 0 bytes. You
will notice that hello.list was correctly written to disk.
$ ls -l /var/lib/dpkg/info/hello.*
-rw-r--r-- 1 root root 323 2010-02-09 00:42 /var/lib/dpkg/info/hello.list
-rwxr-xr-x 1 root root   0 2009-08-15 19:17 /var/lib/dpkg/info/hello.postinst
-rwxr-xr-x 1 root root   0 2009-08-15 19:17 /var/lib/dpkg/info/hello.prerm

If you repeat the test but add a sync before the system crash:
# apt-get install hello; sync; echo b > /proc/sysrq-trigger
After reboot the files are fine:
$ ls -l  /var/lib/dpkg/info/hello.*
-rw-r--r-- 1 root root 323 2010-02-09 00:46 /var/lib/dpkg/info/hello.list
-rwxr-xr-x 1 root root 103 2009-08-15 19:17 /var/lib/dpkg/info/hello.postinst
-rwxr-xr-x 1 root root  74 2009-08-15 19:17 /var/lib/dpkg/info/hello.prerm

If I adjust /proc/sys/vm/dirty_expire_centisecs to be below the sleep
time (for example 1000, i.e. 10 seconds, in the test above), then the
data is correctly written to disk.
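
To make the timing relationship concrete, here is a small sketch that
reads the expiry setting and compares it with the delay before the crash.
The function names are made up for illustration, and the comparison is a
simplification: the kernel's periodic writeback wakeups and the
filesystem's own behaviour also play a part, so treat the result as a
hint, not a guarantee.

```python
def dirty_expire_seconds(path="/proc/sys/vm/dirty_expire_centisecs"):
    """Return the dirty-page expiry age in seconds (the file stores centiseconds)."""
    with open(path) as f:
        return int(f.read().strip()) / 100.0


def expired_before_crash(delay_seconds, expire_seconds):
    """True if dirty data would have passed its expiry age before the crash.

    Simplified model (an assumption): data older than the expiry age is
    considered eligible for writeback, nothing more.
    """
    return expire_seconds < delay_seconds


if __name__ == "__main__":
    expire = dirty_expire_seconds()
    # 20 is the "sleep 20" used in the test above.
    print("expiry: %.1fs, expired before crash: %s"
          % (expire, expired_before_crash(20, expire)))
```

With the value set to 1000 the expiry age is 10 seconds, which is below
the 20 second sleep. With a larger value such as 3000 (30 seconds) it is
not, which matches the 0-byte files seen in the first test.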

So, some fsyncs should fix it.
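
As a rough illustration of what "some fsyncs" could look like, here is a
minimal sketch of the usual write, fsync, rename, fsync-the-directory
pattern. It is written in Python only for brevity and is not dpkg's
actual code; the function name, the temporary-file prefix and the default
mode are assumptions made for the example.

```python
import os
import tempfile


def write_file_durably(path, data, mode=0o755):
    """Replace `path` with `data` so a crash leaves old or new content, not 0 bytes."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the rename
    # below stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()                  # push Python's buffer to the kernel
            os.fchmod(tmp.fileno(), mode)
            os.fsync(tmp.fileno())       # force file data and metadata to disk
        os.rename(tmp_path, path)        # atomically put the new file in place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the new directory entry itself is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The important point is the ordering: the data is synced before the
rename, so the final name never refers to a file whose contents have not
reached the disk yet.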

--
Sincerely,

- JB

