Re: dpkg 1.15.6 is slow as hell
On Mon, Mar 15, 2010 at 05:30:01AM +0100, Guillem Jover wrote:
> On Fri, 2010-03-12 at 23:04:18 +0100, Raphael Hertzog wrote:
> > On Fri, 12 Mar 2010, Guillem Jover wrote:
> > > And in such case there's a high probability the files will be
> > > zero-length, which would be pretty bad for e.g. Essential packages.
> > So let's do the fsync only for essential packages? It's a good compromise
> > IMO.
> I thought about that also, but it would need to do that for all of the
> pseudo-essential set too by checking if the package is a dependency
> (direct or indirect) of any of the Essential ones. It would be cheaper
> and easier to just fsync() for priority required ones though. But then
> neither of those cover packages as important as the kernel, the boot
> loader or the file system fsck, to name a few, which might render the
> system unbootable.
The problem is that fsync() is a hammer that's rather bigger than the
one dpkg needs. (Note that the following is essentially just a
restatement of one of the sides of last year's argument about ext4's
rename() behaviour.)

dpkg requires two things from this part of the code, in this order of
importance:
1) Unpacked files should contain either the old contents or the new
contents, never something in between.
2) After the state is changed to "unpacked", the files should contain
the new contents, even after a crash.
I contend that the second requirement is at least somewhat negotiable,
since dpkg never used to sync, but obviously it would be nice if we can
keep it.
What dpkg does *not* need is:
3) After unpacking every file, wait until it has been written to disk.
But 3) is the principal thing fsync() gives us (modulo problems with
write caching on disks)! On filesystems that follow what Unix
applications historically always expected, rename() is sufficient to
provide both 1) and 2).
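For concreteness, the write-to-temporary-then-rename() pattern I mean
looks roughly like this (a hedged C sketch; the `.dpkg-new` suffix
mirrors dpkg's convention, but the helper itself is invented for
illustration):

```c
/* Atomic-replace sketch: write the new contents under a temporary
 * name, then rename() over the target.  Readers see either the old
 * file or the complete new one, never a half-written mixture. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

static int replace_file(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.dpkg-new", path);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len || close(fd) != 0) {
        unlink(tmp);
        return -1;
    }
    /* On filesystems where rename() acts as a write barrier, nothing
     * more is needed for requirement 1); an fsync(fd) before close()
     * is what requirement 2) would add, at a large performance cost. */
    return rename(tmp, path);
}
```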
Now, some people have been using filesystem configurations that don't
offer this guarantee on rename() (e.g. ext4 without data=ordered; I
understand that data=ordered is the default nowadays provided that the
journal file is of a version that can cope) - but this is essentially
because those filesystem configurations have been optimised to go as
fast as possible in benchmarks with certain applications, weakening a
reliability promise that admittedly was never guaranteed by POSIX. I
think there's a reasonable argument that those are not good filesystem
configurations to run dpkg on, but obviously people are doing so. The
question is just how far dpkg should go to support them. Obviously it
needs to provide its historical minimum guarantees, but how much more?
So, we need to examine what dpkg guarantees between a single file in a
package being unpacked, and the end of the unpack process (when the
state goes to "unpacked"). If there is a system crash in that period,
this has always been an invalid state: some of the files in the package
have been unpacked but not others, and packages are generally well
within their rights to fail when this is the case. You need to reunpack
the package to get the package database back into a sane state at this
point. Thus, in most cases it is actually not a big problem if you end
up with zero-sized files here after a system crash, as long as we try to
make sure that this doesn't happen once the package database is in a
sane state (i.e. the package has been fully unpacked), since you were
hosed to some extent anyway.
Sanity would be satisfied, on Linux, by sync()ing just before marking
the package "unpacked". (This would also assist with changes made by
preinst scripts, since anything written in shell typically can't fsync()
easily; another reason why filesystem configurations where rename() is
not a barrier are only suitable for carefully selected applications.)
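A minimal sketch of that proposal: unpack every file with the plain
write-then-rename path (no per-file fsync()), then issue one sync()
just before flipping the status to "unpacked". The function and output
here are illustrative stand-ins, not dpkg's real internals:

```c
/* One global barrier per package instead of one fsync() per file.
 * On Linux, sync() blocks until the writes have been written out. */
#include <stdio.h>
#include <unistd.h>

int mark_unpacked(const char *pkg)
{
    sync();  /* flush everything unpacked so far, in one call */
    printf("status: %s unpacked\n", pkg);  /* stand-in for the status db */
    return 0;
}
```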
I agree with Raphaël (as modified by you) that fsync()ing Priority:
required packages would be nice, since Essential packages are required
to work at all times, even when semi-unpacked - although I think you
could happily weaken that to fdatasync(), since you don't need things
like mtime updates to be synced atomically. I don't agree that we need
to do this for absolutely everything that might make the system
unbootable, particularly given the very severe performance degradation
that we're looking at here. It's sufficient, IMO, if we guarantee that
you can get in with a rescue disk and run dpkg.
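The fdatasync() weakening is cheap to express; fdatasync() flushes a
file's data plus only the metadata needed to read it back, skipping
e.g. the mtime update that fsync() would also force to disk. A hedged
sketch (the "essential" flag and helper name are invented):

```c
/* Pay the synchronous-write cost only for packages that must survive
 * a crash mid-upgrade; everything else relies on rename() ordering. */
#include <fcntl.h>
#include <unistd.h>

int sync_if_essential(int fd, int essential)
{
    if (!essential)
        return 0;            /* ordinary packages: no flush at all */
    return fdatasync(fd);    /* cheaper than fsync(): no mtime flush */
}
```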
dpkg has essentially been rediscovering what Firefox discovered, namely
that fsync() has a bad habit of hosing performance because in practice
you often end up waiting for far more than the single file you asked for
to sync out to disk, due to ordering requirements. Let's learn from
other people's mistakes and use a smaller hammer.
> > > In addition POSIX does not guarantee sync() will wait until the writes
> > > have finished (only Linux seems to be doing that though).
> > True, not sure what other systems are doing though.
> I checked GNU/Hurd and GNU/kFreeBSD where dpkg is used as a native
> package manager, and on both sync() does not wait. I don't know about
> OpenSolaris, Darwin, and others.
> For GNU/kFreeBSD:
> - freebsd/sys/kern/vfs_syscalls.c:sync(): Passes MNT_NOWAIT as flags.
> For GNU/Hurd, it can be changed by setting the file system translator
> fsysopts to --sync, but that's a global setting for all I/O operations:
> - glibc/sysdeps/mach/hurd/sync.c:sync(): Passes 0 as wait argument.
Don't those systems use filesystems that consider rename() as a barrier
in the traditional sense, anyway? I thought they did. I would say
that, if they don't, their maintainers can try to do something about it
in their own time; we're not making anything worse than it always has
been.
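For reference, since sync() on those kernels only schedules the flush,
the one behaviour POSIX does guarantee to wait is fsync() on a still-open
descriptor, so truly portable durability for a single file looks like
this (a sketch; the helper name is invented):

```c
/* Portable durable write: write the data and fsync() before close(),
 * rather than relying on sync() having waited. */
#include <fcntl.h>
#include <unistd.h>

int write_durably(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```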
Colin Watson [email@example.com]