
Re: Bug#605009: serious performance regression with ext4



On Fri, Nov 26, 2010 at 03:53:27PM +0100, Raphael Hertzog wrote:
> Just to sum up what dpkg --unpack does in 1.15.8.6:
> 1/ set the package status as half-installed/reinst-required
> 2/ extract all the new files as *.dpkg-new
> 3/ for all the unpacked files: fsync(foo.dpkg-new) followed by
>    rename(foo.dpkg-new, foo)

What exactly are you doing?

1) Suppose the package contains files "a", "b", and "c".  Which of the
following are you doing?

a)  extract(a.dpkg-new); fsync(a.dpkg-new); rename(a.dpkg-new, a);
    extract(b.dpkg-new); fsync(b.dpkg-new); rename(b.dpkg-new, b);
    extract(c.dpkg-new); fsync(c.dpkg-new); rename(c.dpkg-new, c);

or

b)  extract(a.dpkg-new); fsync(a.dpkg-new);
    extract(b.dpkg-new); fsync(b.dpkg-new);
    extract(c.dpkg-new); fsync(c.dpkg-new);
    rename(a.dpkg-new, a);
    rename(b.dpkg-new, b);
    rename(c.dpkg-new, c);

or

c)  extract(a.dpkg-new);
    extract(b.dpkg-new);
    extract(c.dpkg-new);
    fsync(a.dpkg-new);
    fsync(b.dpkg-new);
    fsync(c.dpkg-new);
    rename(a.dpkg-new, a);
    rename(b.dpkg-new, b);
    rename(c.dpkg-new, c);


(c) will perform best on most file systems, including ext4.  As a
further optimization, if "b" and "c" do not already exist, it would
of course be better to extract into "b" and "c" directly and skip the
rename, i.e.:

d)  extract(a.dpkg-new);
    extract(b);			# assuming the file "b" does not yet exist
    extract(c);			# assuming the file "c" does not yet exist
    fsync(a.dpkg-new);
    fsync(b);
    fsync(c);
    rename(a.dpkg-new, a);

... and then set the package status as unpacked.
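
For illustration only, here is a minimal Python sketch of the batched
ordering in (c), with the direct-write shortcut from (d).  It is not
dpkg's code: the function name unpack_batched and the dict of
name -> bytes standing in for the package contents are assumptions
made up for this example, and it only handles a flat directory.

```python
import os


def unpack_batched(root, files):
    """Write all files, then fsync all, then rename all (hypothetical helper).

    `files` maps a file name to its bytes; this stands in for the
    contents of a package and is an assumption of this sketch.
    Returns the list of final paths that needed a rename.
    """
    staged = []  # (path actually written, final path)

    # Phase 1: extract everything without syncing.
    for name, data in files.items():
        final = os.path.join(root, name)
        # As in (d): write directly if the target does not exist yet,
        # otherwise stage the new contents as name.dpkg-new.
        written = final + ".dpkg-new" if os.path.exists(final) else final
        with open(written, "wb") as f:
            f.write(data)
        staged.append((written, final))

    # Phase 2: fsync every file that was written.
    for written, _ in staged:
        with open(written, "r+b") as f:
            os.fsync(f.fileno())

    # Phase 3: rename only the staged *.dpkg-new files into place.
    renamed = []
    for written, final in staged:
        if written != final:
            os.rename(written, final)
            renamed.append(final)
    return renamed
```

The point of the three separate loops is that no rename is issued
until every fsync has returned, which is the ordering shown in (c)
and (d).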

I am guessing you are doing (a) today --- am I right?  (c) or (d)
would be best.

						- Ted

