Re: Bug#605009: serious performance regression with ext4
On Fri, Nov 26, 2010 at 03:53:27PM +0100, Raphael Hertzog wrote:
> Just to sum up what dpkg --unpack does in 1.15.8.6:
> 1/ set the package status as half-installed/reinst-required
> 2/ extract all the new files as *.dpkg-new
> 3/ for all the unpacked files: fsync(foo.dpkg-new) followed by
> rename(foo.dpkg-new, foo)
What exactly are you doing?
Suppose the package contains files "a", "b", and "c". Which of the
following are you doing?
a) extract(a.dpkg-new); fsync(a.dpkg-new); rename(a.dpkg-new, a);
extract(b.dpkg-new); fsync(b.dpkg-new); rename(b.dpkg-new, b);
extract(c.dpkg-new); fsync(c.dpkg-new); rename(c.dpkg-new, c);
or
b) extract(a.dpkg-new); fsync(a.dpkg-new);
extract(b.dpkg-new); fsync(b.dpkg-new);
extract(c.dpkg-new); fsync(c.dpkg-new);
rename(a.dpkg-new, a);
rename(b.dpkg-new, b);
rename(c.dpkg-new, c);
or
c) extract(a.dpkg-new);
extract(b.dpkg-new);
extract(c.dpkg-new);
fsync(a.dpkg-new);
fsync(b.dpkg-new);
fsync(c.dpkg-new);
rename(a.dpkg-new, a);
rename(b.dpkg-new, b);
rename(c.dpkg-new, c);
(c) will perform the best for most file systems, including ext4. As a
further optimization, if "b" and "c" do not already exist, it would be
better to extract into "b" and "c" directly and skip the rename,
i.e.:
d) extract(a.dpkg-new);
extract(b); # assuming the file "b" does not yet exist
extract(c); # assuming the file "c" does not yet exist
fsync(a.dpkg-new);
fsync(b);
fsync(c);
rename(a.dpkg-new, a);
... and then set the package status as unpacked.
I am guessing you are doing (a) today --- am I right? (c) or (d)
would be best.
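
For illustration only, here is a rough sketch of the (c)/(d) ordering in
Python. It is not dpkg's code: the function name unpack_batched and the
dict of file contents standing in for the package archive are both
made up for this example. It extracts everything first, then fsyncs
everything, then does the renames, and writes directly to the final
name when the target does not exist yet.

```python
import os


def unpack_batched(dest_dir, files):
    """Unpack `files` (hypothetical dict: name -> bytes) into dest_dir.

    Order of operations follows variants (c)/(d): extract all files,
    then fsync all files, then rename all files.
    Returns the list of final paths that needed a rename.
    """
    pending = []  # (path actually written, final path)

    # Phase 1: extract everything, with no fsync in between.
    for name, data in files.items():
        final = os.path.join(dest_dir, name)
        # Variant (d): if the target does not exist yet, write it
        # directly; otherwise write to a ".dpkg-new" sibling.
        written = final + ".dpkg-new" if os.path.exists(final) else final
        with open(written, "wb") as f:
            f.write(data)
        pending.append((written, final))

    # Phase 2: fsync every extracted file.
    for written, _ in pending:
        with open(written, "r+b") as f:
            os.fsync(f.fileno())

    # Phase 3: rename the ".dpkg-new" files over the old ones.
    renamed = []
    for written, final in pending:
        if written != final:
            os.rename(written, final)
            renamed.append(final)
    return renamed
```

Calling unpack_batched(d, {"a": b"A", "b": b"B", "c": b"C"}) on a
directory d that already holds an old "a" would write a.dpkg-new, b,
and c, fsync all three, and then rename only a.dpkg-new to a. Setting
the package status to unpacked would come after that and is left out
here.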
- Ted