
Re: Pre-approval request for dpkg sync() changes for squeeze



Hi!

On Sat, 2010-11-06 at 10:20:29 +0100, Raphael Hertzog wrote:
> On Sat, 06 Nov 2010, Guillem Jover wrote:
> > Finally, while accepting patch 2) alone might make sense, accepting 1)
> > alone would not.
> 
> BTW, I think we should go with patch 2 alone currently (i.e. just add
> --force-unsafe-io).
> 
> Continuing to use sync() instead of fsync() is the best compromise we have
> currently to have reasonable performance on Linux with all filesystems.

I disagree, as that does not seem to be the case currently anyway.

> I don't think that introducing a big performance degradation on ext4 users
> is a good idea at this point of the release cycle even if they have a
> work-around with --force-unsafe-io.

Installing/upgrading software is not something you usually do that often
on a stable system. And I'm planning to apply the switch to fsync()
for 1.16.0 so that will almost immediately affect testing/sid once the
freeze is over anyway.

> I would like however to see --force-unsafe-io so that we can have the best
> performance when we want it (in d-i, in temporary build chroots, etc.).

Well, I think it's completely unfair that, due to "design flaws" (some
might want to call that conformance to the spec) in one file system,
everyone has to suffer. We have to remember that all this is mostly
for ext4's benefit after all. I'm always eager to do the right thing, in
this case using fsync() if it fixes real problems. But when the correct
and simple solution [0], which is also what was initially proposed and
expected (just look at man mount, auto_da_alloc, among others) by the
same people who designed that file system, is impracticable on *that*
file system, then that's just not right.

 [0] Not to mention the delayed rename() gymnastics we have had to do
     in dpkg, and possible future additional complexity (aio, fsync()
     in threads or subprocesses, etc.?).

The worst that could happen on other file systems w/o the sync()/fsync()
before rename()s for extracted files was that the dpkg database might
get slightly out of sync relative to what was installed on disk, but
that's at most confusing, nothing compared to getting zero-length files
all over the place.
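
To make the pattern under discussion concrete, here is a minimal sketch
of the "write, fsync(), then rename()" sequence, with a switch that
skips the flush in the spirit of --force-unsafe-io. It is not dpkg's
code; install_file() and its unsafe_io parameter are hypothetical names
used only for illustration.

```python
import os
import tempfile


def install_file(path, data, unsafe_io=False):
    """Replace `path` with `data` via a temporary file and a rename.

    Hypothetical helper, not dpkg's implementation. With unsafe_io=True
    the fsync() is skipped, which is faster but can leave a zero-length
    file behind after a crash on file systems with delayed allocation.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so that the final
    # rename stays within a single file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".new-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            if not unsafe_io:
                # Push the file contents to stable storage before the
                # new name becomes visible.
                os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling install_file(path, data) for every extracted file is where the
per-file fsync() cost comes from; passing unsafe_io=True trades that
safety for speed, which may be acceptable in throwaway chroots.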

Switching to sync() was a mistake, a workaround that seemingly improved
things, but a workaround nonetheless, with pernicious side effects.
Obviously --force-unsafe-io is also a workaround, one I'd rather not
have, but at least it's portable and has some utility outside ext4.

The zero-length problem should only affect newly installed systems
with ext4 anyway. Those users might be ok with dpkg doing safe I/O,
but they will not be ok with, say, user data getting lost just as
easily, which I think is worse than losing system files. For that
matter, using almost any software (except a few programs, probably
database related) will be dangerous; just check how few projects use
fsync() at all! Not even rpm does for the extracted files.

Using nodelalloc, as Sven pointed out, is IMO the only sane option for
those users if they value their data (along with data=ordered). That is
something that should probably be mentioned in the release notes anyway,
regardless of any dpkg change. Or just recommend not using ext4 at all?
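
For anyone wanting to check their own systems, a rough sketch of how
to spot ext4 mounts that still use delayed allocation follows. It
assumes the kernel lists nodelalloc in /proc/mounts when the option is
set; ext4_mounts_with_delalloc() is a hypothetical helper name.

```python
def ext4_mounts_with_delalloc(mounts_text):
    """Return mount points of ext4 file systems lacking nodelalloc.

    `mounts_text` is expected in /proc/mounts format:
    device, mount point, type and options separated by whitespace.
    """
    risky = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "ext4" and "nodelalloc" not in options.split(","):
            risky.append(mount_point)
    return risky
```

On Linux it could be fed with open("/proc/mounts").read(); any mount
point it returns would be a candidate for adding nodelalloc to the
options in /etc/fstab.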

This and my other related mails might come across as quite loaded,
but the situation regarding ext4 does not leave many options
available, so I've been feeling caught between a rock and a hard place.

Anyway, I think I'm starting to repeat myself, and now that I've made
my case, I'll leave the release team to deliberate. In case they
decline, I'll probably prepare dpkg backports to be placed somewhere.

thanks,
guillem

