Re: Safe File Update (atomic)

On Thu, 30 Dec 2010, Olaf van der Spek wrote:
> On Thu, Dec 30, 2010 at 12:46 PM, Henrique de Moraes Holschuh
> <hmh@debian.org> wrote:
> >  write temp file (in same directory as file to be replaced), fsync temp
> What if the target name is actually a symlink? To a different volume?

Indeed. You have to check that first, of course :-(  This is about handling
such operations safely: symlinks always have to be dereferenced and their
target checked.  After that, you operate on the target; if the symlink
changes, your operations will not.
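
A minimal sketch of that up-front check, in Python only for brevity.  The
name resolve_target is made up for illustration, and refusing anything that
is not a regular file is my assumption about what "checked" should mean:

```python
import os
import stat

def resolve_target(path):
    """Return (real_path, directory) for the file that would be replaced.

    Hypothetical helper: symlinks are followed once, up front, so later
    steps use the resolved name and ignore changes to the symlink itself.
    """
    real = os.path.realpath(path)          # dereference every symlink in the path
    try:
        st = os.lstat(real)
    except FileNotFoundError:
        st = None                          # target does not exist yet; that is fine
    if st is not None and not stat.S_ISREG(st.st_mode):
        raise ValueError(f"{real} is not a regular file")
    # The temp file has to be created in this directory, which is on the
    # same volume as the real target even if the symlink lives elsewhere.
    return real, os.path.dirname(real)
```

This does not close the race between the check and the later rename; it
only makes sure the temp file ends up next to the real target.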

> What if you're not allowed to create a file in that dir.

You fail the write.  Or the user has to request the unsafe handling
(truncate + write).  Or you have to detect it will happen and switch modes
if you're allowed to.
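
Roughly, those three outcomes could be sketched like this (Python for
brevity; choose_mode and allow_unsafe are hypothetical names, not an
existing API):

```python
import os

def choose_mode(directory, allow_unsafe=False):
    """Return "atomic" or "truncate", or raise PermissionError.

    os.access() is only a hint: the answer can change before we act on it,
    so the real write must still handle a failure to create the temp file.
    """
    if os.access(directory, os.W_OK | os.X_OK):
        return "atomic"                    # temp file + rename is possible
    if allow_unsafe:
        return "truncate"                  # user explicitly asked for the unsafe path
    raise PermissionError(f"cannot create a temp file in {directory}")
```

The point is only that the unsafe path is never picked silently: either the
caller opted in, or the write fails.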

> > If we could use some syscall to make [1] into a simple barrier request
> > (guaranteed to degrade to fsync if barriers are not operating), it would
> > be better performance-wise.  This is what one should request of libc and
> > the kernels with a non-zero chance of getting it implemented (in fact,
> > it might even already exist).
> My proposal was O_ATOMIC:
> // begin transaction
> open(fname, O_ATOMIC | O_TRUNC);
> write; // 0+ times
> close;
> Seems like the ideal API from the app's point of view.

POSIX filesystems do not support it, so you'd need glibc to do everything
your application would otherwise have to do to get that atomicity.  I.e. it
should go in a separate lib anyway, and you will have to code for it in the
app :(

It is not transparent.  It cannot be.  What about mmap()?  What about
read+write patterns?

At most you could have an "open+write+close" function that encapsulates most
of the crap, with a few options to tell it what to do if it finds a symlink
or mismatched owner, what to do if it cannot do it in an atomic way, etc.
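
Something along these lines, as a sketch only.  atomic_write and its two
options are hypothetical, the owner check is left out, and it assumes a
POSIX system where fsync() on a directory fd works (it does on Linux):

```python
import os
import tempfile

def atomic_write(path, data, *, follow_symlinks=True, allow_unsafe=False):
    """Replace the contents of `path` with `data` (bytes).

    Hypothetical API. follow_symlinks=False refuses symlinked targets;
    allow_unsafe=True falls back to truncate + write when no temp file can
    be created next to the target. Returns the mode that was used.
    """
    if os.path.islink(path):
        if not follow_symlinks:
            raise OSError(f"{path} is a symlink")
        path = os.path.realpath(path)
    directory = os.path.dirname(os.path.abspath(path))
    try:
        fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    except OSError:
        if not allow_unsafe:
            raise
        with open(path, "wb") as f:        # unsafe: a crash can leave a partial file
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        return "truncate"
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())           # data on disk before the rename
        os.rename(tmp, path)               # atomic within one filesystem
    except BaseException:
        os.unlink(tmp)                     # do not leave the temp file behind
        raise
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)                   # make the rename itself durable
    finally:
        os.close(dir_fd)
    return "atomic"
```

Note that mkstemp() creates the file with mode 0600, so permissions and
ownership of the original are not preserved here; that would be one more
option.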

I suppose one could actually ask for a non-POSIX interface to do all three
of those operations in one syscall, but I don't think the kernel people will
want to implement it.  It would make sense only if object stores become
commonplace (where this thing is likely an object store primitive, anyway).

> >> I've brought this up on linux-fsdevel and linux-ext4 but they (Ted)
> >> claim those exceptions aren't really a problem.
> >
> > Indeed they are not.  Code has been dealing with them for years.  You
> Code has been wrong for years too, based on the recent reports about
> file corruption with ext4.

Code written to *deal with files safely* by people who wanted to get it
right and actually checked what needs to be done, has been right for years.
And has piss-poor performance.

Code written by some random Joe who has no clue about the braindamages of
POSIX and Unix, well... this thread shows how much crap is really needed.

One can, obviously, have most filesystems be super-safe, and create a new
fadvise or something to say "this is crap, be unsafe if you can".
Performance will be poor, everything will be safe, and the extra fsyncs()
will not hurt much because the fs would do it anyway.

> > name the temp file properly, and teach your program to clean old ones up
> > *safely* (see vim swap file handling for an example) when it starts.
> What about restoring meta-data? File-owner?

Hmm, yes, more steps if you want to do something like that, as you must do
it with the target open in exclusive mode.  Close the target only after the
rename went OK.

But if the file owner is not yourself, you really should change it, not to
mention you might not want to complete the operation in the first place.
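
A sketch of the metadata step, working on open file descriptors so the
values come from the inode that was actually opened.  copy_metadata is a
made-up name, and returning a flag so the caller can decide whether to go
ahead is my assumption about how one would handle an owner mismatch:

```python
import os
import stat

def copy_metadata(target_fd, tmp_fd):
    """Copy owner/group (if permitted) and permission bits to the temp file.

    Returns True if owner and group now match the target, False if the
    replacement would end up with a different owner.
    """
    st = os.fstat(target_fd)
    try:
        os.fchown(tmp_fd, st.st_uid, st.st_gid)
    except PermissionError:
        pass                               # not privileged enough to give files away
    # chmod after chown, because chown may clear setuid/setgid bits
    os.fchmod(tmp_fd, stat.S_IMODE(st.st_mode))
    new = os.fstat(tmp_fd)
    return (new.st_uid, new.st_gid) == (st.st_uid, st.st_gid)
```

This would run before the rename, with the target still open.  ACLs and
extended attributes are not covered.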

A lib for this is a really good idea :p

> > vim is a good example: nobody gets surprised by vim swap-files left over
> > when vim/computer crashes. And vim will do something smart with them if
> > it finds them in the current directory when it is started.
> I'm sure the vim code is far from trivial. I think this complexity is
> part of the reason most apps don't bother.

That I agree with completely.

> > BTW: safely removing a file is also tricky.  AFAIK, one must open it RW,
> > in exclusive mode. stat it by fd and check whether it is what one
> Exclusive mode? Linux doesn't know about mandatory locking (AFAIK).

Yeah... races everywhere...

> > expects (regular file, ownership).  unlink it by fd.  close the fd.
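
For what it's worth, a sketch of those removal steps in Python.  As far as I
know POSIX has no unlink-by-fd, so this version (safe_unlink is a made-up
name) re-checks the name against the inspected inode just before unlinking,
which narrows the race but does not remove it:

```python
import os
import stat

def safe_unlink(path, expected_uid=None):
    """Unlink `path` only if it is a regular file with the expected owner."""
    if expected_uid is None:
        expected_uid = os.getuid()
    fd = os.open(path, os.O_RDWR | os.O_NOFOLLOW)   # fails on a symlink
    try:
        st = os.fstat(fd)                  # stat by fd, not by name
        if not stat.S_ISREG(st.st_mode):
            raise ValueError(f"{path} is not a regular file")
        if st.st_uid != expected_uid:
            raise PermissionError(f"{path} has unexpected owner {st.st_uid}")
        # Re-check that the name still refers to the inode we inspected.
        # A window remains between this lstat() and the unlink().
        now = os.lstat(path)
        if (now.st_dev, now.st_ino) != (st.st_dev, st.st_ino):
            raise RuntimeError(f"{path} changed while being checked")
        os.unlink(path)
    finally:
        os.close(fd)
```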
> >
> >> Is there a code snippet or lib function that handles this properly?
> >
> > I don't know.  I'd be interested in the answer, though :-)
> I'll ask glibc.

This really should be in a separate lib.  You want it to be usable outside
of glibc systems, and you CAN implement it (slow though it will be) on
anything POSIX.  You need only some help from the kernel to speed it up, and
that has to be detected at compile time (support) and runtime (availability
of the feature) anyway.
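
The two-level detection could look like this sketch.  fdatasync() is used
purely as a stand-in for whatever faster primitive the kernel might offer;
it is not the barrier request discussed earlier, and flush_to_disk is a
made-up name:

```python
import errno
import os

# "Compile time" analogue: does this platform expose the call at all?
HAVE_FDATASYNC = hasattr(os, "fdatasync")

def flush_to_disk(fd):
    """Flush file data, using the cheaper call when it is available.

    Returns the name of the call that succeeded.
    """
    if HAVE_FDATASYNC:
        try:
            os.fdatasync(fd)
            return "fdatasync"
        except OSError as e:
            # Runtime check: the call exists, but this kernel or filesystem
            # rejects it. Anything else is a real error.
            if e.errno not in (errno.ENOSYS, errno.EINVAL):
                raise
    os.fsync(fd)                           # slow path that works on anything POSIX
    return "fsync"
```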

  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
