
Re: Safe File Update (atomic)



On Sun, 02 Jan 2011, Olaf van der Spek wrote:
> On Sun, Jan 2, 2011 at 1:52 PM, Henrique de Moraes Holschuh
> <hmh@debian.org> wrote:
> > Olaf, O_ATOMIC is "difficult" in the kernel sense and in the long run.  It
> > is an API that is too hard to implement in a sane way, with too many
> > boundary conditions.
> >
> > OTOH, you don't need O_ATOMIC.  You need a way for easy application access
> > to a saner/simpler way to deal with files that require atomic replacement.
> > Time to switch to a plan B that can achieve it.  Do not lose track of your
> > final goal, and stop wasting time with O_ATOMIC (and aggravating fs
> > developers, which can only hurt your goal in the end).
> 
> Maybe I wasn't clear, in that case I'm sorry. To me, O_ATOMIC is
> mostly about the userspace API. The implementation isn't (that)
> important, so you're right.

Ok.  Here is one meta-API that could be useful (and yes, it is most likely
exactly what you call O_ATOMIC).  Whatever, my body is at 38.4°C right now
and the fever is still climbing, so I don't even claim perfect sanity at
the moment.

Ted, if I could impose on you a single question, please reply with one of:
a short "no, already explained why the idea below is bogus elsewhere", "no,
new idea but wouldn't work because of a,b,c", "no, but I don't care to
explain why right now", or "yes, could work depending on the details".  I
won't pester you about it.

1. Create unlinked file fd (benefits from kernel support, but doesn't
require it).  If a filesystem cannot support this or the boundary conditions
are unacceptable, fail.  This needs the destination name, so that the
unlinked create happens on the right fs and directory (otherwise attempts to
link the file later would have to fail if the fs is different).

2. fd works as any normal fd to an unlinked regular file.

3. Create a link() variant that can do unlink+link atomically.  Maybe this
already exists, otherwise it needs kernel support.

The behaviour of (3) should allow a synchronous wait for an fsync() of the
file and a sync of the metadata of the parent dir.  It doesn't matter much
whether (3) does everything itself, whether the caller just calls fsync(),
or whether an fclose() variant is created that does it.
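
For comparison, here is a rough sketch of how close userspace can get to
steps 1-3 with calls that already exist.  It is not the API proposed above:
it assumes the traditional temp-file-plus-rename() approach, so the new file
has a (temporary) name in the destination directory instead of being
unlinked, and rename() stands in for the atomic unlink+link.  The function
name replace_file_atomically is made up for illustration.

```python
import os
import tempfile


def replace_file_atomically(dest_path, data):
    """Replace dest_path with data; readers see the old or the new file."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))

    # Step 1 (approximation): create the new file in the destination
    # directory, so the final rename stays on one filesystem.
    # Assumption: a named temp file stands in for the unlinked fd.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        # Step 2: the fd is used like any other regular file.
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # wait for the data before linking it in

        # Step 3 (approximation): rename() replaces an existing
        # destination in a single step.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Nothing was replaced; remove the leftover temp file if present.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Sync the parent directory so the new name itself is on disk.
    dir_fd = os.open(dest_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that mkstemp() creates the file with mode 0600 and a new owner/inode,
so a real library would also have to decide what to do about permissions,
ownership, ACLs and xattrs of the file being replaced.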

Whether this should map to O_ATOMIC in glibc or be something new, I don't
care.  But if it is a flag, I'd highly suggest naming it O_CREATEUNLINKED or
something else that won't give people wrong ideas, as _nothing_ but the
final inode linking is atomic.

This will work for other uses, too.  It is a safe and easy way to create
temporary files for ipc, etc.
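
As a small illustration of the temporary-file use, the sketch below relies
on Python's tempfile.TemporaryFile, which on POSIX systems returns a file
that has no directory entry (either it is never created or it is removed
right after creation).  The helper names make_scratch_file and
read_via_duplicate are hypothetical, and the dup()ed descriptor only stands
in for a descriptor handed to a cooperating process.

```python
import os
import tempfile


def make_scratch_file(payload, directory=None):
    """Return an open binary file that has no name on POSIX systems."""
    f = tempfile.TemporaryFile(dir=directory)
    f.write(payload)
    f.flush()
    f.seek(0)
    return f


def read_via_duplicate(f, length):
    """Read through a dup()ed descriptor, as a cooperating reader would."""
    dup_fd = os.dup(f.fileno())
    try:
        # pread() reads at an explicit offset and leaves the shared
        # file position untouched.
        return os.pread(dup_fd, length, 0)
    finally:
        os.close(dup_fd)
```

The file's storage is released once the last descriptor is closed, so
nothing is left behind if the processes involved crash.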

Or not, maybe it is completely broken and I should not write while running
a fever.

> A userspace lib is fine with me. In fact, I've been asking for it
> multiple times. Result: no response.

You will need to actually find someone who wants to write such a lib, or pay
someone to, or fire up a public fundraising campaign and contract it from
someone the community would trust to actually be able to complete the job,
etc.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

