
Re: reading an empty directory after reboot is very slow



On 2015-04-26 13:26:41 -0500, David Wright wrote:
> Quoting Vincent Lefevre (vincent@vinc17.net):
> > On 2015-04-24 22:01:37 -0500, David Wright wrote:
> > > Quoting Vincent Lefevre (vincent@vinc17.net):
> > > > So, I would say that this is a bug. POSIX says in
> > > > http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html
> > > 
> > > It may well be. But I'm just presenting facts about the
> > > filesystems that I assumed you were using when you posted
> > > https://lists.debian.org/debian-user/2015/04/msg00651.html ,
> > > not the theory of a POSIX-compliant system.
> > 
> > What's important when designing a file system is what should happen,
> > not what actually happens.
> 
> Yes, but I don't design filesystems. I use them.

The context was the design of ext3: Earlier in this thread, Bob Proulx
said: "(I always wondered why they didn't simply take the *last* entry
and move it down to the deleted entry and simply keep the array always
compacted. I wonder. But they didn't do it that way.)"

And Kushal Kumaran replied: "Moving entries around breaks ongoing
readdir operations. If a readdir has gone past the file being removed,
and you moved the last entry there, the entry being moved would be
missed, despite *it* not being the entry added or removed."

And you replied: "I don't think this matters. There's no guarantee
that another process isn't writing to that directory while you are
working your way along the entries."

Perhaps I should have pointed out that what Kushal Kumaran said was a
POSIX requirement. So, this matters, and I replied: "But with the
current solution (no automatic moving of an entry), you can't miss
an entry that hasn't been removed."

Now, what I didn't know and what you showed is that ext3 actually
moves the entry for renames. This part is a bug. But what I also
meant is that not moving the last entry (i.e. not implementing
Bob Proulx's suggestion) is necessary to make sure that this file
won't be missed.

In short, ext3 is bad for renames, but if Bob Proulx's suggestion
were implemented (without a workaround such as caching the whole
directory after it is opened[*]), it would be much worse as arbitrary
entries would be missed in a readdir sequence.

[*] Doing that would also solve the problem with renames.
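
To make the difference concrete, here is a toy model in Python. It is
only a sketch: the directory is a plain list of slots and the reader is
an index into it, which is an assumption for illustration and not how
ext3 lays out directories. The function name scan() and its parameters
are hypothetical. One entry is unlinked in the middle of the scan,
either by moving the last entry into the hole (Bob Proulx's suggestion)
or by just marking the slot as free.

```python
def scan(entries, delete_name, delete_after, compact):
    """Simulate one readdir() sequence racing with one unlink().

    entries      -- names in slot order (toy model, not the ext3 layout)
    delete_name  -- entry unlinked by "another process" during the scan
    delete_after -- the unlink happens once this many names were returned
    compact      -- True: move the last entry into the freed slot
                    False: leave a hole (None) in the freed slot
    """
    slots = list(entries)
    seen = []
    pos = 0                      # the reader's position in the directory
    while pos < len(slots):
        name = slots[pos]
        pos += 1
        if name is not None:     # holes are skipped by the reader
            seen.append(name)
        # The concurrent unlink fires exactly once, at the chosen moment.
        if len(seen) == delete_after and delete_name in slots:
            i = slots.index(delete_name)
            if compact:
                last = slots.pop()
                if i < len(slots):   # nothing to move if it was the last slot
                    slots[i] = last
            else:
                slots[i] = None
    return seen


# "a" is removed after the reader has already returned a, b and c.
print(scan(list("abcde"), "a", 3, compact=True))   # "e" is never returned
print(scan(list("abcde"), "a", 3, compact=False))  # every surviving name is returned
```

With compaction, "e" lands in a slot the reader has already passed, so
it is missed although it was neither added nor removed. Without
compaction, the only name affected is the removed one.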

> > In case of bugs, anything can happen, so
> > that's not interesting at all.
> 
> Au contraire mon frère, they're very important as you have to live
> with them and work around them. [...]

I agree for an end user, but see the context. Now, IMHO, for renames,
perhaps the ext3 behavior is not much of a problem (at least if the user
is aware of it) because in any case, readdir() may give the old
filename and this filename is quite useless in practice (one can't
stat or open it, i.e. a second directory parsing is needed in this
case). But missing arbitrary entries corresponding to filenames that
have been neither removed nor renamed would be much more critical,
and I don't think there is a possible safe workaround, e.g. if files
are constantly added and removed in the directory (for instance, I do
exhaustive tests as I've explained somewhere in this thread, and I use
such a directory with results automatically added and removed once
taken into account -- no renames here --, but it is quite important
that old results are found during a single readdir sequence).
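
For the rename case, the "second directory parsing" could look like the
following Python sketch. The function name collect_results() and the
max_passes limit are hypothetical, and the sizes are collected only to
have something to stat. A name that can no longer be stat'ed is treated
as stale, and the directory is scanned again. Note that this only copes
with names that readdir() returned but that are gone; it does nothing
for an entry that readdir() never returned at all, which is the case I
regard as critical.

```python
import os


def collect_results(path, max_passes=3):
    """Return {name: size} for the entries of path that could be stat'ed.

    If a name returned by the scan has vanished in the meantime (removed
    or renamed), the directory is scanned again, up to max_passes times.
    """
    found = {}
    for _ in range(max_passes):
        stale = False
        with os.scandir(path) as it:
            for entry in it:
                try:
                    st = entry.stat(follow_symlinks=False)
                except FileNotFoundError:
                    stale = True     # old name: useless, needs another pass
                    continue
                found[entry.name] = st.st_size
        if not stale:
            break
    return found
```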

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

