
Re: reading an empty directory after reboot is very slow



On Tue, Apr 14, 2015, at 04:22, Petter Adsen wrote:
> But if you create new files in that directory after deleting them, I
> expect the inodes get reallocated?

They do, but a directory does not store inodes.
http://en.wikipedia.org/wiki/Inode
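
To illustrate the distinction, here is a minimal Python sketch (the helper name names_to_inodes is made up for this example). It lists what a directory holds from userspace's point of view: names, each pointing at an inode number. A hard link shows up as a second name for the same inode, which is why the directory cannot be "storing" the inode itself.

```python
import os
import tempfile


def names_to_inodes(path):
    """Return {entry name: inode number} for the directory at `path`."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        with open(original, "w") as f:
            f.write("data\n")
        # A hard link adds a second directory entry for the same inode.
        os.link(original, os.path.join(d, "hardlink"))
        for name, ino in sorted(names_to_inodes(d).items()):
            print(name, ino)
```

Both names should print with the same inode number, assuming the filesystem under the temporary directory supports hard links.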

For ext4:
https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout

And yes, directory entries are reused, just like inodes.  But if I read that page right, we do pay a price for somewhat limited backwards compatibility, and I bet there are performance and scalability reasons why it reuses and rewrites sections of the htree on disk instead of always writing it out in its entirety (which would keep it always optimized on disk).
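
A rough way to observe this from userspace, as a sketch only: record the size the directory itself reports while it is empty, after filling it, and after deleting everything again. The function name dir_size_trace is hypothetical, and what st_size means for a directory depends on the filesystem. On ext4 I would expect the last value to stay at the "full" size rather than drop back, but tmpfs and others report something else entirely, so treat the numbers as indicative.

```python
import os
import tempfile


def dir_size_trace(n_files):
    """Return the directory's st_size when empty, full, and emptied again."""
    with tempfile.TemporaryDirectory() as d:
        sizes = [os.stat(d).st_size]
        names = [os.path.join(d, f"file-{i:08d}") for i in range(n_files)]
        # Fill the directory with empty files so only the directory grows.
        for name in names:
            open(name, "w").close()
        sizes.append(os.stat(d).st_size)
        # Remove the files but keep the directory itself in place.
        for name in names:
            os.unlink(name)
        sizes.append(os.stat(d).st_size)
        return tuple(sizes)


if __name__ == "__main__":
    empty, full, emptied = dir_size_trace(20000)
    print("empty:", empty, "full:", full, "emptied:", emptied)
```

The temporary directory is created wherever tempfile decides (usually /tmp), so point TMPDIR at an ext4 mount if that is the filesystem you want to look at.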

> Is this specific to Linux/ext4?

Not really.  That said, a specific filesystem could optimize/compress its on-media directory structures when it detects that a directory is in dire need of a cleanup... at the cost of runtime peaks of low performance (and of high memory usage, if directories can host a very large number of objects), which is something to be avoided.

Also, some filesystems always read/write directories as a whole, and thus they're likely to keep them compressed/optimized at all times.  But this comes with some sort of (not that large) arbitrary limit on the number of entries in a directory.

Leaving directory optimization to an out-of-band process makes the performance hit predictable and schedulable by the system administrator.  But it would be really nice if we could do it with the ext4 filesystem online (as opposed to having to unmount it first).

Anyway, just like file fragmentation, it really takes a pathological workload to get directories as messed-up as the one in this thread. It won't happen in the general use case.

Also, some of the Unix best practices do address this kind of filesystem issue.  One such best practice is that you don't remove just the files in ephemeral directories.  Either you use an ephemeral filesystem (tmpfs) in the first place, which is optimized for reuse in Linux, Solaris and the BSDs, or you recursively remove the directory itself and recreate it.  Do the recreation safely, though: there are race condition concerns if the parent is world-writeable.
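
A minimal sketch of the remove-and-recreate approach, assuming Python and a helper name (recreate_private_dir) invented for this example.  It leans on two behaviours: shutil.rmtree refuses to run on a symbolic link, and os.mkdir fails if anything already exists at the path instead of reusing it.  So if someone else wins the race and plants something there between the two steps, you get an error rather than silently adopting their directory.  It is not a complete defence for a world-writeable parent, just the basic shape.

```python
import os
import shutil
import tempfile


def recreate_private_dir(path, mode=0o700):
    """Remove `path` recursively, then recreate it as a fresh empty directory."""
    try:
        # Raises OSError if `path` is a symlink rather than a real directory.
        shutil.rmtree(path)
    except FileNotFoundError:
        pass  # nothing to remove, go straight to creating it
    # mkdir does not follow symlinks and raises FileExistsError if something
    # was created at `path` in the meantime.
    os.mkdir(path, mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        scratch = os.path.join(parent, "scratch")
        os.mkdir(scratch)
        open(os.path.join(scratch, "junk"), "w").close()
        recreate_private_dir(scratch)
        print(os.listdir(scratch))
```

The point of recreating instead of emptying is that the new directory starts out with a fresh, minimal on-disk structure.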

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique de Moraes Holschuh <hmh@debian.org>

