
Re: reading an empty directory after reboot is very slow



Cam Hutchison wrote:
> I don't think dir_index has anything to do with it. An index speeds up
> lookups. You are not doing lookups; you are traversing the entire data
> structure.

I think you might be right about Vincent's problem.  It seems there is
a large amount of uncached data that needs to be read from disk.  It
takes a long time to read a lot of data from a large number of files.

> A B-tree data structure can take longer to traverse than a
> contiguous array data structure due to prefetching generally being
> beneficial to arrays, but less so to pointer-based structures.
> 
> It's slow because every block of the directory needs to be read to get
> the contents, even if every block contains empty entries. You don't know
> that until you've read it.

That depends on exactly what you are doing, how you are doing it, and
where you draw the boundary of the problem.

For the usual case of opening a single file the answer is no.  It
does not need to read the entire thing.  That is the key point of
B-trees and other external index data structures: when looking up a
name, only a much smaller set of disk blocks needs to be read.

With a B-tree the root node is stored first and is read as the first
disk block.  That first block contains as many index entries as will
fit, say at least a thousand.  Those entries are sorted and form a
multi-level index of other nodes.  If the directory is small, a
single-level index may cover all of its entries.  If the directory is
large, those entries point to further directory nodes in a multi-level
index.  Only one page was loaded for the root node, and root nodes
usually stay cached in RAM, making access to them very fast.  The
information in the root node says which directory node to seek to
next for the filename being opened.

Assuming 1,000 entries per page, that additional page read brings in
another 1,000 entries.  Even a directory huge by today's standards may
require only 2-3 page reads to open any file in it.  Opening a file
does not require reading the entire directory.
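To put rough numbers on that, here is a small shell sketch.  The
fanout of 1,000 entries per block is the same illustrative assumption
as above, not a measured figure for any particular filesystem:

```shell
# Illustrative only: each extra index level multiplies the number of
# names one lookup can reach by the per-block fanout.
fanout=1000
one=$fanout
two=$((fanout * fanout))
three=$((fanout * fanout * fanout))
echo "1 level:  $one names"
echo "2 levels: $two names"
echo "3 levels: $three names"
```

Two page reads already cover a million names, and three cover a
billion, which is why 2-3 reads suffice for any realistic directory.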

What is the order of lookup in a binary tree?  Log base 2.  In a
B-tree the base of the logarithm is the number of entries per page, so
with 1,000 entries per page the page lookup cost is log base 1,000.
That makes for a very shallow tree structure, and it is the magic that
lets a very few page reads open any individual file, as compared with
a linear array search.
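The comparison can be made concrete with exact integer arithmetic in
the shell.  The fanout of 1,000 is again only illustrative; the real
on-disk fanout depends on block size and name lengths:

```shell
# Illustrative only: smallest number of levels d such that
# FANOUT^d >= n, i.e. the tree depth needed to index n names.
n=1000000
levels() {
  local f=$1 cap=1 d=0
  while [ "$cap" -lt "$n" ]; do
    cap=$((cap * f))
    d=$((d + 1))
  done
  echo "$d"
}
binary=$(levels 2)      # binary tree: log base 2
btree=$(levels 1000)    # B-tree with fanout 1000: log base 1000
echo "binary tree levels for $n names: $binary"
echo "B-tree levels for $n names: $btree"
```

For a million names that is 20 levels for the binary tree against 2
for the B-tree.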

On the other hand, if you draw the boundary at the highest level and
look at the work needed to read every file in the directory, then of
course you will traverse all of the data in either case.  The work
outside of opening files is going to be the same.  The file-opening
part will still be much better in the B-tree case, because in the
linear directory array case each individual open takes much longer.
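One rough way to separate "read the directory entries" from "touch
every file" at the shell, assuming GNU ls (the path is only an
example; substitute a large directory of your own):

```shell
# "ls -f" only reads the directory entries (no sort, no stat),
# while "ls -l" additionally stats every entry, touching every inode.
time ls -f /var/tmp >/dev/null
time ls -l /var/tmp >/dev/null
```

On a large cold-cache directory the gap between the two times shows
how much of the cost is per-file work rather than the directory scan.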

> >drwx------ 2 vlefevre vlefevre 8409088 2015-03-24 14:04:33 Mail/oldarc/cur/
> 
> Your large directory is about 3.5 times the size of this one, so we
> would expect all things being equal that it would take 30s to read the
> larger directory based on the time of reading your maildir.
> 
> One thing that is likely not equal is fragmentation. It is quite
> possible that your 30MB directory is fragmented across the disk and
> involves many seeks to read it all. If you really want to know if this
> is the case, use debugfs(8) to have a look:

Or use filefrag(8) to check the file fragmentation:

  $ filefrag /var/tmp
  /var/tmp: 2 extents found

  $ filefrag -v ~/Mail/Lists/debian/user/cur
  ...lots of information...
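If you do want the debugfs(8) route the quoted message suggests, a
sketch would look something like this.  It requires root, and the
device name below is only a placeholder (find the real one with df):

```shell
# Replace /dev/sdXN with the block device holding the directory
# (see "df /path/to/dir"); the stat path is relative to that
# filesystem's root.  Requires root.
debugfs -R "stat /path/to/dir" /dev/sdXN
```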

Bob
