[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Debbugs: The Next Generation



On Sat, Aug 18, 2001 at 09:01:08AM -0400, tytso@mit.edu wrote:
> On Sat, Aug 18, 2001 at 01:52:56AM -0400, Matt Zimmerman wrote:
> > On Tue, Aug 14, 2001 at 04:24:33PM +1000, Anthony Towns wrote:
> > > That's not true: ext2 queries and directory lists are the problem here,
> > > in that they're O(N) and O(N^2) rather than O(1)/O(N). Using reiser or
> > > subdirectories would reduce those to O(lg(N)) and O(N lg(N)).
> > 
> > Why on earth are directory lists O(N^2)?  I don't know much about ext2
> > internals, but that should be an O(N) operation in any sane filesystem.  
> 
> Doing a directory list is indeed O(N) for ext2.  
> 
> However, iterating through the directory using readdir() and then
> stating or opening each file in readdir order is O(N**2).  There's a
> very obvious optimization which makes this O(N), which I thought ext2
> was using, but as it turns out, we weren't.
> 
> Thanks for bringing this to my attention; here's the very simple
> kernel patch which should speed up a du or find of very large
> directories significantly.

I'm guessing this will have no real effect.  In an effort to optimize,
glibc does not call readdir(); instead it calls getdents64() and gets
buffers.  You'll go just past the one you want and always have to loop
all the way around.

Perhaps if you keep the search start at the beginning of the entries
returned by the last getdents64() call?  Is that feasible?  It should
not penalize as badly.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer



Reply to: