
Re: the correct way to read a big directory? Mutt?



On 2015-04-27 09:44:09 +0200, Vincent Lefevre wrote:
> On 2015-04-24 21:41:41 -0500, David Wright wrote:
> > Quoting Vincent Lefevre (vincent@vinc17.net):
> > > This is now done in my script, but I had to use the ReadDir module
> > > from CPAN, since both readdir implementations in Perl (the standard
> > > readdir Perl function and POSIX::readdir) just return the file name.
> > > And this ReadDir module isn't available in a Debian package.
> > 
> > Python's library function listdir suffers the same way. If you want
> > the inode, you have to call stat to get it. (I haven't looked for
> > external modules like ReadDir as I don't have very large directories.)
> 
> I haven't tried, but I don't think that a stat call would solve the
> problem: most stat information isn't in the directory entries, thus
> it will have to be loaded from disk in some arbitrary order, just
> like if the first block of a file were read (which is precisely what
> I want to avoid at this moment).

I've just done a test in my Perl script, and using stat in directory
order is actually very fast on ext3, so it would solve the problem.
I think the reason is that inode information is grouped at some
specific place on the partition (as someone said in another message),
and in a compact way, I assume, so that there are few blocks to read.
That is not the case when reading one line of each file (= one block
of each file).

However, I think that using the ReadDir module (which provides the
inode without needing a stat call) is bound to be more efficient,
though this may not be noticeable in most cases.
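
As a rough illustration of the two approaches, here is a sketch in
Python rather than Perl (it is not the script discussed above, and the
function names are made up for the example). It assumes Python 3.6 or
later on a Unix-like system, where os.scandir can report the inode
from the directory entry itself, whereas os.listdir needs one lstat
call per name.

```python
import os

def inodes_via_stat(path):
    """Return (inode, name) pairs using one lstat call per entry,
    issued in the order the directory returns the names."""
    return [(os.lstat(os.path.join(path, name)).st_ino, name)
            for name in os.listdir(path)]

def inodes_via_scandir(path):
    """Return (inode, name) pairs taken from the directory entries,
    which on Unix does not require a stat call per entry."""
    with os.scandir(path) as entries:
        return [(entry.inode(), entry.name) for entry in entries]

def names_in_inode_order(path):
    """Return the entry names sorted by inode number, e.g. as the
    order in which to open the files afterwards."""
    return [name for _, name in sorted(inodes_via_scandir(path))]
```

Both helper functions should return the same pairs for a directory of
ordinary files; whether the difference in cost is measurable would
depend on the file system and on the size of the directory.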

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

