
Re: reading an empty directory after reboot is very slow

On 2015-04-23 10:10:54 -0500, David Wright wrote:
> Quoting Vincent Lefevre (vincent@vinc17.net):
> > On 2015-04-22 23:28:46 -0500, David Wright wrote:
> > > No, I wasn't expecting mutt to use mairix. But I thought you might be
> > > using it. Otherwise, why do you index them?
> > 
> > I use mairix when I need a "body" search first, otherwise such a
> > search would be awfully slow with Mutt. Then I can open the generated
> > folder with Mutt, and try to do more filtering to find what I want.
> I was under the impression that mairix could do both.

mairix does two things: index mail messages and search. For the latter
operation, it just creates a mail folder with the results. If I need
to refine a search, I can either re-run mairix (a new mail folder is
created) or use the search feature of the MUA (e.g. Mutt) after opening
the folder created by mairix.

> > > I also wondered what the problem would be with putting the thousands
> > > of emails in a general purpose database.
> > 
> > But Mutt can't use such a database.
> I was thinking of a scenario where you retrieve the matching emails
> and place them in a scratch directory, then run mutt on that folder
> so you can still read the results with an email interface.

This is exactly what mairix does. No need for a database. Well, mairix
has a database for its index, but it is just the index (it doesn't
contain the messages themselves). For what mairix does, it is better
to use the maildir format since mairix just creates symbolic links to
the matching messages, while if the mbox format were used, it would
have to copy the entire messages. With the maildir format, a mairix
search is practically immediate once the database is in the cache
(otherwise it takes a few seconds).
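
To illustrate the symbolic-link point, here is a minimal Python sketch
(not mairix's actual code; the function name link_results and its
arguments are hypothetical) that builds a maildir-style results folder
by linking to the matching messages instead of copying them:

```python
import os

def link_results(matches, results_dir):
    """Fill results_dir/cur with symlinks to the matching message files.

    `matches` is a list of paths to message files (assumed to come from
    some earlier search step, which is not shown here).
    """
    # A maildir folder has three subdirectories: cur, new and tmp.
    for sub in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(results_dir, sub), exist_ok=True)

    linked = []
    for i, path in enumerate(matches):
        # Prefix with a counter so that identical base names coming from
        # different source folders cannot collide.
        name = "%d.%s" % (i, os.path.basename(path))
        dest = os.path.join(results_dir, "cur", name)
        # Only a link is created; the message itself is never copied.
        os.symlink(os.path.abspath(path), dest)
        linked.append(dest)
    return linked
```

With mbox, the equivalent step would have to append a full copy of each
matching message to the results file.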

> Of course I have no idea whether you're trying to match a few emails
> or thousands at a time.

For some searches, it's more practical to do this entirely from Mutt,
because it allows interactivity without losing the context (with mairix,
I would have to restart Mutt for each new refined search).

Anyway, remember that the main slowness I have is when I retag messages.
I could improve my script to cache the message-id -> filename mapping,
which would make it much faster. But what I would like to understand is why
my current script is so much slower than Mutt (when the mailbox is not
in the cache), while both are reading about the same data. I'll have
to do some tests with slight changes in my script to see if this makes
a difference. Hmm... something I've just thought about is to look at
the cache size in both cases:
  1. Drop the caches. Open the mailbox with Mutt.
  2. Drop the caches. Use my script.
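
As for the message-id -> filename cache mentioned above, it could look
roughly like the following Python sketch. This is not the actual
script: the function names, the JSON cache file and the way the maildir
is scanned are all assumptions made for illustration.

```python
import json
import os
from email.parser import BytesHeaderParser

def build_index(maildir):
    """Return a dict mapping Message-ID -> file path for one maildir."""
    index = {}
    parser = BytesHeaderParser()  # parses headers, leaves the body unparsed
    for sub in ("cur", "new"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for entry in os.scandir(folder):
            if not entry.is_file():
                continue
            with open(entry.path, "rb") as f:
                msg = parser.parse(f)
            msg_id = msg["Message-ID"]
            if msg_id:
                index[str(msg_id).strip()] = entry.path
    return index

def load_or_build(maildir, cache_file):
    """Load the mapping from cache_file if present, else build and save it."""
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return json.load(f)
    index = build_index(maildir)
    with open(cache_file, "w") as f:
        json.dump(index, f)
    return index
```

Note that maildir file names change when message flags change, so a
real script would have to refresh or validate the cached entries rather
than trust them blindly.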

Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
