Re: the correct way to read a big directory? Mutt?



On sextidi 6 Floréal, year CCXXIII, Vincent Lefevre wrote:
> This is not that simple. I want my script to be very reliable.

Well, your script is in Perl, so you are implicitly treating CPU cost as
negligible. If you manage to optimize everything else (or make the
processing more complex) to the point that it becomes CPU-bound, then you
will have to consider reimplementing it in C.

Until then, I believe you are right to trust Perl's IO buffering.

> In particular, if there is a message without a Message-ID and
> with "\nMessage-ID" in the body, I want to detect it. This kind
> of thing really happens in practice (though this is rare), e.g.
> due to some buggy mail software that breaks the headers and put
> a part of them in the body. I also want to check the format of
> the headers and possible duplicate Message-ID. What my script
> really does is:

IMHO, if you really want to validate the format of the headers, I advise
reading the whole header into a string and working from that. Something like:

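  my $header = "";
  while (<$file>) {
    last if $_ eq "\n"; # or /^\r?\n\z/ if you do not trust line ends
    $header .= $_;
  }
  # Split only at newlines not followed by whitespace, so folded
  # (continuation) lines stay attached to their field.
  my @header = split /\n(?!\s)/, $header;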
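
To make that concrete, here is a minimal runnable sketch of the same idea. The
sample message and the in-memory file handle are my own assumptions, there
only so that it runs without a real mail file:

```perl
use strict;
use warnings;

# Hypothetical sample message; its third line is a folded continuation.
my $message = "Subject: test\n"
            . "Message-ID: <foo\@bar> (added\n"
            . "\tby someone)\n"
            . "\n"
            . "Body text\n";

# In-memory file handle, so the sketch does not need a real mail file.
open my $file, '<', \$message or die "cannot open in-memory handle: $!";

my $header = "";
while (<$file>) {
    last if /^\r?\n\z/;    # the first blank line ends the header
    $header .= $_;
}
close $file;

# Split at each newline that is not followed by whitespace, so that
# continuation lines stay attached to the field they belong to.
my @header = split /\n(?!\s)/, $header;

print scalar(@header), " fields\n";
print "[$_]\n" for @header;
```

Each element of @header is then one complete field, folds included, which is
what you want to feed to a format check or a duplicate Message-ID check.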

>     while (<FILE>)

Out of curiosity, do you have a particular reason not to use a real variable
for your file handles?
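
By "a real variable" I mean a lexical handle, as in the sketch below. The
first_line helper and the temporary file are hypothetical, only there to make
it self-contained:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the first line of a file (undef if empty).
sub first_line {
    my ($path) = @_;
    # Lexical handle, three-argument open, explicit error check.
    open my $fh, '<', $path or die "cannot open $path: $!";
    my $line = <$fh>;
    return $line;    # $fh goes out of scope here and is closed automatically
}

# Temporary file standing in for a real mail file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "Message-ID: <foo\@bar>\n";
close $tmp_fh or die "cannot close $tmp_path: $!";

my $line = first_line($tmp_path);
print $line;
```

The handle is scoped to the block that opened it, so it cannot clash with a
FILE used elsewhere, and it can be passed to a function like any other value.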

>         /^Message-ID:\s+(<\S+>)( \(added by .*\))?$/i or next;

I have never seen this "added by" in my mails, but assuming you need it,
note that the header may be folded across lines, like this:
"Message-ID: <foo@bar> (added\n\tby someone)\n"

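A small sketch of the problem, using your regex on the folded example above.
The normalization step is my assumption: it replaces each fold (line break
plus the spaces or tabs that follow) with a single space, which is a
simplification rather than strict unfolding:

```perl
use strict;
use warnings;

# The regex from the quoted script, unchanged.
my $re = qr/^Message-ID:\s+(<\S+>)( \(added by .*\))?$/i;

# The folded header from the example, without its final newline.
my $folded = "Message-ID: <foo\@bar> (added\n\tby someone)";

# Assumed normalization: turn each fold into one space.
(my $unfolded = $folded) =~ s/\r?\n[ \t]+/ /g;

my $matches_folded   = $folded   =~ $re ? 1 : 0;
my $matches_unfolded = $unfolded =~ $re ? 1 : 0;
my ($id) = $unfolded =~ $re;    # first capture group: the Message-ID itself

print "folded:   $matches_folded\n";
print "unfolded: $matches_unfolded\n";
print "id:       $id\n" if defined $id;
```

Read line by line, the folded form does not match at all, so the message
would be counted as having no Message-ID.
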
> With zsh

Yay.

> One can choose to sort the result, but zsh doesn't support sorting
> by inode number. I've sent a feature request.

I had the same reflex when reading your mail: checking zshexpn(1) to see
whether sorting by inode was possible.
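
Until then, sorting by inode can be done on the Perl side. This is only a
sketch: entries_by_inode is a hypothetical helper, the temporary directory is
there to make it runnable, and it calls lstat on every entry, which may itself
be a noticeable cost on a big directory:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the entries of $dir ordered by inode number.
sub entries_by_inode {
    my ($dir) = @_;
    opendir my $dh, $dir or die "cannot open $dir: $!";
    my @entries;
    while (defined(my $name = readdir $dh)) {
        next if $name eq '.' or $name eq '..';
        my @st = lstat "$dir/$name" or next;    # skip entries that vanished
        push @entries, [ $st[1], $name ];       # $st[1] is the inode number
    }
    closedir $dh;
    return map { $_->[1] } sort { $a->[0] <=> $b->[0] } @entries;
}

# Temporary directory with a few empty files, standing in for a maildir.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(c a b)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $dir/$name: $!";
    close $fh or die "cannot close $dir/$name: $!";
}

print "$_\n" for entries_by_inode($dir);
```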

Regards,

-- 
  Nicolas George
