Re: the correct way to read a big directory? Mutt?

David Wright wrote on 04/27/2015 17:29:
> Quoting Jörg-Volker Peetz (jvpeetz@web.de):
>> Correction of the "jot" command arguments:
>> Jörg-Volker Peetz wrote on 04/25/2015 16:15:
>>> <snip>
>>>> for i in `seq 5000`
>>>> do
>>>>   date=$((10000000+i))
>>>>   cat <<EOF > "$dir/cur/$date.1.host:2,S"
>>> <snip>
>>> the situation becomes much worse if you generate the filenames from a random
>>> sequence. Try to replace the command "seq" by "jot" from the package athena-jot like
>>> for i in $(jot 5000 1 5000)
>> for i in $(jot -r 5000 1 5000)
>>> That makes the numerical order of the files have a random i-node number sequence.
> Apart from the fact that it doesn't (it only randomises the
> filenames), it won't even generate 5000 files because of
> duplication.
Have you tried benchmarking both cases? I definitely see very different timings
(on ext4) for the grep benchmark.
Maybe my wording was unclear. I haven't looked at how "grep -r" traverses the
directory contents. I assumed it would search the files in some lexicographical
order of their names.
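
A minimal sketch for checking how name order relates to i-node order in the
two cases. It rests on some assumptions: it uses "shuf" from GNU coreutils
instead of "jot -r" (a permutation, so no names are duplicated), it creates
empty files rather than mail messages, and the directory names "seq" and "rnd"
and the count of 200 are only placeholders. The result depends on the
filesystem the temporary directory lives on.

```shell
#!/bin/sh
# Sketch: compare name order with i-node order for files created in
# ascending order vs. shuffled order. All paths here are hypothetical.
set -eu
tmp=$(mktemp -d)
mkdir "$tmp/seq" "$tmp/rnd"

# Case 1: files created in ascending name order.
for i in $(seq 200); do
  : > "$tmp/seq/$((10000000+i)).1.host:2,S"
done

# Case 2: the same names, created in shuffled order
# (shuf permutes its input, so every name occurs exactly once).
for i in $(seq 200 | shuf); do
  : > "$tmp/rnd/$((10000000+i)).1.host:2,S"
done

# Walk the directory in name order (as ls sorts it) and count how often
# a file has a smaller i-node number than the file listed before it.
inversions() {
  ls -i "$1" | awk 'NR > 1 && $1 + 0 < prev { n++ }
                    { prev = $1 + 0 }
                    END { print n + 0 }'
}

echo "sequential names: $(inversions "$tmp/seq") inversions"
echo "shuffled names:   $(inversions "$tmp/rnd") inversions"

# Record the file counts before cleaning up.
n_seq=$(ls "$tmp/seq" | wc -l | tr -d ' ')
n_rnd=$(ls "$tmp/rnd" | wc -l | tr -d ' ')
rm -rf "$tmp"
```

A low inversion count means that reading the files in name order also reads
them roughly in i-node order; a high count means the two orders disagree.
Whether that explains the grep timings still depends on the order in which
"grep -r" actually visits the files, which this sketch does not test.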