
Re: RFC: improve dpkg-scanpackages performance with cached md5sums



Michael Burian <michael.burian@sbg.at> writes:

> Problem:
>
> Creating Packages.gz with dpkg-scanpackages takes lots of time for
> large repositories.
>
> The main reason it is so slow is that the checksums of all
> packages, even those that did not change since the previous run, are
> recalculated every time.
>
> Solution:
>
> I've extended dpkg-scanpackages to accept a "--md5cache" | "-5"
> command line option that enables caching and reusing of md5sums.
>
> When the option is not used, one gets stock dpkg-scanpackages behavior,
> where all checksums are recalculated every time. Otherwise, md5sums of
> scanned packages are cached on the first run and reused on successive runs.
>
> With cached md5sums, the time to create Packages.gz for my private
> repository (~600MB) dropped from over 1 minute to about 7 seconds on a
> PII/400MHz.
>
> Would it make sense to include such a feature in the official
> dpkg-scanpackages?

Solution: use apt-ftparchive.

apt-ftparchive does exactly what you need and works for multiple
distributions and suites. It is what the official repositories use.
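
For illustration only, here is a rough Python sketch of the checksum
caching idea described in the quoted mail. It is neither the proposed
dpkg-scanpackages patch nor how apt-ftparchive stores its cache. All
function names, the JSON cache file, and the choice to treat a file as
unchanged when its size and mtime match are assumptions.

```python
import hashlib
import json
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large packages are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_md5(path, cache):
    """Return (md5, was_cached).

    Assumption: a file whose size and mtime are unchanged has unchanged content,
    so its previously computed checksum can be reused.
    """
    st = os.stat(path)
    entry = cache.get(path)
    if entry and entry["size"] == st.st_size and entry["mtime"] == st.st_mtime_ns:
        return entry["md5"], True
    md5 = md5_of_file(path)
    cache[path] = {"size": st.st_size, "mtime": st.st_mtime_ns, "md5": md5}
    return md5, False


def load_cache(cache_file):
    """Load the cache; a missing or unreadable file simply means an empty cache."""
    try:
        with open(cache_file) as fh:
            return json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_cache(cache_file, cache):
    """Persist the cache so the next run can skip unchanged packages."""
    with open(cache_file, "w") as fh:
        json.dump(cache, fh)
```

On the first run every package is hashed and recorded; later runs only
rehash files whose size or mtime differ from the cached entry.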

MfG
        Goswin


