Bug#832326: very slow processing of Contents files
Hello,
* Julian Andres Klode [Sun, Jul 24 2016, 01:24:12PM]:
> Control: tag -1 moreinfo
>
> On Sun, Jul 24, 2016 at 12:43:23PM +0200, Eduard Bloch wrote:
> > Package: apt
> > Version: 1.3~pre2
> > Severity: minor
> >
> > Hello,
> >
> > Since Contents file handling was added recently, processing these
> > files seems to be very slow. It takes about two minutes (guessed, not
> > measured), whereas all the other work finishes within the first ~10 seconds.
> >
> > <first analysis>
> > I think the basic problem here is the massive size of the data in the
> > Index files - they are already big, and the compression ratio is very
> > high. Uncompressed versions of both amd64 and i386 add up to about one
> > gigabyte! OTOH, when I zcat them both, it takes only about 5 seconds!
> > So I guess the problem is the amount of data that needs to be shuffled
> > around while patching.
> > I measured how ed performs: it takes about 11 seconds for
> > Contents-amd64.gz (with about 166k patch lines in a combined patch).
> > The patch was, of course, assembled beforehand from the series of
> > related pdiff files.
>
> Not sure what is happening on your side, but APT should normally store
> Contents files using LZ4 compression, not gzip, unless you force it to
> do otherwise.
Hm? This is the first time I have come across LZ4, and I have no idea
what you mean.
> We specifically switched to LZ4 to solve this issue.
>
> Does your system not use .lz4 compressed Contents files?
>
> > APT::Compressor::lz4::Binary "false";
>
> My system says:
>
> APT::Compressor::lz4::Binary "lz4";
Shall I change it and report back in a couple of days?
> but maybe that is just because I have an lz4 binary; it should not make
> a difference.
At the moment I have no idea how this is determined, enabled, or
disabled. The library does seem to be installed, though:

ii  liblz4-1:amd64  0.0~r131-2  amd64  Fast LZ compression algorithm library - runtime
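For reference, APT's compressor selection can be inspected from its
configuration dump; the key names below are the standard ones from
apt.conf, and the list directory is the default Debian location:

```shell
# Show APT's configured lz4 compressor settings (Binary, Cost, extension).
apt-config dump | grep -i 'APT::Compressor::lz4'

# See which compressed variants of Contents files APT actually keeps on disk.
ls /var/lib/apt/lists/ | grep -i contents
```

If the second command shows `.lz4` files, APT is already using LZ4 for
local storage regardless of what the mirror serves.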
But anyhow, I am wondering... the obvious guess is that the problem is
complexity (CPU time or memory) and not I/O; how is extra compression
supposed to fix it? IMHO it would rather make things worse.
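(For context: LZ4's selling point is decompression speed close to a raw
read, so it keeps large files small on disk without gzip's per-read CPU
cost. A rough, self-contained illustration of that gzip cost, using a
synthetic stand-in for a Contents file since lz4 itself may not be
installed:)

```shell
# Build a ~200k-line stand-in for a Contents file, then compare the cost
# of reading it through gzip versus reading it raw.
yes 'usr/bin/example  admin/example-package' | head -n 200000 > contents-demo
gzip -kf contents-demo                      # produces contents-demo.gz, keeps the original

time zcat contents-demo.gz > /dev/null      # pays gunzip CPU on every read
time cat  contents-demo    > /dev/null      # raw read, for comparison
```

An lz4 read would land much closer to the `cat` time than to the `zcat`
time, which is why switching the on-disk format helps even though it is
"more compression" in name.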
Regards,
Eduard.
--
A supercomputer is a machine that runs through an infinite loop in just
2 seconds.