
Bug#832326: very slow processing of Contents files



Hello,
* Julian Andres Klode [Sun, Jul 24 2016, 01:24:12PM]:
> Control: tag -1 moreinfo
> 
> On Sun, Jul 24, 2016 at 12:43:23PM +0200, Eduard Bloch wrote:
> > Package: apt
> > Version: 1.3~pre2
> > Severity: minor
> > 
> > Hello,
> > 
> > since Contents file handling was added recently, their processing
> > seems to be very slow. It takes about two minutes (guessed, not
> > measured), whereas all the other work is done within the first ~10
> > seconds.
> > 
> > <first analysis>
> > I think the basic problem here is the massive size of the data in the
> > Index files - they are already big, and the compression ratio is very
> > high. Uncompressed versions of both amd64 and i386 add up to about one
> > gigabyte! OTOH, when I zcat them both, it takes just about 5 seconds!
> > So I guess the problem is the amount of data that needs to be shuffled
> > around while applying the patches.
> > I measured a bit how ed performs: it takes about 11 seconds for
> > Contents-amd64.gz (with about 166k patch lines in a combined patch).
> > The patch was built beforehand from the series of related pdiff files,
> > of course.
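The measurement described above can be sketched roughly as follows. This is a minimal stand-in with made-up file names and a tiny synthetic file; the real test used Contents-amd64 plus a combined ed-style patch assembled from the pdiff series:

```shell
# Tiny stand-in for a Contents file (real one is hundreds of MB):
printf 'usr/bin/a pkg-a\nusr/bin/b pkg-b\nusr/bin/c pkg-c\n' > Contents-demo

# pdiff files are ed scripts; a one-hunk example replacing line 2:
printf '2c\nusr/bin/b pkg-b-renamed\n.\n' > combined.ed

# Apply the ed script with patch(1) in ed mode and time it:
time patch -e Contents-demo < combined.ed
cat Contents-demo
```

With the real ~400 MB Contents-amd64 and ~166k patch lines, the same `time patch -e …` invocation is what produced the ~11 second figure quoted above.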
> 
> Not sure what is happening at your side, but APT should normally store
> Contents files using LZ4 compression, not gzip; unless you force it to
> do otherwise.

Hm? This is the first time I have come across LZ4, so I have no idea
what you mean.

> We specifically switched to LZ4 to solve this issue.
> 
> Does your system not use .lz4 compressed Contents files?
> 
> > APT::Compressor::lz4::Binary "false";
> 
> My system says:
> 
> APT::Compressor::lz4::Binary "lz4";

Shall I change it and report back in a couple of days?
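For the record, "changing it" would mean dropping a snippet like the one quoted above into apt's configuration. The file name below is hypothetical, and whether setting the Binary to "false" actually disables the external lz4 helper is an assumption taken from the hint in the quoted reply:

```shell
# Hypothetical apt.conf.d snippet; file name is made up, and the exact
# effect of "false" here is an assumption, not verified behaviour.
cat > 99-no-lz4.conf <<'EOF'
APT::Compressor::lz4::Binary "false";
EOF
cat 99-no-lz4.conf
```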

> but maybe that's just because I have an lz4 binary; it should not make
> a difference.

At the moment I have no idea how this is calculated, or enabled or
disabled. But lz4 does seem to be installed:

ii  liblz4-1:amd64  0.0~r131-2  amd64  Fast LZ compression algorithm library - runtime

But anyhow, I am wondering... the obvious guess is that the problem is
complexity (CPU time or memory), not I/O; how is extra compression
supposed to fix that? IMHO it would rather make things worse.
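One way to see why the choice of codec still matters even when the workload looks CPU-bound: gzip decompression burns real CPU per byte, whereas LZ4 is designed to decompress at close to memory speed. A rough stand-in comparison using only commonly available tools (the lz4 CLI itself is not assumed to be installed, and the file sizes here are synthetic):

```shell
# Synthetic Contents-like data (~20 MB of repetitive lines):
yes 'usr/bin/example  admin/example-package' | head -n 500000 > contents.txt
gzip -c contents.txt > contents.gz

# Decompressing gzip costs CPU for every byte...
time gzip -dc contents.gz > /dev/null
# ...while a plain read of the uncompressed file is nearly free.
# LZ4 decompression sits much closer to this end of the spectrum.
time cat contents.txt > /dev/null
```

So the point of the LZ4 switch would not be "more compression" but "cheaper decompression" on every read of the stored Contents data.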

Regards,
Eduard.

-- 
A supercomputer is a machine that runs through an infinite loop in just
2 seconds.

