
Re: dpkg speed (was: Re: PROPOSAL: Extrafiles)



On 10 Jun 1998, Jukka Neppius wrote:

>   I don't even have source for current dpkg:) I only made simple (~300
> lines) test program to read 'available'.  It finds every tag (Package,
> Installed-Size, ...)  for every package and qsorts packages.  After
> sort most operations should be fast.  It should have enough error
> checking.  It probably could be used in dpkg or in dselect with few
> small modifications (data structure).  I could mail it to anyone who
> wants to test it.

Hmm, well, sorting and finding tags is not really comparable to the
parsing that at least apt requires. Take a look at
  http://www.debian.org/~jgg/deity/cache.html
That is the in-memory (and cached on disk) structure that the packages
files are broken into.
 
The cache generator in APT makes a few faulty assumptions, so it isn't
terribly fast. I have a design for one that should be a few orders of
magnitude faster, but I don't think I'll implement it soon.

>   Of course reading many small *.list files is very slow.  New format
> is only possibility to speed it.  File size becomes large (=slow), so
> some kind of compressed/binary format is needed.  (I failed to find
> Jason Gunthorpe's proposal for new format.)

Strangely, so did I. I wonder where it went. Since I don't expire my
sent-mail, I've pulled it out and put it at
  http://www.debian.org/~jgg/mail.txt

Think of it as some thoughts on the subject and why we should not use a
generic dbm. You should read and understand the apt cache document before
reading this - it is the same idea.

With an mmap'd data structure like the one I explore, startup time
across multiple dpkg runs would be very close to instant on a wide
class of machines.
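
To make the mmap idea concrete, here is a minimal sketch in Python. It
assumes a made-up fixed-size record layout (a count header followed by
name/version records sorted by name); this is not APT's actual cache
format, and write_cache, lookup and open_and_lookup are hypothetical
names used only for illustration.

```python
import mmap
import struct

# Hypothetical on-disk layout (an assumption, not APT's real format):
#   header : little-endian uint32 holding the number of records
#   records: fixed size, sorted by name: 32-byte name + 16-byte version,
#            both NUL-padded
HEADER = struct.Struct("<I")
RECORD = struct.Struct("<32s16s")


def write_cache(path, packages):
    """Write a {name: version} mapping as sorted fixed-size records."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(packages)))
        # Sort by encoded bytes so the order matches the lookup comparison.
        for name in sorted(packages, key=str.encode):
            f.write(RECORD.pack(name.encode(), packages[name].encode()))


def lookup(buf, name):
    """Binary-search the mapped records; only the probed records are read."""
    (count,) = HEADER.unpack_from(buf, 0)
    key = name.encode()
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        offset = HEADER.size + mid * RECORD.size
        raw_name, raw_version = RECORD.unpack_from(buf, offset)
        current = raw_name.rstrip(b"\0")
        if current == key:
            return raw_version.rstrip(b"\0").decode()
        if current < key:
            lo = mid + 1
        else:
            hi = mid
    return None


def open_and_lookup(path, name):
    """Map the file read-only and look up one package without parsing it all."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            return lookup(buf, name)
```

Because every record has the same size, finding record N is plain
offset arithmetic, and nothing has to be parsed at startup. The kernel
pages in only the parts of the file that are actually touched, and
repeated runs are typically served from the page cache.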

>  Also 'Depends' lines in available could cause a problem.  When
> removing a package, it is necessary to scan all installed packages to
> find dependencies (at least i believe so:).  This is rather slow.
> Simplifying 'Depends' line format would help ('|' is unnecessary
> because 'Provides' could be used instead.)

This is very fast with APT's format; it actually does a lot of such
'reverse' lookups.
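
As a rough illustration of why a reverse lookup can be cheap, the
sketch below builds a reverse-dependency index once, so that "who
depends on X?" becomes a single dictionary lookup instead of a scan of
every installed package. It is written in Python and only models the
idea; it is not APT's data structure, and parse_depends and
build_reverse_index are hypothetical names. The Depends parsing is
simplified to ',' groups, '|' alternatives and parenthesised version
restrictions.

```python
import re
from collections import defaultdict


def parse_depends(field):
    """Split a Depends value into groups of alternative package names."""
    groups = []
    for clause in field.split(","):
        names = []
        for alternative in clause.split("|"):
            # Drop any version restriction such as "(>= 2.0.7)".
            name = re.sub(r"\(.*?\)", "", alternative).strip()
            if name:
                names.append(name)
        if names:
            groups.append(names)
    return groups


def build_reverse_index(depends_by_package):
    """Map each dependency target to the set of packages that name it."""
    reverse = defaultdict(set)
    for package, field in depends_by_package.items():
        for group in parse_depends(field):
            for target in group:
                reverse[target].add(package)
    return reverse
```

With such an index, checking whether a package can be removed only
needs to visit the packages listed under its name. A package listed
there through a '|' alternative may still be satisfiable by another
alternative, so each hit has to be re-checked rather than treated as a
hard conflict.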

Jason


--
To UNSUBSCRIBE, email to debian-dpkg-request@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

