
Re: dpkg --smallmem has a larger footprint than --largemem



Adam Heath writes ("Re: dpkg --smallmem has a larger footprint than --largemem"):
> Thanks for the explanation.  There is just one flaw in this.  dpkg
> has to read in all .list files, and build a structure in memory (be
> it hash or a tree).  It has to do this any time it has to install (or
> otherwise modify) a package.  This means it touches every node.
> With multiple calls to dpkg, it will continue to reread and reparse
> this data set, so it will not be efficient.

This is why it's much better to run dpkg once to have it do multiple
things :-).

> Building the hash-based version is faster than building the tree-based
> version.

Even on a machine with not enough memory to contain the whole
hash-based version in core at once?

>  The total memory footprint is the same.  In the tree version, there
> are lots of mallocs (even if it is blocked malloc, it is still quite a
> bit), and as such, memory can become fragmented.  I maintain that there
> isn't much data locality for --smallmem, but there is more for
> --largemem, as we allocate memory in huge blocks, which for later
> .list files, will have little impact on the working memory set.

I think you should do some measurements before you claim this.  When I
first designed and implemented that stuff I did various tests, and
--smallmem was considerably faster when memory was very short.
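The "allocate memory in huge blocks" scheme argued for above is essentially an arena allocator: carve many small nodes out of one large chunk instead of issuing one malloc per node.  A minimal sketch, with hypothetical names and a made-up chunk size (dpkg's real allocator differs):

```c
/* Sketch of block (arena) allocation: one big malloc serves many
 * small node allocations, improving locality and avoiding per-node
 * fragmentation.  The arena is never freed -- it lives as long as
 * the process, like dpkg's in-core database. */
#include <stdlib.h>
#include <stddef.h>

#define CHUNK_SIZE (256 * 1024)   /* one huge block at a time (illustrative) */

struct chunk {
    struct chunk *prev;           /* earlier, already-full chunks */
    size_t used;
    char mem[CHUNK_SIZE];
};

static struct chunk *current;

static void *block_alloc(size_t n)
{
    /* round up to pointer alignment */
    n = (n + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
    if (!current || current->used + n > CHUNK_SIZE) {
        struct chunk *c = malloc(sizeof *c);
        c->prev = current;
        c->used = 0;
        current = c;
    }
    void *p = current->mem + current->used;
    current->used += n;
    return p;
}
```

Consecutive allocations land next to each other in the same chunk, which is where the data-locality claim for --largemem comes from; the trade-off under discussion is how this behaves once the working set no longer fits in physical memory.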

> The main complaint over the years is that for a simple install of a
> single package, dpkg has to read in the entire database, when it
> only needs a few files.  The proper way to fix this is to save a
> preparsed version, but that is not immediately straightforward.

There are lots of reliability issues with that.  A better solution
(which I've thought for some time would be a good idea) would be to
have the file lists for several packages together in one .list file
(or equivalent); last time I looked, most of the time spent during
startup was not in accessing and building the in-core database, but
simply in opening and reading all those thousands of small files.

Ian.


