
Re: dpkg --smallmem has a larger footprint than --largemem



On Sun, 14 Jan 2001, Ian Jackson wrote:

> Adam Heath writes ("dpkg --smallmem has a larger footprint than --largemem"):
> > So, in summary, dpkg will detect that it is on a low memory system,
> > but end up using more memory (this amount is equal to du *.list).
> 
> I haven't checked whether your change is correct, but it might well
> be.  But, I want to point out what the purpose of the --smallmem
> option is.  It is not designed to use less VM in total, but rather to
> reduce the working set.  The memory management in dpkg depends on the
> fact that we have a good VM system and enough swap, and on the fact
> that dpkg isn't long-running.  The idea is that the unused memory ends
> up sitting in swap but not in RAM.
> 
> However, the hash-based implementation you get with --largemem,
> although it is a more efficient data structure, has a working set
> nearly as big as the whole list of files.  The --smallmem only needs
> to keep in core those nodes which refer to the directories being
> processed.

Thanks for the explanation.  There is just one flaw in this.  dpkg has to read
in all the .list files and build a structure in memory (be it a hash or a
tree).  It has to do this any time it installs (or otherwise modifies) a
package, which means it touches every node.  With multiple calls to dpkg, it
rereads and reparses this data set every time, so it will not be efficient.
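
To make the "touches every node" point concrete, here is a minimal sketch in
Python (not dpkg's actual C code).  It assumes one NAME.list file per package
with one path per line; load_file_lists() and demo() are hypothetical names
made up for illustration.  Note that every line of every file is read and
inserted, no matter which single package the caller cares about.

```python
import os
import tempfile
from collections import defaultdict


def load_file_lists(info_dir):
    """Parse every *.list file in info_dir into one path -> packages map."""
    owners = defaultdict(set)
    for name in sorted(os.listdir(info_dir)):
        if not name.endswith(".list"):
            continue
        package = name[: -len(".list")]
        with open(os.path.join(info_dir, name), encoding="utf-8") as fh:
            for line in fh:
                path = line.rstrip("\n")
                if path:
                    # Every entry of every package is touched here.
                    owners[path].add(package)
    return owners


def demo():
    # Hypothetical two-package database written to a scratch directory.
    with tempfile.TemporaryDirectory() as info_dir:
        samples = {
            "foo": ["/usr", "/usr/bin", "/usr/bin/foo"],
            "bar": ["/usr", "/usr/bin", "/usr/bin/bar"],
        }
        for package, paths in samples.items():
            list_path = os.path.join(info_dir, package + ".list")
            with open(list_path, "w", encoding="utf-8") as fh:
                fh.write("\n".join(paths) + "\n")
        return load_file_lists(info_dir)
```

Each separate invocation would repeat this whole loop from scratch, which is
the cost I am describing.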

Building the hash-based version is faster than building the tree-based
version.  The total memory footprint is the same.  The tree version does
lots of mallocs (even with blocked malloc, it is still quite a few), and as
such, its memory can become fragmented.  I maintain that there isn't much
data locality for --smallmem, but there is more for --largemem, as we
allocate memory in huge blocks, which, for later .list files, will have
little impact on the working set.
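
As a rough illustration of the two layouts under discussion, here is a small
Python sketch.  It is not dpkg's real data structure: Node, build_tree(),
build_hash() and nodes_touched() are hypothetical, and Python objects say
nothing about malloc behaviour in C.  It only shows that a tree keyed by path
component allocates one node per component while building, and that a single
lookup afterwards visits only the nodes along one path.

```python
class Node:
    """One path component in the tree-shaped store (one allocation each)."""

    __slots__ = ("children", "owners")

    def __init__(self):
        self.children = {}
        self.owners = set()


def build_tree(file_lists):
    """Return (root, number of nodes allocated while building)."""
    root = Node()
    allocations = 1
    for package, paths in file_lists.items():
        for path in paths:
            node = root
            for part in path.strip("/").split("/"):
                if part not in node.children:
                    node.children[part] = Node()
                    allocations += 1
                node = node.children[part]
            node.owners.add(package)
    return root, allocations


def build_hash(file_lists):
    """Flat table keyed by the full path, one entry per distinct file."""
    table = {}
    for package, paths in file_lists.items():
        for path in paths:
            table.setdefault(path, set()).add(package)
    return table


def nodes_touched(root, path):
    """Count the tree nodes visited to resolve a single path."""
    node, touched = root, 1
    for part in path.strip("/").split("/"):
        node = node.children.get(part)
        if node is None:
            break
        touched += 1
    return touched


# Hypothetical sample data, not taken from a real system.
FILE_LISTS = {
    "foo": ["/usr/bin/foo", "/usr/share/doc/foo/README"],
    "bar": ["/usr/bin/bar"],
}
```

With this sample, build_tree() allocates 9 nodes for 3 files, and resolving
/usr/bin/foo visits 4 of them.  Building either structure still walks every
entry of every list.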

The main complaint over the years is that for a simple install of a single
package, dpkg has to read in the entire database when it only needs a few
files.  The proper way to fix this is to save a preparsed version, but that is
not immediately straightforward.
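
One possible shape for a preparsed version, sketched in Python under stated
assumptions: the parsed result is stored as JSON next to a fingerprint (name,
size, mtime) of every .list file, and is reused only while that fingerprint
still matches.  All names here (parse_lists, fingerprint, load_database, the
cache file itself) are hypothetical, and this is not something dpkg does.

```python
import json
import os
import tempfile


def parse_lists(info_dir):
    """Slow path: read and parse every .list file."""
    parsed = {}
    for name in sorted(os.listdir(info_dir)):
        if name.endswith(".list"):
            with open(os.path.join(info_dir, name), encoding="utf-8") as fh:
                paths = [line.rstrip("\n") for line in fh if line.strip()]
            parsed[name[: -len(".list")]] = paths
    return parsed


def fingerprint(info_dir):
    """Name, size and mtime of each .list file; any change invalidates."""
    out = []
    for name in sorted(os.listdir(info_dir)):
        if name.endswith(".list"):
            st = os.stat(os.path.join(info_dir, name))
            out.append([name, st.st_size, st.st_mtime_ns])
    return out


def load_database(info_dir, cache_path):
    """Return (database, used_cache)."""
    current = fingerprint(info_dir)
    try:
        with open(cache_path, encoding="utf-8") as fh:
            cached = json.load(fh)
        if cached["fingerprint"] == current:
            return cached["database"], True
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing, unreadable or malformed cache: fall through and rebuild
    database = parse_lists(info_dir)
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump({"fingerprint": current, "database": database}, fh)
    os.replace(tmp_path, cache_path)  # swap the new cache in atomically
    return database, False
```

This also shows why it is not straightforward: the sketch still has to stat
every .list file to validate the cache, and it still loads the whole
database at once.  Reading only the few entries a single install needs would
take an indexed on-disk format, which this does not attempt.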

----BEGIN GEEK CODE BLOCK----
Version: 3.12
GCS d- s: a-- c+++ UL++++ P+ L++++ !E W+ M o+ K- W--- !O M- !V PS--
PE++ Y+ PGP++ t* 5++ X+ tv b+ D++ G e h*! !r z?
-----END GEEK CODE BLOCK-----
----BEGIN PGP INFO----
Adam Heath <doogie@debian.org>        Finger Print | KeyID
67 01 42 93 CA 37 FB 1E    63 C9 80 1D 08 CF 84 0A | DE656B05 PGP
AD46 C888 F587 F8A3 A6DA  3261 8A2C 7DC2 8BD4 A489 | 8BD4A489 GPG
-----END PGP INFO-----