
Bug#281057: dpkg memory exhausting when upgrading with --root



Package: dpkg
Version: 1.10.24
Severity: important

I have a package which contains around 5500 files. Furthermore, most of
these files are installed under a directory whose name contains the version
number of the package. Hence when upgrading this package, 5500 old files
disappear and 5500 new files appear.

Installing/removing/upgrading/downgrading this package into my real system
goes fine. Installing this into a chroot (with dpkg's --root option) goes
fine if it's not already installed there yet. Removing or purging it from
the chroot is also fine.

However, if I try to upgrade or downgrade it in my chroot system (using
"dpkg --root=/whatever -i packagename.deb" when another version is already
installed there), dpkg usually starts eating my system memory (I have 1GB
of RAM plus some swap). Sooner or later dpkg usually gives up with an out
of memory error message, but sometimes my kernel's OOM killer starts
killing processes.

I started looking at the problem more closely, and found the real cause, and
created a very-far-from-perfect half-workaround. I'll explain things in
chronological order as I got closer to the core of the problem.

From now on, all line numbers refer to the main/processarc.c file in the
dpkg_1.10.24 tarball.

The first thing I noticed is that for each file of the old package, every
file of the new package is stat'ed, and meanwhile the memory consumption of
dpkg keeps growing.

Go to line 624. Here we are inside a double loop: the outer loop (starting
in line 591) iterates through the files of the old package, while the inner
loop (line 622) iterates over the files of the new package. When this code
is executed, the new files are already extracted.

Lines 624 and 625 try to stat the new files, and if that succeeds, the
result is cached so that these files won't be stat'ed again. However, if
stat fails (be patient, I'll explain later why it might fail), a struct
stat has still been allocated in line 624, and the pointer to it is
dropped for good in line 626. Since the failure is not cached, the file is
stat'ed over and over again, and a new struct is lost each time.
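
One way to avoid both the lost allocation and the repeated stat calls would
be to remember failures as well as successes. The following is only a
minimal sketch of that idea and is not dpkg code: the names cached_stat,
stat_state and cached_lstat are hypothetical, and the stat buffer is stored
inline rather than allocated with nfmalloc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical three-state cache entry (not a dpkg type). */
enum stat_state { STAT_UNKNOWN = 0, STAT_OK, STAT_FAILED };

struct cached_stat {
    enum stat_state state;  /* STAT_UNKNOWN until the first lookup */
    struct stat st;         /* only meaningful when state == STAT_OK */
};

/*
 * Call lstat() at most once per cache entry. A failure is remembered too,
 * so later lookups neither allocate anything nor touch the filesystem.
 * Returns the cached stat data, or NULL if the file could not be stat'ed.
 */
const struct stat *cached_lstat(const char *path, struct cached_stat *c)
{
    assert(path != NULL && c != NULL);
    if (c->state == STAT_UNKNOWN)
        c->state = (lstat(path, &c->st) == 0) ? STAT_OK : STAT_FAILED;
    return (c->state == STAT_OK) ? &c->st : NULL;
}
```

A zero-initialised struct cached_stat starts in the STAT_UNKNOWN state, so
each file would be stat'ed once per run regardless of the outcome.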

So, in this case, the malloc in line 624 is executed 5500x5500 times, each
time allocating a 96-byte structure which is never freed. This requires
roughly 3GB of memory (5500 x 5500 x 96 bytes is about 2.9GB), and if my
package had twice as many files it would require four times as much.
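
The arithmetic can be checked with a small model of the double loop. This
is a sketch only: leaked_allocs and leaked_bytes are hypothetical names, the
96-byte size is the one quoted above, and the model assumes that every lstat
on a new file fails.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the double loop when every lstat() on a new file fails.
 * If the buffer is allocated before lstat(), one allocation is lost per
 * (old file, new file) pair; if it is allocated only on success, none is.
 */
unsigned long long leaked_allocs(size_t old_files, size_t new_files,
                                 int alloc_before_lstat)
{
    unsigned long long allocs = 0;
    for (size_t i = 0; i < old_files; i++) {
        for (size_t j = 0; j < new_files; j++) {
            /* lstat() fails here, so nothing is cached for file j */
            if (alloc_before_lstat)
                allocs++;  /* buffer allocated, pointer then reset to 0 */
        }
    }
    return allocs;
}

/* Total bytes lost for a given size of struct stat. */
unsigned long long leaked_bytes(size_t old_files, size_t new_files,
                                size_t stat_size, int alloc_before_lstat)
{
    assert(stat_size > 0);
    return leaked_allocs(old_files, new_files, alloc_before_lstat)
           * (unsigned long long)stat_size;
}
```

With 5500 old and 5500 new files and a 96-byte struct, leaked_bytes gives
2,904,000,000 bytes when allocating before lstat and 0 when allocating only
on success. Doubling both file counts quadruples the first figure.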

I attach a simple patch which only allocates a stat structure if the stat
call was successful.

After applying this patch to dpkg I found that these out of memory errors
disappeared. However, the situation is still not perfect. I noticed that
installing version x of my package to the chroot always takes 6-8 seconds
(and it also took 6-8 seconds with the old unpatched dpkg and didn't lead to
oom) while installing version y always takes one minute (this was failing
with oom previously, so it's now much better but very far from perfect).
However, to my real system both install within 6-8 seconds.

Tracing the difference between the two cases helped me catch the bug. It
is in line 625. The lstat call tries to stat the files of the newly
installed package, but _outside_ the chroot. If they are there, everything
goes perfectly. However, if they are not there, because the package that
I'm upgrading under the chroot is not installed on my real system, then
this stat fails. This is what triggers the oom bug described above, and it
is also why my workaround is not perfect: in this case dpkg still performs
5500^2 stat calls instead of 5500.
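
To illustrate what stat'ing _inside_ the chroot would mean, here is a
minimal sketch that prepends the root directory to the file name before
calling lstat. It is not a proposed patch: lstat_under_root is a
hypothetical helper, and how dpkg actually stores the --root directory and
builds its path names is not covered here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of dpkg): lstat `name`, a path as recorded
 * in the package, relative to the installation root `root`. An empty root
 * means the real system. Returns 0 on success and -1 on failure, as
 * lstat() does.
 */
int lstat_under_root(const char *root, const char *name, struct stat *st)
{
    char path[PATH_MAX];
    int n;

    assert(root != NULL && name != NULL && st != NULL);

    /* Join root and name, adding a '/' if name does not start with one. */
    n = snprintf(path, sizeof(path), "%s%s%s",
                 root, (name[0] == '/') ? "" : "/", name);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return lstat(path, st);
}
```

Called with the root "/whatever" and the name "/usr/bin/foo" (an example
name), this would stat /whatever/usr/bin/foo rather than /usr/bin/foo on
the real system.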

For this problem I do not (yet?) have a fix. At first sight it seems to be
a little more complicated, so I'd rather ask you to fix it properly.

In the meantime, please stress-test the --root option while stracing all
the file operations, to catch any similar bugs. At the moment I'm not sure
whether replacing config files and other tricky operations are all handled
correctly with --root, so that the file under the chroot is always the one
taken into account. Please do some systematic testing to try to catch all
bugs of this kind.



Thanks,

Egmont
diff -Naur dpkg-1.10.24.orig/main/processarc.c dpkg-1.10.24/main/processarc.c
--- dpkg-1.10.24.orig/main/processarc.c	2004-10-27 11:06:43.000000000 +0200
+++ dpkg-1.10.24/main/processarc.c	2004-11-13 13:35:58.000000000 +0100
@@ -620,11 +620,14 @@
 	  "upgrade/downgrade", fnamevb.buf);
       if (!lstat(fnamevb.buf, &oldfs) && !S_ISDIR(oldfs.st_mode)) {
 	for (cfile = newfileslist; cfile; cfile = cfile->next) {
+	  struct stat st_tmp;
 	  if (!cfile->namenode->filestat) {
-	    cfile->namenode->filestat = (struct stat *) nfmalloc(sizeof(struct stat));
-	    if (lstat(cfile->namenode->name, cfile->namenode->filestat)) {
+	    if (lstat(cfile->namenode->name, &st_tmp)) {
 	      cfile->namenode->filestat= 0;
 	      continue;
+	    } else {
+	      cfile->namenode->filestat = (struct stat *) nfmalloc(sizeof(struct stat));
+	      memcpy(cfile->namenode->filestat, &st_tmp, sizeof(struct stat));
 	    }
 	  }
 	  if (S_ISDIR(cfile->namenode->filestat->st_mode))
