
Bug#829100: lintian: [patch] Warn about over-eagerly xz-compressed data.tar.xz



Package: lintian
Version: 2.5.45
Severity: wishlist
Tags: patch

Dear Maintainer,

It is not widely known that xz's higher compression levels, besides
improving the compression of big files, have the side effect of
requiring a lot of memory for the dictionary, even when unpacking.
There is, however, no point in using a compression level whose
dictionary is (roughly) larger than the uncompressed file. [1] has a
discussion of this.
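
To illustrate the rule of thumb, here is a minimal sketch in Python
(not part of the attached patch). It uses the preset dictionary sizes
documented in xz(1) and returns the lowest preset whose dictionary
already covers the whole input. The function name is hypothetical, and
the presets also differ in settings other than the dictionary, so this
only addresses memory use, not the exact compression ratio.

```python
# Dictionary sizes of the xz presets -0 .. -9 in KiB, as listed in xz(1).
PRESET_DICT_KIB = [256, 1024, 2048, 4096, 4096, 8192, 8192, 16384, 32768, 65536]


def smallest_sufficient_preset(uncompressed_bytes: int) -> int:
    """Return the lowest preset whose dictionary covers the whole input.

    Falls back to the highest preset (-9) when even the largest
    dictionary is smaller than the input.
    """
    for level, dict_kib in enumerate(PRESET_DICT_KIB):
        if dict_kib * 1024 >= uncompressed_bytes:
            return level
    return len(PRESET_DICT_KIB) - 1


# The traceroute data.tar in this report is 133120 bytes uncompressed,
# so the 256 KiB dictionary of preset -0 already spans all of it.
print(smallest_sufficient_preset(133120))  # prints 0
```

For a tarball of a few hundred KiB, any preset above the returned one
mainly raises the memory needed for unpacking.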

As a concrete case,

| override_dh_builddeb:
|     dh_builddeb -- -Zxz -z9

in the traceroute package triggered an OOM upon installation on an
embedded device with 128 MiB of RAM, since ...

| $ ar x traceroute_1%3a2.0.20-2+b1_armel.deb data.tar.xz
| $ xz --list --verbose --verbose data.tar.xz
| (...)
|   Compressed size:    47,9 KiB (49.056 B)
|   Uncompressed size:  130,0 KiB (133.120 B)
| (...)
|   Memory needed:      65 MiB
| (...)

... it caused an allocation of 65 MiB for nothing on an otherwise
already busy machine.
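
The two numbers of interest can be pulled out of that listing
mechanically. The following is a minimal sketch in Python, assuming
the output layout shown above; the function name is hypothetical. It
tolerates locale-specific thousands and decimal separators, which the
regular expressions in the attached patch do not (they rely on
C-locale output).

```python
import re


def parse_xz_list(output: str):
    """Extract (uncompressed_bytes, memory_needed_mib) from the text
    printed by `xz --list --verbose --verbose`.

    Either value is None if the corresponding line was not found.
    """
    uncompressed = None
    memory = None
    for line in output.splitlines():
        m = re.match(r"\s+Uncompressed size:.*\((.+) B\)", line)
        if m:
            # Drop thousands separators such as "." or ",".
            uncompressed = int(re.sub(r"[^0-9]", "", m.group(1)))
        m = re.match(r"\s+Memory needed:\s+([0-9]+(?:[.,][0-9]+)?) MiB", line)
        if m:
            memory = float(m.group(1).replace(",", "."))
    return uncompressed, memory


SAMPLE = """\
  Compressed size:    47,9 KiB (49.056 B)
  Uncompressed size:  130,0 KiB (133.120 B)
  Memory needed:      65 MiB
"""

print(parse_xz_list(SAMPLE))  # prints (133120, 65.0)
```

In a real check the text would come from running xz itself, ideally
with LC_ALL=C so that the number formatting is predictable.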

In my opinion, lintian is the right place to warn about such
unnecessary resource usage.

The attached patch is just a proof of concept and not ready for
production yet, especially since data.tar.xz is unpacked to (and later
removed from) the current working directory.

Let me know whether you consider such a check a good idea; if so, I'll
do the final polishing and add sane error handling. The alarm threshold
will probably need some reconsideration as well.

Example output:

W: traceroute: overeager-compression-for-data-tarball 65.0 MiB RAM required for 0.1 MiB uncompressed data
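
The decision behind that output can be sketched as follows. This is a
Python restatement of the condition used in the proof-of-concept patch
(more than 10 MiB needed, and more than 120% of the uncompressed size);
the function and parameter names are hypothetical, and both thresholds
are the provisional values mentioned above.

```python
MIB = 1024 * 1024


def overeager_compression_message(memory_needed_mib, uncompressed_bytes,
                                  min_memory_mib=10.0, ratio=1.2):
    """Return the extra text for the lintian tag, or None if no warning
    is warranted.

    Warn only if decompression needs more than `min_memory_mib` and
    more than `ratio` times the uncompressed size.
    """
    if not memory_needed_mib or not uncompressed_bytes:
        return None
    uncompressed_mib = uncompressed_bytes / MIB
    if (memory_needed_mib > min_memory_mib
            and memory_needed_mib > uncompressed_mib * ratio):
        return "%.1f MiB RAM required for %.1f MiB uncompressed data" % (
            memory_needed_mib, uncompressed_mib)
    return None


# Values from the traceroute example in this report.
print(overeager_compression_message(65, 133120))
# prints: 65.0 MiB RAM required for 0.1 MiB uncompressed data
```

Keeping both thresholds as parameters makes it easy to experiment with
other values before settling on a final alarm level.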

As an aside, does the lab provide a good place for extraction, or
should I just use tempdir?

    Christoph

[1] https://www.mirbsd.org/permalinks/wlog-10_e20130104-tg.htm


-- System Information:
Debian Release: stretch/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.4.13 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/dash
Init: unable to detect

Versions of packages lintian depends on:
ii  binutils                          2.26-12
ii  bzip2                             1.0.6-8
ii  diffstat                          1.61-1
ii  file                              1:5.28-1
ii  gettext                           0.19.8.1-1
ii  hardening-includes                2.8+nmu2
ii  intltool-debian                   0.35.0+20060710.4
ii  libapt-pkg-perl                   0.1.29+b5
ii  libarchive-zip-perl               1.57-1
ii  libclass-accessor-perl            0.34-1
ii  libclone-perl                     0.38-1+b1
ii  libdata-alias-perl                1.20-1+b1
ii  libdpkg-perl                      1.18.7
ii  libemail-valid-perl               1.198-1
ii  libfile-basedir-perl              0.07-1
ii  libipc-run-perl                   0.94-1
ii  liblist-moreutils-perl            0.413-1+b1
ii  libparse-debianchangelog-perl     1.2.0-8
ii  libperl5.22 [libdigest-sha-perl]  5.22.2-1
ii  libtext-levenshtein-perl          0.13-1
ii  libtimedate-perl                  2.3000-2
ii  liburi-perl                       1.71-1
ii  libyaml-libyaml-perl              0.41-6+b1
ii  man-db                            2.7.5-1
ii  patchutils                        0.3.4-1
ii  perl                              5.22.2-1
ii  t1utils                           1.39-2
ii  xz-utils                          5.1.1alpha+20120614-2.1

Versions of packages lintian recommends:
ii  dpkg                                 1.18.7
pn  libperlio-gzip-perl                  <none>
ii  perl                                 5.22.2-1
ii  perl-modules-5.22 [libautodie-perl]  5.22.2-1

Versions of packages lintian suggests:
pn  binutils-multiarch     <none>
ii  dpkg-dev               1.18.7
ii  libhtml-parser-perl    3.72-1
ii  libtext-template-perl  1.46-1

-- no debconf information

diff --git a/checks/deb-format.desc b/checks/deb-format.desc
index 85b9a7a..add7893 100644
--- a/checks/deb-format.desc
+++ b/checks/deb-format.desc
@@ -92,3 +92,13 @@ Info: The data portion of this binary package uses a non-compressed
  .
  Except if data is non-compressible, use gzip for
  maximum compatibility and speed, and xz for maximum compression ratio.
+
+Tag: overeager-compression-for-data-tarball
+Severity: normal
+Certainty: certain
+Info: The data portion of this binary package was xz-compressed with
+ an unreasonably high compression level. Creating and also unpacking it
+ will use a lot of RAM without any benefit.
+ .
+ Reduce the compression level to a value where the related dictionary
+ size is not much bigger than the uncompressed size. See xz(1) for details.
diff --git a/checks/deb-format.pm b/checks/deb-format.pm
index e0b750a..841066d 100644
--- a/checks/deb-format.pm
+++ b/checks/deb-format.pm
@@ -164,6 +164,31 @@ sub run {
             } elsif ($type eq 'udeb'
                 && $data_member !~ m/^data\.tar\.[gx]z$/) {
                 tag 'udeb-uses-unsupported-compression-for-data-tarball';
+            } elsif ($data_member eq 'data.tar.xz') {
+                my $success = spawn($opts, ['ar', 'x', $deb, $data_member]);
+                if ($success) {
+                    my $uncompressed;   # in MiB
+                    my $memory_needed;  # in MiB
+                    open(my $fd, '-|', 'xz', '--list', '--verbose', '--verbose', $data_member) or die;
+                    while (my $line = <$fd>) {
+                        chomp($line);
+                        ($line =~ /^\s+Uncompressed size: .* \(([0-9]+) B\)/) and
+                            ($uncompressed = $1 / 1048576);
+                        ($line =~ /^\s+Memory needed:\s+([0-9]+) MiB/) and
+                            ($memory_needed = $1);
+                    }
+                    close ($fd);
+                    # warn if
+                    # - more than 10 MiB is needed for decompression and
+                    # - memory needed is >120% of uncompressed size
+                    if ($uncompressed && $memory_needed &&
+                        $memory_needed > 10 &&
+                        $memory_needed > $uncompressed * 1.2) {
+                        tag 'overeager-compression-for-data-tarball',
+                            sprintf ('%.1f MiB RAM required for %.1f MiB uncompressed data', $memory_needed, $uncompressed);
+                    }
+                    unlink ($data_member);
+                }
             } elsif ($data_member eq 'data.tar.lzma') {
                 tag 'uses-deprecated-compression-for-data-tarball', 'lzma';
                 # Ubuntu's archive allows lzma packages.
