
Bug#715035: lintian: Possible memory optimizations



Package: lintian
Version: 2.5.13
Severity: minor

Hi,

I have spent a little time looking at the memory consumption of
Lintian.  I have already optimized a few trivial cases, but I have
also seen further potential that may be worth pursuing.

Method: I have only looked at the memory consumed by L::Collect at the
end of the run (i.e. the memory data you get with -dddd from the
master branch).  I suspect there is also a lot to be gained from not
slurping copyright files (etc.) without any regard to their size in
checks.

Observations so far:
 - (sorted_)index is(/are) the primary memory consumer(s).
   - fortunately, they share a large part of the memory so the output
     of "-dddd" looks worse than it is.
   - AFAICT, memory seems to be "leaked" by Perl allocating
     large buffers for the strings[1].  With my test case source:linux,
     I suspect we can save about 5 MB if we can shrink these buffers
     to the size actually needed.
 - changelog is also pretty expensive and could probably very often be
   shared between different L::Collect instances related to the same
   processable group.  For lintian, it is about 2 MB for both the .deb
   and the source.  The common "non-sharing" case would be binNMUs or
   a bug in the package[2].
   - If we can dedup this on a disk level, we are also recovering
     disk storage.  I haven't checked if this is worth it even for
     lintian.d.o.  The savings will be a lot less on the disk though,
     since Perl/Parse::DebianChangelog spends more memory on it than
     its actual size (about a factor of 4 for lintian's changelog)[3].
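
To illustrate the changelog sharing idea, here is a minimal sketch in
Python (not Lintian's Perl, and not an existing Lintian API).  It keeps
one parsed changelog per distinct file content, so packages in the same
group get the same object, while a binNMU or a wrongly installed
changelog simply gets its own entry.  The names ChangelogCache and
parse_changelog are hypothetical, and the parser is only a stand-in.

```python
import hashlib


class ChangelogCache:
    """Share one parsed changelog between packages with identical content.

    Hypothetical helper for illustration only.
    """

    def __init__(self, parse):
        self._parse = parse      # callback that turns raw bytes into a parsed object
        self._by_digest = {}     # content digest -> parsed changelog

    def get(self, raw: bytes):
        # Key on the content, not the package name, so that a differing
        # changelog (e.g. a binNMU) never reuses the wrong parsed data.
        key = hashlib.sha256(raw).hexdigest()
        if key not in self._by_digest:
            self._by_digest[key] = self._parse(raw)
        return self._by_digest[key]


def parse_changelog(raw: bytes):
    # Stand-in for a real changelog parser: one list item per line.
    return raw.decode("utf-8").splitlines()
```

Every caller that passes the same bytes gets the very same parsed
object back, so the parse cost and the memory are paid once per
distinct changelog rather than once per package.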

I have attached the memory usage information of running lintian with
-dddd (-X man) on some of the linux binaries and their source package,
which can be used as reference.

~Niels

[1] This makes sense if the strings are to be changed later, but
generally these strings will only be read.

[2] E.g. installing the wrong changelog in one of the binary
packages. :)

[3] If we are doing this disk de-duplication, we can probably trivially
apply it to copyright files as well.  But back to memory ...
N: Memory usage [source:linux/3.2.20-1]: 35.91 MB
N:   -- base_dir: 120.00 B
N:   -- binaries: 105.41 kB
N:   -- binary_field: 1022.11 kB
N:   -- binary_relation: 1513.17 kB
N:   -- changelog: 1440.98 kB
N:   -- debfiles: 116.00 B
N:   -- diffstat: 116.00 B
N:   -- field: 109.57 kB
N:   -- file_info: 3.92 MB
N:   -- index: 29.32 MB
N:   -- is_non_free: 116.00 B
N:   -- name: 64.00 B
N:   -- native: 20.00 B
N:   -- relation: 12.07 kB
N:   -- relation_noarch: 5.85 kB
N:   -- sorted_index: 28.74 MB
N:   -- source_field: 1.71 kB
N:   -- type: 64.00 B
N:   -- unpacked: 116.00 B
N: Memory usage [binary:linux-source-3.2/3.2.20-1/all]: 1462.71 kB
N:   -- base_dir: 136.00 B
N:   -- changelog: 1440.99 kB
N:   -- control: 144.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 2.00 kB
N:   -- file_info: 1249.00 B
N:   -- index: 9.39 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 904.00 B
N:   -- name: 72.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 4.98 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 8.42 kB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-support-3.2.0-2/3.2.20-1/all]: 1476.09 kB
N:   -- base_dir: 140.00 B
N:   -- changelog: 1440.99 kB
N:   -- control: 148.00 B
N:   -- control-index: 3.66 kB
N:   -- field: 1473.00 B
N:   -- file_info: 3.01 kB
N:   -- index: 22.93 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 2.45 kB
N:   -- name: 76.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1.59 kB
N:   -- scripts: 625.00 B
N:   -- sorted_control-index: 2.78 kB
N:   -- sorted_index: 21.78 kB
N:   -- type: 64.00 B
N:   -- unpacked: 152.00 B
N: Memory usage [binary:linux-doc-3.2/3.2.20-1/all]: 7.65 MB
N:   -- base_dir: 132.00 B
N:   -- changelog: 1440.99 kB
N:   -- control: 140.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1.57 kB
N:   -- file_info: 1016.66 kB
N:   -- index: 5.03 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 1150.69 kB
N:   -- name: 68.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1476.00 B
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 4.95 MB
N:   -- type: 64.00 B
N:   -- unpacked: 144.00 B
N: Memory usage [binary:linux-image-3.2.0-2-amd64/3.2.20-1/amd64]: 9.28 MB
N:   -- base_dir: 148.00 B
N:   -- changelog: 1441.00 kB
N:   -- control: 156.00 B
N:   -- control-index: 5.65 kB
N:   -- field: 2.21 kB
N:   -- file_info: 893.27 kB
N:   -- index: 2.78 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 532.94 kB
N:   -- name: 80.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 4.35 MB
N:   -- relation: 7.02 kB
N:   -- scripts: 602.00 B
N:   -- sorted_control-index: 4.63 kB
N:   -- sorted_index: 2.74 MB
N:   -- strings: 156.00 B
N:   -- type: 64.00 B
N:   -- unpacked: 156.00 B
N: Memory usage [binary:linux-manual-3.2/3.2.20-1/all]: 4.65 MB
N:   -- base_dir: 136.00 B
N:   -- changelog: 1440.99 kB
N:   -- control: 144.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1.96 kB
N:   -- file_info: 338.43 kB
N:   -- index: 2.70 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 566.65 kB
N:   -- name: 72.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 2.12 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 2.65 MB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-manual-3.2/3.2.35-2/all]: 4.75 MB
N:   -- base_dir: 136.00 B
N:   -- changelog: 1525.74 kB
N:   -- control: 144.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 2.06 kB
N:   -- file_info: 339.34 kB
N:   -- index: 2.70 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 568.13 kB
N:   -- name: 72.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1.81 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 2.66 MB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-amd64/3.2.35-2/amd64]: 6.86 MB
N:   -- base_dir: 148.00 B
N:   -- changelog: 1525.75 kB
N:   -- control: 156.00 B
N:   -- control-index: 3.02 kB
N:   -- field: 1.77 kB
N:   -- file_info: 691.72 kB
N:   -- index: 4.63 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 761.51 kB
N:   -- name: 80.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 2.83 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 2.17 kB
N:   -- sorted_index: 4.55 MB
N:   -- type: 64.00 B
N: Memory usage [binary:xen-linux-system-3.2.0-4-amd64/3.2.35-2/amd64]: 1.50 MB
N:   -- base_dir: 152.00 B
N:   -- changelog: 1525.75 kB
N:   -- control: 160.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1409.00 B
N:   -- file_info: 815.00 B
N:   -- index: 6.22 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 454.00 B
N:   -- name: 84.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 2.22 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 5.32 kB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-rt-amd64/3.2.35-2/amd64]: 6.85 MB
N:   -- base_dir: 152.00 B
N:   -- changelog: 1525.75 kB
N:   -- control: 160.00 B
N:   -- control-index: 3.02 kB
N:   -- field: 1.78 kB
N:   -- file_info: 704.93 kB
N:   -- index: 4.62 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 770.92 kB
N:   -- name: 84.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 2.83 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 2.17 kB
N:   -- sorted_index: 4.54 MB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-support-3.2.0-4/3.2.35-2/all]: 1.52 MB
N:   -- base_dir: 140.00 B
N:   -- changelog: 1525.74 kB
N:   -- control: 148.00 B
N:   -- control-index: 3.66 kB
N:   -- field: 1.54 kB
N:   -- file_info: 2.98 kB
N:   -- index: 22.93 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 2.45 kB
N:   -- name: 76.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1.50 kB
N:   -- scripts: 625.00 B
N:   -- sorted_control-index: 2.78 kB
N:   -- sorted_index: 21.78 kB
N:   -- type: 64.00 B
N:   -- unpacked: 152.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-all/3.2.35-2/amd64]: 1.50 MB
N:   -- base_dir: 148.00 B
N:   -- changelog: 1525.74 kB
N:   -- control: 156.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1525.00 B
N:   -- file_info: 800.00 B
N:   -- index: 6.20 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 444.00 B
N:   -- name: 80.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1.52 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 5.30 kB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-all-amd64/3.2.35-2/amd64]: 1.50 MB
N:   -- base_dir: 152.00 B
N:   -- changelog: 1525.75 kB
N:   -- control: 160.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1.53 kB
N:   -- file_info: 818.00 B
N:   -- index: 6.23 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 456.00 B
N:   -- name: 84.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 2.32 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 5.32 kB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-common/3.2.35-2/amd64]: 4.46 MB
N:   -- base_dir: 152.00 B
N:   -- changelog: 1525.75 kB
N:   -- control: 160.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1.53 kB
N:   -- file_info: 374.14 kB
N:   -- index: 2.48 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 562.20 kB
N:   -- name: 84.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1393.00 B
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 2.44 MB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-doc-3.2/3.2.35-2/all]: 7.74 MB
N:   -- base_dir: 132.00 B
N:   -- changelog: 1525.73 kB
N:   -- control: 140.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1.67 kB
N:   -- file_info: 1017.94 kB
N:   -- index: 5.04 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 1152.63 kB
N:   -- name: 68.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1393.00 B
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 4.96 MB
N:   -- type: 64.00 B
N:   -- unpacked: 144.00 B
N: Memory usage [binary:linux-libc-dev/3.2.35-2/amd64]: 2.13 MB
N:   -- base_dir: 136.00 B
N:   -- changelog: 1525.74 kB
N:   -- control: 144.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1.79 kB
N:   -- file_info: 61.68 kB
N:   -- index: 541.39 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 102.33 kB
N:   -- name: 68.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1.83 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 531.17 kB
N:   -- type: 64.00 B
N:   -- unpacked: 144.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-common-rt/3.2.35-2/amd64]: 4.48 MB
N:   -- base_dir: 152.00 B
N:   -- changelog: 1525.75 kB
N:   -- control: 160.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 1.53 kB
N:   -- file_info: 384.59 kB
N:   -- index: 2.50 MB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 573.07 kB
N:   -- name: 84.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 1393.00 B
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 2.46 MB
N:   -- type: 64.00 B
N: Memory usage [binary:linux-source-3.2/3.2.35-2/all]: 1.51 MB
N:   -- base_dir: 136.00 B
N:   -- changelog: 1525.74 kB
N:   -- control: 144.00 B
N:   -- control-index: 2.38 kB
N:   -- field: 2.11 kB
N:   -- file_info: 1237.00 B
N:   -- index: 9.39 kB
N:   -- is_non_free: 52.00 B
N:   -- java_info: 92.00 B
N:   -- md5sums: 904.00 B
N:   -- name: 72.00 B
N:   -- native: 64.00 B
N:   -- objdump_info: 92.00 B
N:   -- relation: 5.32 kB
N:   -- scripts: 92.00 B
N:   -- sorted_control-index: 1.56 kB
N:   -- sorted_index: 8.42 kB
N:   -- type: 64.00 B
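
The dump above can be summarised with a small script.  The following
is a rough sketch in Python that parses the "N: Memory usage" lines
and estimates how much changelog memory would be saved if each version
kept only one copy.  Two assumptions: kB and MB are treated as
1024-based (which matches the totals above), and grouping by version
string is only an approximation of a processable group.  The function
names are made up for this example.

```python
import re
from collections import defaultdict

# Line shapes as they appear in the dump above.
HEADER = re.compile(r"^N: Memory usage \[(?P<proc>[^\]]+)\]: [\d.]+ (?:B|kB|MB)$")
FIELD = re.compile(r"^N:\s+-- (?P<name>\S+): (?P<size>[\d.]+) (?P<unit>B|kB|MB)$")

# Assumption: units are 1024-based.
UNITS = {"B": 1, "kB": 1024, "MB": 1024 * 1024}


def parse_memory_dump(lines):
    """Return {processable: {field: size_in_bytes}}."""
    usage = {}
    current = None
    for line in lines:
        line = line.rstrip("\n")
        m = HEADER.match(line)
        if m:
            current = usage.setdefault(m["proc"], {})
            continue
        m = FIELD.match(line)
        if m and current is not None:
            current[m["name"]] = float(m["size"]) * UNITS[m["unit"]]
    return usage


def changelog_savings(usage):
    """Bytes saved if every version kept only its largest changelog copy."""
    by_version = defaultdict(list)
    for proc, fields in usage.items():
        if "changelog" in fields:
            # "binary:name/version/arch" or "source:name/version"
            version = proc.split("/")[1]
            by_version[version].append(fields["changelog"])
    return sum(sum(sizes) - max(sizes) for sizes in by_version.values())
```

Feeding it the saved output, e.g. parse_memory_dump(open(path)) with
path pointing at a file containing the dump, and then calling
changelog_savings() on the result gives a quick upper bound for the
in-memory sharing discussed earlier.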
