
Re: Help needed to debug a failing bot on i18n.debian.org



[ Thanks for the Cc, I'm indeed not subscribed to -i18n. ]

Christian PERRIER <bubulle@debian.org> (2013-10-01):
> So, it seems that:
> - the virtual machine doesn't have that much memory (2GB)
> - it doesn't have much swap
> - clamd is eating a lot of memory
> 
> clamd seems to be running for 17 days, about a week after we started
> to have some issues with statistics.
> 
> If I had root access to this machine, I would: 
> - restart clamd
> - add more swap
> - eventually add more memory

The machine is quite limited in RAM, yeah, and maybe clamd shouldn't be
using that much memory, but that might only work around the issue for a
given number of days (see analysis below).

> Anyway, I applied your patch and we'll see what happens

I might have mentioned on #debian-i18n, or maybe only to David, that my
test had been running for a while when I posted my patches, and it made
it to the end. :)


Now, looking into what happens:
 - dl10n-check looks at source packages, and creates a $deb object by
   reading each one through parse_tarball; its type is
   Debian::Pkg::DebSrc, built on top of Debian::Pkg::Tar, which is an
   *in-memory* tar processor; see its description:
   "This package is the base class for all C<Debian::Pkg> classes.
    Unlike most tar processors, this one does perform all operations
    in memory, but retrieves only specified files, so it should not
    consume too much memory if you are specific enough."
 - its implementation consists of opening the file for decompression
   through: "{gzip,bzip2,xz} -dc $file |". That one explodes for 0a-data
   with its 450 MB xz archive (1.1 GB uncompressed, not fitting into 2
   GB RAM!), and error handling is poor.

I'm wondering whether the following wouldn't be better:
 - use the nifty "dpkg-source -x" to inspect the source package; at the
   moment, it doesn't seem to support dpkg-deb's --fsys-tarfile which
   could have been used to pipe contents to tar, where filtering would
   happen. Since "dpkg-source -x" is merely a wrapper for "extract" in
   the Dpkg::Source::Package module, I guess one could implement an
   option into that module, which would pass it on to the relevant
   package format handler (Dpkg::Source::Package::*), to only unpack
   the files which would be specified.
 - since the various search_* subs in dl10n-check contain some file
   patterns, all those could be passed to the said option, so that
   dpkg-source -x only deals with the files one cares about.
 - that also means you get support for all dpkg formats for free
   (think multitarballs for 3.0 formats).
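
To make the filtering idea a bit more concrete, here is a minimal sketch.
It is in Python rather than Perl, and it uses the standard tarfile module
instead of dpkg-source, so it says nothing about what an option in
Dpkg::Source::Package would actually look like. The read_matching name
and the glob patterns are made up for illustration; the real patterns
would come from the search_* subs.

```python
import fnmatch
import tarfile


def read_matching(tar_path, patterns):
    """Return {member name: bytes} for regular files matching any pattern.

    Hypothetical helper: the archive is read as a forward-only stream,
    and only the contents of matching members are kept.
    """
    wanted = {}
    # "r|*" opens the tarball as a stream with transparent decompression
    # (gzip, bzip2 or xz), visiting members one at a time.
    with tarfile.open(tar_path, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # fnmatch's "*" also matches "/", so "*/debian/po/*.po"
            # matches "pkg-1.0/debian/po/fr.po".
            if any(fnmatch.fnmatch(member.name, p) for p in patterns):
                # In stream mode the data must be read before moving on
                # to the next member.
                wanted[member.name] = tar.extractfile(member).read()
    return wanted
```

Called as read_matching("foo_1.0.orig.tar.xz", ["*/debian/po/*.po"]), the
memory kept around should grow with the size of the matching files
rather than with the size of the whole uncompressed tarball.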

Not sure I'm going to be the one trying to PoC-ify it though. :/

Mraw,
KiBi.
