Bug#664794: lintian: should we compress some collections (file-info and index)?
On 2012-03-20 23:25, Russ Allbery wrote:
> Niels Thykier <niels@thykier.net> writes:
>
>> I have been considering if it would be a good idea to (conditionally?)
>> compress certain collection files. In some cases they are actually
>> rather large and I suspect compression will generally be good in such
>> cases[1]. Admittedly, there are also cases where it gives little to no
>> size reduction.
>
> Compressing some stuff is not a bad idea. The indices and file-info
> collections seem like the most obvious targets. People doing greps can
> switch to zgreps.
>
True, but it kind of implies that they are aware of changes we make in
the Lab. :)
> I would prefer to never conditionally compress anything; either always
> compress it or never compress it. That way, the file names and access
> method are always consistent.
>
Originally I had thought of reusing _open_data_file (from harness) to
access the file(s). But I do see a point in making the access
consistent (especially for people doing "grep -r" checks).
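For illustration only, a fallback reader of the kind discussed here could
look roughly like the sketch below. It is written in Python rather than
Lintian's Perl, and open_data_file is a hypothetical name. It is not the
harness's _open_data_file, whose actual behaviour is not shown in this
thread. The assumption is that a compressed copy, if present, sits next to
the original with a ".gz" suffix.

```python
import gzip
import os


def open_data_file(path):
    """Open a collection file for reading as text, compressed or not.

    Hypothetical helper: prefers "<path>.gz" and falls back to "<path>".
    """
    gz_path = path + ".gz"
    if os.path.exists(gz_path):
        # gzip.open in text mode decompresses transparently.
        return gzip.open(gz_path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```

A helper like this hides the difference from Lintian itself, but not from
people running grep directly on the Lab, which is the consistency concern
raised above.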
That does leave the question of how to migrate from uncompressed to
compressed. If we go "compressed"-only, we have to do a full run (or a
find -name | xargs gzip). I guess that is reasonable to do; we just
need to tell people maintaining lintian.$domain.$tld to do the same.
Alternatively, we can bump the version of these collections and have
Lintian slowly migrate as packages are (re-)checked, but that means the
(non-Lintian) access will be inconsistent until all packages have been
re-checked.
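As a rough sketch of the one-off migration route (the equivalent of the
find | xargs gzip idea above), something like the following would do. It is
Python rather than Perl, compress_collections is a hypothetical name, and
the file names "index" and "file-info" are assumptions based on how the
collections are referred to in this thread. The real Lab layout may differ.

```python
import gzip
import os
import shutil

# Assumed file names; the thread only names the "index" and
# "file-info" collections, not their on-disk paths.
TARGETS = {"index", "file-info"}


def compress_collections(lab_root, targets=TARGETS):
    """Gzip every matching file under lab_root; return the new paths."""
    done = []
    for dirpath, _dirnames, filenames in os.walk(lab_root):
        for name in filenames:
            if name not in targets:
                continue
            src = os.path.join(dirpath, name)
            dst = src + ".gz"
            with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
            os.remove(src)  # like gzip(1), drop the original afterwards
            done.append(dst)
    return sorted(done)
```

Run once over the whole Lab, this leaves only compressed copies behind, so
file names and access method stay consistent for everyone afterwards.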
~Niels