
Re: detailed lists with archive contents - more than just Contents

On Thu, Feb 21, 2013 at 07:05:08PM +0100, Paul Gevers wrote:
> This indeed looks very useful. However, I don't think it is really
> useful to trigger on common changelog and copyright files from the same
> source package as they indeed usually are the same, which is fine of course.

Answering this one since it has come up a number of times now. The
mention of duplicate changelogs is intentional. dedup is more of a
diagnostic tool than a way of highlighting errors. In fact there is nothing
inherently wrong with shipping the same file in different packages. It
might be an opportunity to easily save some bytes, but maybe not. dedup
just presents the raw numbers. If your duplicated changelog makes up
half of the package, then maybe a -common could help here? Reading these
numbers clearly needs some experience and I am just starting to gain
that experience. To me the attempt to filter out common cases appears to
be highly prone to false negatives. My current rule of thumb goes like:
If the sharing is less than 1MB or less than 10% it is probably not
worth looking at. And that pretty much filters out 90% of the changelogs
as well. That said, other queries might be interesting to look at, but I
would like to avoid complex rules.
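
As a rough illustration of that rule of thumb, here is a minimal Python
sketch. It is not part of dedup: the function name, the byte-count
inputs, reading "1MB" as 10^6 bytes, and taking the 10% relative to the
package size are all assumptions made for the example.

```python
# Hypothetical thresholds taken from the rule of thumb above.
MIN_SHARED_BYTES = 1_000_000   # "1MB", assumed to mean 10^6 bytes
MIN_SHARED_FRACTION = 0.10     # "10%", assumed relative to package size


def worth_looking_at(shared_bytes, package_bytes):
    """Return True only if the sharing is at least 1MB AND at least 10%.

    Sharing below either threshold is treated as probably not worth
    looking at, which mirrors "less than 1MB or less than 10%".
    """
    if package_bytes <= 0:
        return False
    if shared_bytes < MIN_SHARED_BYTES:
        return False
    return shared_bytes / package_bytes >= MIN_SHARED_FRACTION
```

A small duplicated changelog falls below the absolute threshold and is
skipped without needing a special case for changelogs, which is the
point of keeping the rule simple.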

