On Thu, Nov 27, 2014 at 10:11 PM, Jonas Smedegaard <dr@jones.dk> wrote:
> Quoting Bastien ROUCARIES (2014-11-27 21:42:22)
>> On 27 Nov 2014 14:00, "Jonas Smedegaard" <dr@jones.dk> wrote:
>>> A graphics file popular to include e.g. in testsuites is a scan
>>> from the November 1972 issue of Playboy, which is not DFSG compliant:
>>> <https://en.wikipedia.org/wiki/File:Lenna.png#Licensing>.
>>
>> I have also opened #760171 some time ago.
>
> Ahh - I was wondering if others had looked at this already - then forgot
> to check. Thanks!
>
>
>>> See bug#771126 for a concrete example. I have tried our codesearch to
>>> discover more, and have found some but failed to find a reliable search
>>> pattern using that interface.
>>>
>>> It is used for color calibration and therefore needed in non-lossy
>>> format but can vary in size and serialization, so simple md5 detection
>>> is inefficient, I suspect.
>>
>> No, md5 does not suffer from false positives, so it can be an autoreject.
>>
>> Could you send me a patch against data/cruft/non-distributable-files ?
>>
>> Less work for ftpmaster, particularly when done automatically, means
>> packages go to main faster.
>
> Good points.
>
> I'd be happy to contribute md5sums - but this is my first bugreport
> against lintian so I need a bit of hand holding, I guess: Is that path
> perhaps in some git, or how do I contribute?
>

It is the path, within the lintian source, of the non-distributable-files
data file.

Usually I use the attached script. Use it like this:

lintiannonfree "non distributable from playboy" "see bug #666666" Pere.pdf
078c12e4cdd424b6927e6d281a7284f0 ~~ 1b1b8e872119980560bf98d90784ac570d9d1053 ~~ c92fc85669c31db6547b8cb73dcef6c9def421f4623e6b69099a98b3391222fa ~~ Pere.pdf ~~ non distributable from playboy ~~ see bug #666666

>>> Filename typically includes "lena" or "lenna" in its stem.
>>>
>>> A simple check could be scan for /\blenn?a\b/ in filename, and then
>>> check with "file" if content is a graphics file.
>>
>> Yes, I will implement it, but it will be a wild guess, not an autoreject.
>
> Right, I had no higher hopes than that (as you can see above I even
> ruled out md5sums, didn't think both could be done in parallel)

Yes, both can be done in parallel.

>
>>> More reliable check might involve serializing hits of above loose check
>>> in some deterministic manner, compute md5 from that, and compare against
>>> a blacklist.
>>
>> No, that pulls in too many deps.
>
> Ok. Wasn't sure if other checks already did similar. Was just throwing
> ideas :-)

Thanks for your work.

>
> - Jonas
>
> --
> * Jonas Smedegaard - idealist & Internet-arkitekt
> * Tlf.: +45 40843136 Website: http://dr.jones.dk/
>
> [x] quote me freely [ ] ask before reusing [ ] keep private
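For illustration only, the loose check discussed in the thread (match /\blenn?a\b/ against the filename, then confirm the content is a graphics file) could look roughly like the Python sketch below. It is not lintian code. The function names are hypothetical, the case-insensitive flag is an added assumption, and a small table of magic bytes stands in for running "file".

```python
import os
import re

# Pattern quoted in the thread. IGNORECASE is an assumption added here.
# Note that \b treats "_" as a word character, so "lena_std.tif" would
# NOT match, while "lena.png" and "test-lenna.bmp" would.
LENA_RE = re.compile(r"\blenn?a\b", re.IGNORECASE)

# Stand-in for the "file" utility: leading magic bytes of a few common
# image formats (PNG, JPEG, GIF, BMP, little- and big-endian TIFF).
IMAGE_MAGIC = (
    b"\x89PNG\r\n\x1a\n",
    b"\xff\xd8\xff",
    b"GIF87a",
    b"GIF89a",
    b"BM",
    b"II*\x00",
    b"MM\x00*",
)


def looks_like_image(header: bytes) -> bool:
    """Return True if the first bytes match a known image signature."""
    return header.startswith(IMAGE_MAGIC)


def is_candidate(path: str, header: bytes) -> bool:
    """Loose check: filename matches the pattern AND content is an image.

    A hit is only a hint for a human to look at, not grounds for an
    automatic reject.
    """
    name = os.path.basename(path)
    return bool(LENA_RE.search(name)) and looks_like_image(header)
```

Because a hit here is only a guess, it would pair with the exact checksum list rather than replace it.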
Attachment:
lintiannonfree
Description: Binary data
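
The attachment itself is not reproduced here. As a rough sketch only, a line with the same shape as the sample output in the message could be produced as follows in Python. The field order (md5, sha1, sha256, filename, description, reason) is inferred from the digest lengths in that sample, not taken from lintian documentation, and the function name cruft_line is hypothetical.

```python
import hashlib


def cruft_line(data: bytes, filename: str, description: str, reason: str) -> str:
    """Build one ' ~~ '-separated entry in the shape shown in the message.

    Assumed field order (inferred from the sample output, not confirmed):
    md5 ~~ sha1 ~~ sha256 ~~ filename ~~ description ~~ reason
    """
    digests = [
        hashlib.md5(data).hexdigest(),
        hashlib.sha1(data).hexdigest(),
        hashlib.sha256(data).hexdigest(),
    ]
    return " ~~ ".join(digests + [filename, description, reason])


if __name__ == "__main__":
    import sys

    # Usage mirroring the message: description, reason, then the file.
    desc, why, path = sys.argv[1:4]
    with open(path, "rb") as fh:
        print(cruft_line(fh.read(), path, desc, why))
```

Since the digests are computed over the exact bytes, a re-encoded or resized copy of the image would need its own entry.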