Re: Several copyrights wrong in R packages
> On 2021-10-21 02:18, Charles Plessy wrote:
> > can routine-update grep the diff with the previous upstream release
> > using patterns such as license, author, copyright, (c), ©, etc, and emit
> > a warning if there is a match ? This is what I used to do by hand when
> > I had no time or interest to inspect the whole diff for other changes.
On Thu, Oct 21, 2021 at 11:15:39AM +0300, Andrius Merkys wrote:
>
> Maybe licensecheck could be of any use here?
Hi all,
I have a prototype.
Git allows an external diff command to be run, using the `GIT_EXTERNAL_DIFF`
environment variable. Here is a quote from its manual page:
> GIT_EXTERNAL_DIFF
>
> When the environment variable GIT_EXTERNAL_DIFF is set, the program
> named by it is called to generate diffs, and Git does not use its
> builtin diff machinery. For a path that is added, removed, or
> modified, GIT_EXTERNAL_DIFF is called with 7 parameters:
>
> path old-file old-hex old-mode new-file new-hex new-mode
So I wrote a small command that runs licensecheck on the old and new
files and diffs the results. I have to pipe them through sed because
at least one of the files is in a temporary folder, whose name is
included in the output of licensecheck.
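$ cat diff-licensecheck
#!/bin/bash
# Git passes: path old-file old-hex old-mode new-file new-hex new-mode
# Quote the parameters so that paths containing spaces do not break the command.
diff -u <(licensecheck "$2" | sed "s,$2,$1,") <(licensecheck "$5" | sed "s,$5,$1,")
# diff exits with 1 when the outputs differ; always report success to Git.
exit 0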
Then I created a test Git repository in which I committed one file
detected as under the Apache License, then replaced its contents with
something detected as under the BSD license in the next commit.
And voilà
$ GIT_EXTERNAL_DIFF=./diff-licensecheck git diff HEAD^
--- /dev/fd/63 2021-10-21 21:35:27.071663271 +0900
+++ /dev/fd/62 2021-10-21 21:35:27.071663271 +0900
@@ -1 +1 @@
-testfile: *No copyright* Apache License 2.0
+testfile: BSD 3-clause "New" or "Revised" License
I then tested it on the `upstream` branch of one of our source packages
and... licensecheck is really too slow...
I tested the alternative ninka and it is much faster, but might have
more false positives. For instance in r-cran-testthat, the NEWS.md
file pops up:
git diff 5e69a2ce025417258572a9322e4b720f8ab389a2^ 5e69a2ce025417258572a9322e4b720f8ab389a2
--- /dev/fd/63 2021-10-21 21:44:31.377404307 +0900
+++ /dev/fd/62 2021-10-21 21:44:31.377404307 +0900
@@ -1 +1 @@
-NEWS.md;SeeFile,SeeFile,SeeFile;3;3;0;1;2;1,1,IntelPart08,UNKNOWN,UNKNOWN,1
+NEWS.md;SeeFile,SeeFile;2;2;0;1;3;UNKNOWN,1,1,IntelPart08,UNKNOWN,UNKNOWN
Another thing is that new or deleted files obviously show up in the
diff, but it should be easy to suppress that by not running the diff
command if either the old or the new file is empty or /dev/null.
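
A minimal sketch of that idea, not tested on a real package: a wrapper
that calls the diff-licensecheck script shown earlier only when both
sides have content. The name diff-licensecheck-skip and the helper
functions are hypothetical. It relies on Git passing /dev/null as
old-file or new-file for added or deleted paths, and on
diff-licensecheck sitting in the current directory.

```shell
#!/bin/bash
# diff-licensecheck-skip: hypothetical wrapper around ./diff-licensecheck.
# Git calls it as: path old-file old-hex old-mode new-file new-hex new-mode

# Succeeds when the given file is /dev/null, missing, or empty.
is_blank() {
    [ "$1" = /dev/null ] || [ ! -s "$1" ]
}

# Runs the license comparison unless the old or the new file is blank,
# which is the case for added, deleted, or emptied paths.
run_unless_blank() {
    if is_blank "$2" || is_blank "$5"; then
        return 0
    fi
    ./diff-licensecheck "$@"
}

# Only act when Git supplied its 7 parameters.
if [ "$#" -eq 7 ]; then
    run_unless_blank "$@"
fi
```

It would be used in the same way as before:

$ GIT_EXTERNAL_DIFF=./diff-licensecheck-skip git diff HEAD^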
That is all from me today, now it is time to sleep!
Cheers,
Charles
--
Charles Plessy Nagahama, Yomitan, Okinawa, Japan
Debian Med packaging team http://www.debian.org/devel/debian-med
Tooting from work, https://mastodon.technology/@charles_plessy
Tooting from home, https://framapiaf.org/@charles_plessy