[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#795641: lintian: [PATCH] fix common license false positives in new-style copyright files



Control: tags -1 - moreinfo

On 2015-08-16 00:13, Rafael Kitover wrote:
> Package: lintian
> Version: 2.5.35
> Severity: normal
> 
> This patch fixes lack-of-common-license-reference false positives in
> new-style copyright files when a license refers to one of these common
> licenses.
> 
> It includes a test for this against all common licenses.
> 
> All testsuite tests pass.
> 
> The patch is against the debcheckout of lintian.
> 
> [...]
> 

Hi,

Thanks for looking in to this.

I do have some concerns with the implementation, see below interleaved.

> 0001-fix-common-lic.-false-pos.-in-new-style-copyright.patch
> 
> 
>>From 441f44c5be0fe70d9a86b3ee0cec49430b0c2a9d Mon Sep 17 00:00:00 2001
> From: Rafael Kitover <rkitover@gmail.com>
> Date: Sat, 15 Aug 2015 17:50:34 -0400
> Subject: [PATCH] fix common lic. false pos. in new-style copyright
> 
> Fix false positives for lack of common license references in new-style
> copyright files when a license refers to another license, by trying to
> parse the file and then checking both the names of the licenses and the
> texts.
> 
> Add new test for references to common licenses as well.
> 
> The test suite passes with these changes.
> ---
>  [...]
> 
> diff --git a/checks/copyright-file.pm b/checks/copyright-file.pm
> index c6e35ef..09b664e 100644
> --- a/checks/copyright-file.pm
> +++ b/checks/copyright-file.pm
> [...]
> @@ -373,6 +426,45 @@ sub check_cross_link {
>      return;
>  }
>  
> +# Checks the name and text of every license in the file against given name and
> +# text check coderefs, if the file is in the new format, if the file is in the
> +# old format only runs the text coderef against the whole file.
> +sub check_names_texts {

This is called multiple times.  Each time it opens and parses the
copyright file at least once each.

I would like to see this done smarter and less wastefully.  Please note
that we have some very large copyright files or/and source packages
producing many binaries with a copyright file in each of them.

Have you looked at the performance of this check on some packages?
Lintian has a few built-in options to help you here:

$ lintian --help=extended | grep perf
    --perf-debug              turn on performance debugging
    --perf-output X           send performance logging to file (...)


> +    my ($name_check, $text_check, $file) = @_;
> +
> +    local $@;
> +    eval {
> +        foreach my $paragraph (read_dpkg_control($file)) {
                                  ^^^^^^^^^^^^^^^^^^^^^^^^

(First read of the file.)

> +            next
> +              unless (keys %$paragraph == 1)
> +              && ((keys %$paragraph)[0] =~ /^license$/i);
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Please use "exists($paragraph->{'license'})" instead.

> +
> +            my ($license_name, $license_text)
> +              = (values %$paragraph)[0] =~ /^([^\r\n]+)\r?\n(.*)\z/s;
> +
> +            my $matches = do {
> +                local $_ = $license_name || '';
> +                $name_check->($_);

$name_check is always a single regex.  Consider passing it as a qr// and
do a normal regex check here instead of calling a separate sub.  This
would also avoid the "local $_", which is somewhat expensive.

> +              }
> +              && do {
> +                local $_ = $license_text || '';
> +                $text_check->($_);
> +              };

If we could avoid the subroutine + local $_; here too that would be nice
as well.

> +
> +            die 'MATCH' if $matches;
> +        }
> +    };
> +    if ($@)
> +    { # match or parse error: copyright not in new format, just check text
> +        return 1 if $@ =~ /^MATCH/;
> +
> +        local $_ = slurp_entire_file($file);
                      ^^^^^^^^^^^^^^^^^^^^^^^^
(Second read of the file.)

> +        return $text_check->($_);
> +    }
> +
> +    return; # did not match anything
> +}
> +
>  1;
>  
>  # Local Variables:
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/control.in b/t/tests/copyright-file-non-common-license/debian/debian/control.in
> new file mode 100644
> index 0000000..e80822d
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/control.in
> @@ -0,0 +1,94 @@
> +Source: {$source}
> +Priority: extra
> +Section: {$section}
> +Maintainer: {$author}
> +Standards-Version: {$standards_version}
> +Build-Depends: debhelper (>= 9)
> +
> +Package: copyright-mentions-apache
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for Apache
> + Tests against common license false positive for Apache.
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> +
> +Package: copyright-mentions-apache2
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for Apache (2)
> + Tests against common license false positive for Apache (2).
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> +
> +Package: copyright-mentions-apache3
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for Apache (3)
> + Tests against common license false positive for Apache (3).
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> +
> +Package: copyright-mentions-gfdl
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for gfdl
> + Tests against common license false positive for gfdl.
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> +
> +Package: copyright-mentions-gpl
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for gpl
> + Tests against common license false positive for gpl.
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> +
> +Package: copyright-mentions-lgpl
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for lgpl
> + Tests against common license false positive for lgpl.
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> +
> +Package: copyright-mentions-lgpl2
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for lgpl (2)
> + Tests against common license false positive for lgpl (2).
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> +
> +Package: copyright-mentions-perl
> +Architecture: all
> +Depends: $\{shlibs:Depends\}, $\{misc:Depends\}
> +Description: checks against common license false positive for perl
> + Tests against common license false positive for perl.
> + .
> + This is a test package designed to exercise some feature or tag of
> + Lintian.  It is part of the Lintian test suite and may do very odd
> + things.  It should not be installed like a regular package.  It may
> + be an empty package.
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache.copyright
> new file mode 100644
> index 0000000..b4fe499
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache.copyright
> @@ -0,0 +1,13 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum Apache License , Version 2.0 lorem ipsum
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache2.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache2.copyright
> new file mode 100644
> index 0000000..035ee22
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache2.copyright
> @@ -0,0 +1,13 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum Apache License  Version 2.0 lorem ipsum
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache3.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache3.copyright
> new file mode 100644
> index 0000000..dab0d47
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-apache3.copyright
> @@ -0,0 +1,13 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum Apache-2 License lorem ipsum
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-gfdl.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-gfdl.copyright
> new file mode 100644
> index 0000000..5a8f46b
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-gfdl.copyright
> @@ -0,0 +1,13 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum GNU Free Documentation License (GFDL) lorem ipsum
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-gpl.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-gpl.copyright
> new file mode 100644
> index 0000000..248debf
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-gpl.copyright
> @@ -0,0 +1,14 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum GNU General Public License (GPL) applies to the changes,
> + .
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-lgpl.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-lgpl.copyright
> new file mode 100644
> index 0000000..89c5e79
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-lgpl.copyright
> @@ -0,0 +1,13 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum GNU Lesser General Public License (LGPL) lorem ipsum
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-lgpl2.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-lgpl2.copyright
> new file mode 100644
> index 0000000..291c0e6
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-lgpl2.copyright
> @@ -0,0 +1,13 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum GNU Library General Public License (LGPL) lorem ipsum
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-perl.copyright b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-perl.copyright
> new file mode 100644
> index 0000000..b2c896a
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/debian/debian/copyright-mentions-perl.copyright
> @@ -0,0 +1,13 @@
> +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
> +Upstream-Name: lintian
> +Upstream-Contact: Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +Source: http://git.debian.org/?p=lintian/lintian.git
> +
> +Files: *
> +Copyright: 2015 Lintian Maintainers <debian-lint-maint@lists.debian.org>
> +License: Mentions-Other-License
> +
> +License: Mentions-Other-License
> + lorem ipsum
> + lorem ipsum under the same terms as Perl itself lorem ipsum
> + lorem ipsum
> diff --git a/t/tests/copyright-file-non-common-license/desc b/t/tests/copyright-file-non-common-license/desc
> new file mode 100644
> index 0000000..c652192
> --- /dev/null
> +++ b/t/tests/copyright-file-non-common-license/desc
> @@ -0,0 +1,12 @@
> +Testname: copyright-file-non-common-license
> +Sequence: 6000
> +Version: 1.0
> +Description: Test for false positive for a common license
> +Skeleton: pedantic
> +Options: -IE --pedantic
> +Test-Against:
> + copyright-should-refer-to-common-license-file-for-gpl
> + copyright-should-refer-to-common-license-file-for-gfdl
> + copyright-should-refer-to-common-license-file-for-lgpl
> + copyright-should-refer-to-common-license-file-for-apache-2
> + copyright-file-lacks-pointer-to-perl-license
> diff --git a/t/tests/copyright-file-non-common-license/tags b/t/tests/copyright-file-non-common-license/tags
> new file mode 100644
> index 0000000..e69de29
> -- 2.1.4
> 


Reply to: