On Wednesday, July 08 2020, Felix Lechner wrote:
> Hi Sergio,
Hey Felix,
> On Wed, Jun 10, 2020 at 9:57 AM Sergio Durigan Junior
> <sergiodj@debian.org> wrote:
>>
>> after calling Text::ParseWords::parse_line, we check to
>> see if the first package name has a comma as the last char. If it does,
>> then we assume that there will be at least one other package name
>> listed, and advance an index. When we reach a package name whose last
>> char is not a comma, we then assume that the next field is the manpage
>> section number.
>
> Something in that patch is not quite working. I previously added a
> safeguard for an undefined value warning, but that was not enough:
>
> https://salsa.debian.org/lintian/lintian/-/commit/d3c64d8ab40de6e38c96334e2515550df1957a4a
>
> In an archive-wide run, the modified patch still produced the warnings
> below. I show the complete list for the record, and not to intimidate
> anyone. It's no big deal.
>
> You may want to check out kde-dev-scripts, which generated a lot of warnings.
Ouch, thanks for the report, and sorry about the breakage. My Perl-foo
is very limited.
So, I think I found the problem, and I have a possible solution.
Apparently, some manpages have malformed .TH headers, and Perl's
Text::ParseWords::parse_line doesn't cope well with them. For example,
kde-dev-scripts's /usr/share/man/ca/man1/create_makefiles.1.gz file has:
.TH "\FBCREATE_MAKEFILES\" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"
Not pretty, huh? If we make simple Perl program to try to parse this
line:
use Text::ParseWords;
@words = parse_line('\s+', 0, q{.TH "\FBCREATE_MAKEFILES\" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"});
$i = 0;
foreach (@words) {
print "$i: <$_>\n";
$i++;
}
and run it, you will noticed that it doesn't return anything! Now, if
we tweak the line a little bit, by removing some of the backslashes for
example, we start getting somewhere:
...
@words = parse_line('\s+', 0, q{.TH "FBCREATE_MAKEFILES" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"});
...
Now run it:
$ perl parseline.pl
0: <.TH>
1: <FBCREATE_MAKEFILES>
2: <1>
3: <8 de mar(,c del 2003>
4: <[FIXME: source]>
5: <[FIXME: manual]>
So yeah, there's a problem here. I honestly don't feel like spending
too much time investigating Perl's internals, so I think it's possible
to detect when parse_line failed and act accordingly.
I'm attaching a patch that does just that, and prevents the
warnings/failures from happening. The idea is to check whether the size
of the @th_fields array is bigger than 0, and just perform the checks if
they are. I also took the liberty to remove the // EMPTY part, because
it shouldn't be necessary anymore.
What do you think?
--
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/
diff --git a/checks/documentation/manual.pm b/checks/documentation/manual.pm
index b30bf6081..350dee927 100644
--- a/checks/documentation/manual.pm
+++ b/checks/documentation/manual.pm
@@ -297,21 +297,23 @@ sub files {
next if $line =~ /^\.\\\"/; # comments .\"
if ($line =~ /^\.TH\s/) { # header
my @th_fields= Text::ParseWords::parse_line('\s+', 0, $line);
- my $pkgname_idx = 1;
- # Iterate over possible package names. If there is
- # more than one, they will be separated by a comma and
- # a whitespace. In case we find the comma, we advance
- # $pkgname_idx.
- while ((substr($th_fields[$pkgname_idx], -1) // EMPTY) eq ','){
- $pkgname_idx++;
- }
- # We're now at the last package, so we should be able
- # to obtain the manpage section number by incrementing
- # 1 to the index.
- my $th_section = $th_fields[++$pkgname_idx];
- if ($th_section && (lc($fn_section) ne lc($th_section))) {
- $self->tag('wrong-manual-section',
- "$file:$lc $fn_section != $th_section");
+ if ($#th_fields > 0) {
+ my $pkgname_idx = 1;
+ # Iterate over possible package names. If there is
+ # more than one, they will be separated by a comma and
+ # a whitespace. In case we find the comma, we advance
+ # $pkgname_idx.
+ while ((substr($th_fields[$pkgname_idx], -1)) eq ','){
+ $pkgname_idx++;
+ }
+ # We're now at the last package, so we should be able
+ # to obtain the manpage section number by incrementing
+ # 1 to the index.
+ my $th_section = $th_fields[++$pkgname_idx];
+ if ($th_section && (lc($fn_section) ne lc($th_section))) {
+ $self->tag('wrong-manual-section',
+ "$file:$lc $fn_section != $th_section");
+ }
}
}
if ( ($line =~ m,(/usr/(dict|doc|etc|info|man|adm|preserve)/),)
Attachment:
signature.asc
Description: PGP signature