On Wednesday, July 08 2020, Felix Lechner wrote:

> Hi Sergio,

Hey Felix,

> On Wed, Jun 10, 2020 at 9:57 AM Sergio Durigan Junior
> <sergiodj@debian.org> wrote:
>>
>> after calling Text::ParseWords::parse_line, we check to
>> see if the first package name has a comma as the last char.  If it does,
>> then we assume that there will be at least one other package name
>> listed, and advance an index.  When we reach a package name whose last
>> char is not a comma, we then assume that the next field is the manpage
>> section number.
>
> Something in that patch is not quite working. I previously added a
> safeguard for an undefined value warning, but that was not enough:
>
>     https://salsa.debian.org/lintian/lintian/-/commit/d3c64d8ab40de6e38c96334e2515550df1957a4a
>
> In an archive-wide run, the modified patch still produced the warnings
> below. I show the complete list for the record, and not to intimidate
> anyone. It's no big deal.
>
> You may want to check out kde-dev-scripts, which generated a lot of warnings.

Ouch, thanks for the report, and sorry about the breakage.  My Perl-foo
is very limited.

So, I think I found the problem, and I have a possible solution.

Apparently, some manpages have malformed .TH headers, and Perl's
Text::ParseWords::parse_line doesn't cope well with them.  For example,
kde-dev-scripts's /usr/share/man/ca/man1/create_makefiles.1.gz file has:

  .TH "\FBCREATE_MAKEFILES\" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"

Not pretty, huh?  If we write a simple Perl program to try to parse this
line:

  use Text::ParseWords;

  @words = parse_line('\s+', 0, q{.TH "\FBCREATE_MAKEFILES\" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"});
  $i = 0;
  foreach (@words) {
      print "$i: <$_>\n";
      $i++;
  }

and run it, you will notice that it doesn't return anything!  The
backslash before the closing quote of the first field escapes that quote,
so the quotes in the line are no longer balanced.  Now, if we tweak the
line a little bit, by removing some of the backslashes for example, we
start getting somewhere:

  ...
  @words = parse_line('\s+', 0, q{.TH "FBCREATE_MAKEFILES" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"});
  ...

Now run it:

  $ perl parseline.pl
  0: <.TH>
  1: <FBCREATE_MAKEFILES>
  2: <1>
  3: <8 de mar(,c del 2003>
  4: <[FIXME: source]>
  5: <[FIXME: manual]>

So yeah, there's a problem here.  I honestly don't feel like spending
too much time investigating Perl's internals, so I think it's possible
to detect when parse_line failed and act accordingly.

I'm attaching a patch that does just that, and prevents the
warnings/failures from happening.  The idea is to check that the
@th_fields array has more than one element ($#th_fields > 0), and only
perform the checks if it does.  I also took the liberty of removing the
// EMPTY part, because it shouldn't be necessary anymore.

What do you think?

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/

diff --git a/checks/documentation/manual.pm b/checks/documentation/manual.pm
index b30bf6081..350dee927 100644
--- a/checks/documentation/manual.pm
+++ b/checks/documentation/manual.pm
@@ -297,21 +297,23 @@ sub files {
             next if $line =~ /^\.\\\"/; # comments .\"
             if ($line =~ /^\.TH\s/) { # header
                 my @th_fields= Text::ParseWords::parse_line('\s+', 0, $line);
-                my $pkgname_idx = 1;
-                # Iterate over possible package names.  If there is
-                # more than one, they will be separated by a comma and
-                # a whitespace.  In case we find the comma, we advance
-                # $pkgname_idx.
-                while ((substr($th_fields[$pkgname_idx], -1) // EMPTY) eq ','){
-                    $pkgname_idx++;
-                }
-                # We're now at the last package, so we should be able
-                # to obtain the manpage section number by incrementing
-                # 1 to the index.
-                my $th_section = $th_fields[++$pkgname_idx];
-                if ($th_section && (lc($fn_section) ne lc($th_section))) {
-                    $self->tag('wrong-manual-section',
-                        "$file:$lc $fn_section != $th_section");
+                if ($#th_fields > 0) {
+                    my $pkgname_idx = 1;
+                    # Iterate over possible package names.  If there is
+                    # more than one, they will be separated by a comma and
+                    # a whitespace.  In case we find the comma, we advance
+                    # $pkgname_idx.
+                    while ((substr($th_fields[$pkgname_idx], -1)) eq ','){
+                        $pkgname_idx++;
+                    }
+                    # We're now at the last package, so we should be able
+                    # to obtain the manpage section number by incrementing
+                    # 1 to the index.
+                    my $th_section = $th_fields[++$pkgname_idx];
+                    if ($th_section && (lc($fn_section) ne lc($th_section))) {
+                        $self->tag('wrong-manual-section',
+                            "$file:$lc $fn_section != $th_section");
+                    }
                 }
             }
             if ( ($line =~ m,(/usr/(dict|doc|etc|info|man|adm|preserve)/),)
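
For anyone who wants to try the guard outside of Lintian, here is a
minimal standalone sketch of the same idea.  The th_section() helper is
hypothetical (it is not part of Lintian or of the patch above), and it
adds a bounds check to the comma loop that the patch does not have, so
treat it as an illustration under those assumptions and not as the
actual check:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords ();

# Hypothetical helper, not part of Lintian: return the section field of
# a .TH line, or undef when parse_line could not split the line.
sub th_section {
    my ($line) = @_;

    my @th_fields = Text::ParseWords::parse_line('\s+', 0, $line);

    # parse_line returns an empty list when the quotes are unbalanced.
    # We need at least ".TH" plus one name before looking any further.
    return unless $#th_fields > 0;

    my $pkgname_idx = 1;

    # Skip over comma-terminated names.  Unlike the patch, this loop
    # also stops at the last field (an assumption made for this sketch).
    while ($pkgname_idx < $#th_fields
        && ($th_fields[$pkgname_idx] // '') =~ /,\z/) {
        $pkgname_idx++;
    }

    # The field after the last name should be the section number.
    return $th_fields[$pkgname_idx + 1];
}

# The malformed header from kde-dev-scripts, and the tweaked variant.
my $bad  = q{.TH "\FBCREATE_MAKEFILES\" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"};
my $good = q{.TH "FBCREATE_MAKEFILES" "1" "8 de mar\(,c del 2003" "[FIXME: source]" "[FIXME: manual]"};

for my $line ($bad, $good) {
    my $section = th_section($line);
    print defined $section ? "section: $section\n" : "could not parse\n";
}
```

With the malformed header the helper gives up quietly instead of
indexing into an empty array, which is all the patch is trying to
achieve.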