Bug#459871: hyphen-used-as-minus-sign false positives
Package: lintian
Version: 1.23.42
Severity: minor
Tags: patch
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi
For non-English text, hyphen-used-as-minus-sign produces quite a lot of
false positives. For example, "má-li" and "Není-li" are correct Czech, but
both trigger hyphen-used-as-minus-sign, and there are many more such
words. The cause seems to be a non-ASCII character directly before the "-".
I debugged this against manpages-cs, trying to find a regexp that still
catches the common mistakes without flagging Czech words. I'm not sure the
result is ready to be merged, but for me it produces much better results.
The patch with my change is attached.
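The effect can be sketched outside lintian. The snippet below is a Python
approximation, not lintian's Perl code: it assumes each manpage line is
matched as undecoded UTF-8 bytes, so that \w covers ASCII word characters
only. The names TAIL, build, OLD, NEW and flagged are made up for this
illustration. Only the two character classes and the lookbehind come from
the attached patch.

```python
import re

# Lookbehind and capture group shared by both variants, taken from the
# context lines of the patch.
TAIL = rb"""(?<! \\s | \*\( | \(- | \w\' ) (--?\w+)"""


def build(char_class):
    """Compile the check with the given class for the character before '-'."""
    return re.compile(rb"^[^\.].*" + char_class + TAIL, re.VERBOSE)


OLD = build(rb"[^\w\\]")      # class removed by the patch
NEW = build(rb"[\s&\(:^]")    # class added by the patch


def flagged(pattern, line):
    """Return the flagged token, or None.

    The line is matched as UTF-8 bytes (an assumption), so the
    word-character class covers ASCII only.
    """
    m = pattern.search(line.encode("utf-8"))
    return m.group(1).decode("ascii") if m else None


if __name__ == "__main__":
    for text in ["má-li", "Není-li", "run foo -v for details"]:
        print(text, "| old:", flagged(OLD, text), "| new:", flagged(NEW, text))
```

Under that assumption, the last byte of "á" or "í" is not a word character,
so the old class accepts it as the character before the hyphen. The new
class only accepts whitespace and a few punctuation characters there, which
leaves the Czech words alone while an option such as " -v" is still caught.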
- --
Michal Čihař | http://cihar.com | http://blog.cihar.com
- -- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.23-1-686 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages lintian depends on:
ii binutils 2.18.1~cvs20080103-1 The GNU assembler, linker and bina
ii diffstat 1.45-2 produces graph of changes introduc
ii dpkg-dev 1.14.15 package building tools for Debian
ii file 4.21-4 Determines file type using "magic"
ii gettext 0.17-2 GNU Internationalization utilities
ii intltool-debian 0.35.0+20060710.1 Help i18n of RFC822 compliant conf
ii libparse-debianchan 1.1.1-1 parse Debian changelogs and output
ii liburi-perl 1.35.dfsg.1-1 Manipulates and accesses URI strin
ii man-db 2.5.0-4 on-line manual pager
ii perl [libdigest-md5 5.8.8-12 Larry Wall's Practical Extraction
lintian recommends no packages.
- -- no debconf information
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFHhIgZ3DVS6DbnVgQRAmyFAKChtd7kyVES5sCTgnTnMWITQiGSqwCdEFxi
SaBlsMKithFlpSDzmhjY6IU=
=nSDu
-----END PGP SIGNATURE-----
--- manpages.orig 2008-01-09 17:22:20.000000000 +0900
+++ manpages 2008-01-09 17:36:28.000000000 +0900
@@ -305,7 +305,7 @@
# beginning of a word, but don't generate false positives on \s-1
# (small font), \*(-- (pod2man long dash), or things like \h'-1'.
if ($line =~ /^[^\.].*
- [^\w\\]
+ [\s&\(:^]
(?<! \\s | \*\( | \(- | \w\' )
(--?\w+)/ox) {
$hc++;