On Monday 30 June 2008, you wrote:
> Raphael Geissert <atomo64@gmail.com> writes:
> > tag 424746 patch
> > thanks
> >
> > On Wednesday 16 May 2007, Justin Pryzby wrote:
> >> Package: lintian
> >> Version: 1.23.28
> >> Severity: wishlist
> >>
> >> apt-cache dumpavail |grep -wice 'the *the'
> >
> > Attached is a patch adding such check.
>
> Thanks!
>
> Unless someone objects, I'm inclined to make this info-level instead of a
> warning, since there are valid English constructs where this is a false
> positive and it's a fairly minor bug.

What kind of English constructs use duplicated words and are likely to appear
in a package description?
I believe there are none (but I'm always open to other opinions :)

>
> I think also requiring \s instead of \W on either end of the repeated
> words would be safer; that way we wouldn't warn on "foo foo", and the
> general rule of thumb I've been applying with description checks is that
> if they're quoting it, it's probably intentional.

If that's the case, please refer to the attached patch (it applies on top of
the previous one).

Cheers,
-- 
Atomo64 - Raphael

Please avoid sending me Word, PowerPoint or Excel attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
diff --git a/checks/description b/checks/description
index b06196c..0d5d550 100644
--- a/checks/description
+++ b/checks/description
@@ -116,7 +116,9 @@ while (<IN>) {
         tag "description-contains-homepage";
     }
 
-    if (m,((?:\W|^)(\w+)\s+(\2)(?:\W|$)),i) {
+    my $wo_quotes = $_;
+    $wo_quotes =~ s,(\"|\').*(\1),,;
+    if ($wo_quotes =~ m,((?:\W|^)(\w+)\s+(\2)(?:\W|$)),i) {
         tag "description-contains-duplicated-word", "$1";
     }
 
diff --git a/testset/description/debian/control b/testset/description/debian/control
index bf24c8c..6ce5767 100644
--- a/testset/description/debian/control
+++ b/testset/description/debian/control
@@ -32,6 +32,9 @@ Description:
  .
  and please avoid control statements in the long description.
  The line in an extended description should be less than 80 characters, otherwise you'll get a Lintian warning.
+ .
+ And the old man said "he he is the one!"
+ "No, I am am not", he replied
 
 Package: description-baz
 Architecture: all
diff --git a/testset/tags.description b/testset/tags.description
index 87eae2e..9ef6bf0 100644
--- a/testset/tags.description
+++ b/testset/tags.description
@@ -21,5 +21,5 @@ W: description-foo: description-starts-with-leading-spaces
 W: description-foo: possible-unindented-list-in-extended-description
 W: description: changelog-not-compressed-with-max-compression changelog.Debian.gz
 W: description: debian-changelog-file-contains-obsolete-user-emacs-settings
-W: description: description-contains-duplicated-word The the
+W: description: description-contains-duplicated-word The the
 W: description: description-synopsis-might-not-be-phrased-properly
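
To make the effect of the quote-stripping step in the patch easier to see, here is a minimal sketch that ports its two regular expressions to Python. It is only an illustration, not Lintian's code: the names `QUOTED`, `DUPLICATE` and `find_duplicated_word` are hypothetical, and the sketch assumes Python's `re` module treats these particular patterns the same way Perl does.

```python
import re

# Port of the patch's  s,(\"|\').*(\1),,  substitution (hypothetical name).
# It matches from an opening quote to the LAST matching quote on the line.
QUOTED = re.compile(r"""(["']).*\1""")

# Port of the patch's  m,((?:\W|^)(\w+)\s+(\2)(?:\W|$)),i  match (hypothetical name).
DUPLICATE = re.compile(r"((?:\W|^)(\w+)\s+(\2)(?:\W|$))", re.IGNORECASE)


def find_duplicated_word(line):
    """Return the matched text for a repeated word outside quotes, or None."""
    # count=1 mirrors the Perl substitution, which has no /g flag.
    without_quotes = QUOTED.sub("", line, count=1)
    match = DUPLICATE.search(without_quotes)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = [
        "This is the the package",
        'And the old man said "he he is the one!"',
        '"No, I am am not", he replied',
        'a "quoted" word word "again"',
    ]
    for sample in samples:
        print(repr(sample), "->", repr(find_duplicated_word(sample)))
```

With this port, the two new lines added to the test control file are not reported, since their repeated words sit inside quotes. One thing the sketch suggests is worth checking in the Perl version too: because `.*` is greedy, a line with two separate quoted spans loses everything between the first and the last quote, so a genuine duplicate between them (the last sample) would go unreported.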