On Monday 30 June 2008, you wrote:
> Raphael Geissert <atomo64@gmail.com> writes:
> > tag 424746 patch
> > thanks
> >
> > On Wednesday 16 May 2007, Justin Pryzby wrote:
> >> Package: lintian
> >> Version: 1.23.28
> >> Severity: wishlist
> >>
> >> apt-cache dumpavail |grep -wice 'the *the'
> >
> > Attached is a patch adding such check.
>
> Thanks!
>
> Unless someone objects, I'm inclined to make this info-level instead of a
> warning, since there are valid English constructs where this is a false
> positive and it's a fairly minor bug.

What kind of English constructs use duplicated words and are likely to appear
in a package description?
I believe there are none (but I'm always open to other opinions :)

>
> I think also requiring \s instead of \W on either end of the repeated
> words would be safer; that way we wouldn't warn on "foo foo", and the
> general rule of thumb I've been applying with description checks is that
> if they're quoting it, it's probably intentional.

If that's the case, please refer to the attached patch (it applies over the
previous one).

Cheers,
-- 
Atomo64 - Raphael

Please avoid sending me Word, PowerPoint or Excel attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
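
To make the effect of the follow-up patch easier to see, here is a rough
Python rendition of its two regular expressions. This is only a sketch:
lintian itself is written in Perl, and the names QUOTED, DUPLICATED and
find_duplicated_word are made up for this illustration. The quoted sample
lines are the ones the patch adds to the test suite.

```python
import re

# Rendition of the Perl substitution s,(\"|\').*(\1),, from the patch:
# drop the first quoted span (greedy, same quote character at both ends).
QUOTED = re.compile(r"""(["']).*\1""")

# Rendition of m,((?:\W|^)(\w+)\s+(\2)(?:\W|$)),i from the existing check.
DUPLICATED = re.compile(r"((?:\W|^)(\w+)\s+(\2)(?:\W|$))", re.IGNORECASE)


def find_duplicated_word(line):
    """Return the matched text for a repeated word outside quotes, else None."""
    # Hypothetical helper: strip one quoted span, then run the duplicate check.
    without_quotes = QUOTED.sub("", line, count=1)
    match = DUPLICATED.search(without_quotes)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = (
        'And the old man said "he he is the one!"',
        '"No, I am am not", he replied',
        "The the package",
    )
    for sample in samples:
        print(repr(find_duplicated_word(sample)))
```

In this sketch the two quoted lines yield no match, because the quoted span
is removed before the duplicate check runs, while the unquoted "The the" is
still caught. As in the patch, the substitution is greedy and is applied
once per line.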
diff --git a/checks/description b/checks/description
index b06196c..0d5d550 100644
--- a/checks/description
+++ b/checks/description
@@ -116,7 +116,9 @@ while (<IN>) {
tag "description-contains-homepage";
}
- if (m,((?:\W|^)(\w+)\s+(\2)(?:\W|$)),i) {
+ my $wo_quotes = $_;
+ $wo_quotes =~ s,(\"|\').*(\1),,;
+ if ($wo_quotes =~ m,((?:\W|^)(\w+)\s+(\2)(?:\W|$)),i) {
tag "description-contains-duplicated-word", "$1";
}
diff --git a/testset/description/debian/control b/testset/description/debian/control
index bf24c8c..6ce5767 100644
--- a/testset/description/debian/control
+++ b/testset/description/debian/control
@@ -32,6 +32,9 @@ Description:
. and please avoid control statements in the long description.
The line in an extended description should be less than 80 characters, otherwise you'll get
a Lintian warning.
+ .
+ And the old man said "he he is the one!"
+ "No, I am am not", he replied
Package: description-baz
Architecture: all
diff --git a/testset/tags.description b/testset/tags.description
index 87eae2e..9ef6bf0 100644
--- a/testset/tags.description
+++ b/testset/tags.description
@@ -21,5 +21,5 @@ W: description-foo: description-starts-with-leading-spaces
W: description-foo: possible-unindented-list-in-extended-description
W: description: changelog-not-compressed-with-max-compression changelog.Debian.gz
W: description: debian-changelog-file-contains-obsolete-user-emacs-settings
-W: description: description-contains-duplicated-word The the
+W: description: description-contains-duplicated-word The the
W: description: description-synopsis-might-not-be-phrased-properly