
Bug#685563: www.debian.org: Comments in DPN wml sources appear in RSS feed



Package: www.debian.org
Severity: normal
User: www.debian.org@packages.debian.org
Usertags: script news

Dear Webmasters,
tonight I noticed that the RSS feed of the new DPN (2012/16) contained some
comments from the wml source, so I started looking at the dwn-to-rdf.pl script.
I am not a Perl expert, but I came up with a small workaround that makes the
script ignore lines starting with a '#' character.
I did not test the patch thoroughly, but a run against the same index.wml,
with the comments still in place, produced a dwn.en.rdf that looked clean to me.
As suggested by taffit in #debian-www, I removed the comments from the DPN
source file, and I am now sending the patch here as a bug report.
I am almost sure that this is not the best solution, but it may be a
start.
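
To show the idea in isolation, here is a minimal standalone sketch. It is not
taken from dwn-to-rdf.pl: the helper name strip_wml_comments and the sample
lines are made up for illustration, and it only assumes that a comment is a
line whose very first character is '#'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of dwn-to-rdf.pl): return only the lines
# that do not begin with '#', i.e. drop comment lines from the input.
sub strip_wml_comments {
    my @lines = @_;
    return grep { !/^#/ } @lines;
}

# Made-up sample input, loosely resembling a fragment of a DPN index.wml.
my @input = (
    "# this comment should not reach the feed\n",
    "<p><strong>Some headline.</strong><br />\n",
    "Body text with a # in the middle stays untouched.\n",
    "#another comment\n",
);

# Only the two non-comment lines are printed.
my @clean = strip_wml_comments(@input);
print @clean;
```

Note that only a '#' in the first column counts here; a '#' after leading
whitespace or in the middle of a line is left alone, which matches what the
patch below does.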

Best regards,
Mark


-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 3.2.0-3-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

-- 
. ''`.  | GPG Public Key  : 0xCD542422 - Download it from http://is.gd/fOa7Vm
: :'  : | GPG Fingerprint : 0823 A40D F31B 67A8 5621 AD32 E293 A2EB CD54 2422
`. `'`  | Powered by Debian GNU/Linux, http://www.debian.org
  `-    | Try not. Do, or do not. There is no try. - Master Yoda, TESB.
Index: dwn-to-rdf.pl
===================================================================
RCS file: /cvs/webwml/webwml/english/News/weekly/dwn-to-rdf.pl,v
retrieving revision 1.19
diff -u -u -r1.19 dwn-to-rdf.pl
--- dwn-to-rdf.pl	16 Apr 2011 23:50:00 -0000	1.19
+++ dwn-to-rdf.pl	21 Aug 2012 21:56:04 -0000
@@ -168,7 +168,9 @@
     while (<F>) {
 	# prevent double utf-8 encode by XML::RSS 
 	$_ = decode_utf8($_) if ($charset eq 'utf-8') ;
-	if (/^<p><strong>(.*)<\/strong>(?:<br \/>)?\s*(.*)/) {
+	if (/^#/) {	# wml comment line: keep it out of the feed
+	}
+	elsif (/^<p><strong>(.*)<\/strong>(?:<br \/>)?\s*(.*)/) {
 	    $headline = $1;
 	    $body = $2."\n";
 	    chop ($headline) if ($headline =~ /\.$/);

Attachment: signature.asc
Description: Digital signature

