Re: (semi) automatic check for broken links in package descriptions

On 04/26/12 03:26, Paul Wise wrote:
> On Thu, Apr 26, 2012 at 4:42 AM, Simon Kainz wrote:
>> After getting frustrated by broken links in some package descriptions, I
>> hacked up a script to check for them (see
>> http://simon.familiekainz.at/dropbox/errs.html, based on wheezy/amd64, for
>> an example). Is this worth investing some work in? I'd gladly do so, file
>> (mass) bugs, and improve the script, if this is of interest to QA/Debian
>> and is not being done already.
> That sounds like a useful project to add to qa.d.o and the PTS, even
> more so if it could detect parked domains or other spammy stuff.
> Can you describe how it works? Does it check only the Homepage fields
> or also links in the Description and Vcs-* fields?
Some more detail: currently it is a mix of bash scripts, awk, and wget
--spider, checking whether I get anything other than 200 OK back from the
server. I process /var/lib/apt/lists/... but I think I should use
grep-aptavail to process fresh package files from a mirror instead.
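
To make the idea concrete, here is a rough sketch of the same approach in
Python. It is not the actual script (that one is bash/awk/wget), only an
illustration under some assumptions: the function names extract_links and
check_url are made up for this example, every line of a stanza is scanned
for http(s) URLs (so Homepage, Description and Vcs-* fields are all
covered), and a plain HEAD request stands in for wget --spider.

```python
import http.client
import re
import urllib.error
import urllib.request

# Assumption: anything that looks like an http(s) URL counts as a link.
URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def extract_links(packages_text):
    """Return (package, url) pairs found in Packages-style stanzas."""
    results = []
    # Stanzas are separated by blank lines.
    for stanza in re.split(r"\n\s*\n", packages_text.strip()):
        name = None
        urls = []
        for line in stanza.splitlines():
            if line.startswith("Package:"):
                name = line.split(":", 1)[1].strip()
            # Drop trailing punctuation that usually belongs to the sentence.
            urls.extend(u.rstrip(".,;") for u in URL_RE.findall(line))
        if name:
            # dict.fromkeys removes duplicates but keeps the original order.
            for url in dict.fromkeys(urls):
                results.append((name, url))
    return results


def check_url(url, timeout=10):
    """Return the HTTP status code, or None if the URL cannot be reached."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # DNS failure, refused connection, timeout, malformed URL, ...
        return None


def broken_links(packages_text):
    """Return (package, url, status) for every link that is not 200 OK."""
    report = []
    for name, url in extract_links(packages_text):
        status = check_url(url)
        if status != 200:
            report.append((name, url, status))
    return report
```

Feeding it the contents of a Packages file, e.g.
broken_links(open(path).read()) with path pointing at one of the files
under /var/lib/apt/lists/, would give the list of suspicious links. Some
servers reject HEAD requests, so a real checker would probably want to
retry with GET before reporting a link as broken.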


