On 04/26/12 03:26, Paul Wise wrote:
> On Thu, Apr 26, 2012 at 4:42 AM, Simon Kainz wrote:
>
>> after getting frustrated by broken links in some package descriptions, i
>> hacked up a script to check for broken links in the packages descriptions.
>>
>> (see http://simon.familiekainz.at/dropbox/errs.html, based on wheezy/amd64)
>> for an example. Is this worth investing some work into it? I'd gladly do,
>> and (mass)file some bugs and improve the script, if this is of some interest
>> for QA/Debian and not being done already.
>
> That sounds like a useful project to add to qa.d.o and the PTS, even
> more so if it could detect parked domains or other spammy stuff.
>
> Can you describe how it works? Does it check only the Homepage fields
> or also links in the Description and Vcs-* fields?
>

Well, currently I only check the "Homepage" fields. I thought about the
other links as well, but processing the Homepage entries seemed like a
reasonable first step.

Processing the other URLs is surely doable. I think I will restructure my
current scripts and make the whole thing more modular, so that it is easy
to add more locations to search for URLs. I also need to process source
packages, which I currently don't.

How would one incorporate the collected data into the PTS, the qa.d.o
website, and so on? Is there a documented/proper way? Or should I just
host the results on my own and have the PTS link to my data?

Regards,

Simon