
Tools to check links



Hello,

I mainly contribute to Debian through the d-l10n-fr team. While talking
with David Prévot on the d-l10n-fr mailing list, I learned that there is
a bug in the broken link detection tool (frontend at
http://www-master.debian.org/build-logs/urlcheck/). The source code is
publicly available in a git repository
(http://anonscm.debian.org/gitweb/?p=debwww/cron.git;a=tree;f=urlcheck;hb=HEAD).
The scripts are written in Python and Perl.

Several Debian packages do the same job, so I wonder whether it would
be better to use packaged software instead of repairing Debian's
homemade scripts. Several programs seem to fit the requirements:
htcheck, linklint, linkchecker and w3c-linkchecker.

Each one has pros and cons. Here is a summary based on the
documentation each one provides. Consider it a quick tour: I may have
missed features or limitations.

htcheck:
	- stores statistics in a MySQL database
	- parses HTML served over HTTP (no HTTPS, FTP, etc.)
	- runs from the command line
	- written in C++

htcheck-php:
	- provides a web frontend with Apache, reading the MySQL database
	- written in PHP

linklint:
	- parses HTML served over HTTP (no HTTPS, FTP, etc.)
	- seems to support HTTP authentication
	- runs from the command line
	- outputs text or HTML, in one or several files; summary on stdout
	- written in Perl

linkchecker:
	- available protocols: HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet
	- user/password authentication for HTTP, FTP and Telnet
	- runs from the command line
	- outputs text, HTML, CSV, ...
	- written in Python

w3c-linkchecker (the binary is checklink):
	- user/password authentication
	- protocol: HTTP (perhaps FTP and NNTP?)
	- runs from the command line or as a CGI script
	- written in Perl
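
To make the protocol point above concrete, here is a minimal Python
sketch (not taken from any of these tools) that extracts the links of a
page and groups them by URL scheme. The names LinkCollector and
links_by_scheme and the sample HTML are my own assumptions for
illustration; the idea is only to show how many links an HTTP-only
checker would leave untested.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the targets of href/src attributes found in an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def links_by_scheme(html, base_url):
    """Return a dict mapping each URL scheme to its absolute links."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    grouped = defaultdict(list)
    for link in collector.links:
        grouped[urlsplit(link).scheme].append(link)
    return dict(grouped)


# Hypothetical sample page, only for illustration.
SAMPLE = """
<a href="/intro/about">About</a>
<a href="https://lists.debian.org/">Lists</a>
<a href="ftp://ftp.debian.org/debian/">FTP</a>
<a href="mailto:someone@example.org">Mail</a>
"""

grouped = links_by_scheme(SAMPLE, "http://www.debian.org/")
http_only = grouped.get("http", [])
skipped = sum(len(v) for k, v in grouped.items() if k != "http")
print(f"{len(http_only)} link(s) testable over plain HTTP, {skipped} skipped")
```

A real checker would then request each URL and record the result; the
sketch stops at the grouping step on purpose, so it needs no network
access.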


David Prévot explained to me that PHP is not installed on the server
and the team doesn't plan to add it, so htcheck-php is a no-go. That
leaves two options:
 - developing another frontend (htcheck-python, for example)
 - just considering the other tools.
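
For the first option, a Python frontend does not have to be large. The
sketch below is only an assumption about what a minimal one could look
like: it turns a list of (URL, status) pairs into a static HTML table.
The function name render_report and the data layout are hypothetical,
and reading the results from htcheck's MySQL database is deliberately
left out.

```python
from html import escape


def render_report(results):
    """Render (url, status) pairs as a simple HTML table."""
    rows = "\n".join(
        # Escape both fields so that odd URLs cannot break the markup.
        f"<tr><td>{escape(url)}</td><td>{escape(status)}</td></tr>"
        for url, status in results
    )
    return (
        "<table>\n"
        "<tr><th>URL</th><th>Status</th></tr>\n"
        f"{rows}\n"
        "</table>"
    )


# Hypothetical results, only for illustration.
report = render_report([
    ("http://www.debian.org/", "200 OK"),
    ("http://www.debian.org/missing?a=1&b=2", "404 Not Found"),
])
print(report)
```

A static page like this could be regenerated by cron after each check
run, so nothing beyond Python would be needed on the server.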


What do you think about it?

Thank you


(I subscribed to the debian-www mailing list, so you don't need to cc me.)
-- 
Print this message in A2 and in color at least 500 times!
Burn trees!!

-- sent from my coal-fired power plant
Stéphane

