
Re: LinkWalker



On Tuesday 08 January 2002 01:38, Russell Coker wrote:
> On Mon, 7 Jan 2002 23:31, Nathan Strom wrote:
> > > I have a nasty web spider with an agent name of
> > > "LinkWalker" downloading everything on my site (including
> > > .tgz files).  Does anyone know anything about it?
> >
> > It's apparently a link-validation robot operated by a
> > company called SevenTwentyFour Incorporated, see:
> > http://www.seventwentyfour.com/tech.html
>
> Oops.
>
> Actually they sent me an offer of a free trial to their
> service (which seems quite useful).  The free trial gave me
> some useful stats and let me fix a bunch of broken links (of
> course I didn't pay).

You can do much the same thing with wget. From its documentation:
--spider
   When invoked with this option, Wget will behave as a Web
   spider, which means that it will not download the pages, just
   check that they are there.  You can use it to check your
   bookmarks, e.g. with:

        wget --spider --force-html -i bookmarks.html

   This feature needs much more work for Wget to get close to 
   the functionality of real WWW spiders.

You'll be checking more than bookmarks but you get the idea.
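
To check a whole site rather than a bookmarks file, something like the
following might work.  It is only a sketch: it assumes a Wget version
that allows --spider together with --recursive, and the function name
check_site and the default log file name are made up for the example.

```sh
#!/bin/sh
# Sketch: walk a site in spider mode and keep Wget's messages for review.
# check_site and the default log name are hypothetical, for illustration.
check_site() {
    url=$1
    log=${2:-spider.log}
    # --spider       check that pages exist without saving them
    # --recursive    follow the links found in each page
    # --level=5      stop five links deep (Wget's usual default depth)
    # --no-verbose   keep the log to basic information and errors
    # --output-file  write the messages to the log instead of the terminal
    wget --spider --recursive --level=5 --no-verbose \
         --output-file="$log" "$url"
}
```

Call it as "check_site http://www.example.com/ site.log" and then read
through site.log for the requests that failed.  Keep the depth low on a
large site, since the spider still requests every page it follows.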

Jesse


