Re: scripting lynx
On Wed, Aug 01, 2001 at 06:08:29PM +0200, Russell Coker wrote:
> I tried both of them with no difference.
interesting. either should have worked.
> > or use the LWP modules to make yourself a web-bot.
> I may have to do that. Thanks for the suggestions.
you may need to set the Referer: header in the HTTP request. some cgi
scripts check the referer (yes, that's pointless and stupid, but they do).
also set the user agent to something like:
$ua->agent('Mozilla/4.51 (Macintosh; I; PPC)');
i generally use netscape on mac as my user-agent in web robots because:
a) moronic sites generally don't block netscape on macintosh
(i have seen some sites that block netscape on linux with a stupid
message like "sorry, we don't support your browser/operating-system".
unfortunately, brain-dead web design is not yet a capital crime)
b) said moronic sites generally won't output moronic IE-specific junk
if they detect netscape. sometimes. if you're lucky.
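
putting the Referer and user-agent suggestions together, a rough sketch
might look like the following. the two URLs are made-up placeholders (i
don't know what site you're scripting), so substitute the real form
target and the page that links to it. the actual fetch is commented out
so the script runs without touching the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# hypothetical URLs: replace with the real cgi and the page that links to it
my $url     = 'http://www.example.com/cgi-bin/search.cgi?q=test';
my $referer = 'http://www.example.com/search.html';

# pretend to be netscape on a mac
my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/4.51 (Macintosh; I; PPC)');

# build the request by hand so the Referer: header can be set
my $req = HTTP::Request->new(GET => $url);
$req->header(Referer => $referer);

# show the request line and the headers set so far
print $req->as_string;

# uncomment to actually send it:
# my $res = $ua->request($req);
# print $res->is_success ? $res->content : $res->status_line, "\n";
```

note that the User-Agent: header is added by $ua when the request is
sent, so it won't show up in the as_string output above.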
btw, the perl HTML::TokeParser module is excellent for extracting stuff
from web pages. i used this (plus LWP::UserAgent, HTTP::Cookies, and
HTTP::Request) to write a wrapper script for searching the Melbourne
Trading Post site, which is one of the most brain-dead cretinous sites
i've ever had the misfortune of having to use.
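
as a small illustration of HTML::TokeParser (this is not my wrapper
script, just a sketch run against an invented snippet of html), pulling
the links and their text out of a page goes roughly like this:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# invented sample page; in a real robot this would be $res->content
my $html = <<'EOT';
<html><body>
<a href="/ad/1">Mac SE/30, $50</a>
<a href="/ad/2">Sun 3/60, free</a>
</body></html>
EOT

# a scalar ref tells the parser to read from the string itself
my $p = HTML::TokeParser->new(\$html) or die "can't parse html";

my @links;
# get_tag('a') skips ahead to the next <a> start tag;
# element [1] of what it returns is a hashref of the tag's attributes
while (my $tag = $p->get_tag('a')) {
    my $href = $tag->[1]{href} or next;
    # collect the text up to the closing </a>, whitespace trimmed
    my $text = $p->get_trimmed_text('/a');
    push @links, [$href, $text];
}

print "$_->[0]\t$_->[1]\n" for @links;
```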
there's also HTML::TableExtract for getting data out of html tables.
these modules are all packaged for debian.
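
a minimal HTML::TableExtract sketch, again on an invented table (the
column names "Item" and "Price" are assumptions for the example). you
name the column headers you care about and it hands back the matching
rows:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# invented sample table
my $html = <<'EOT';
<table>
<tr><th>Item</th><th>Condition</th><th>Price</th></tr>
<tr><td>modem</td><td>used</td><td>20</td></tr>
<tr><td>monitor</td><td>new</td><td>150</td></tr>
</table>
EOT

# only tables with these headers are extracted, and only these columns
my $te = HTML::TableExtract->new(headers => [qw(Item Price)]);
$te->parse($html);

my @rows;
foreach my $ts ($te->tables) {
    foreach my $row ($ts->rows) {
        push @rows, [@$row];    # one arrayref per data row
    }
}

print join("\t", @$_), "\n" for @rows;
```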
craig sanders <firstname.lastname@example.org>
Fabricati Diem, PVNC.
-- motto of the Ankh-Morpork City Watch