
Re: batch browsing



Tony Godshall wrote:
> 
> [ksieben]
> > I would use wget:
> > wget -r -k -H -l X -nc http://google-search-results
> > where X is the depth to which you want to traverse the links
> 
> Yeah, that works for the first go, but how to get the
> subsequent pages (just the ones that are interesting)?

I see my first answer doesn't work. It's fine for getting the links
from a starting page, but google (or rather 7metasearch.com) doesn't
link directly to the interesting pages; it links to a script that
redirects to the page, which means wget won't follow it. (As you
noticed.)
OK, I wouldn't answer if I didn't have another solution, but it is
quite some work.
Best approach: make a directory; I called it bike.
Type your search words into the page, maybe: >bike + office<
Save the result into the freshly created dir as: result.asp
Now take my script (attachment), which automates the following steps:
extract the (interesting) URLs from the file and fetch them with
wget.
The wget parameters are a little different: wget -r -k -l X -nc (the
-H isn't needed).
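For reference, here is a minimal sketch of what the URL-extraction step might look like. This is not the attached gosearch.sh; the grep pattern and the result.asp filename are guesses based on the description above.

```shell
#!/bin/sh
# Sketch of the workflow described in the mail (assumptions, not gosearch.sh):
# pull http URLs out of a saved search-result page, then fetch each with wget.

extract_urls() {
    # Grab every http URL from the saved page; the character class is a
    # guess at where a URL ends in the result-page markup.
    grep -o 'http://[^]"<> ]*' "$1" | sort -u
}

# Usage, with recursion depth X=2:
#   extract_urls result.asp | while read url; do
#       wget -r -k -l 2 -nc "$url"
#   done
```

The sort -u is there so the same page isn't fetched twice when it appears in several result links.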

First I tried to make it more comfortable: if you examine the script,
you'll see that my original plan was to put the search words into the
script itself, request the result page with netcat, parse it, and then
fetch the interesting pages with wget.
But I didn't get that to work.
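That abandoned netcat approach could be sketched roughly like this; the host, path, and query string are placeholders, not taken from the actual script.

```shell
#!/bin/sh
# Hedged sketch of the netcat idea from the mail: build a raw HTTP/1.0 GET
# for the search page and pipe it through nc. Host and query are placeholders.

build_request() {
    # $1 = host, $2 = URL-encoded search words
    printf 'GET /search?q=%s HTTP/1.0\r\nHost: %s\r\n\r\n' "$2" "$1"
}

# Usage: save the raw response, then parse it for URLs:
#   build_request www.google.com bike+office | nc www.google.com 80 > result.asp
```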

A "feature" :)
Under some circumstances the google answer has a different page
layout; then there are no >[< and >]< around the URL, which means the
regular expression can't find the URL and wget gets nothing to do.

Actually it seems that wwwoffle works fine .... 

How do you get the bike back? :)


-- 
ingo dross information/security architecture
[]___¸
######-\
O_-_-_O-\

Attachment: gosearch.sh
Description: Bourne shell script

