Re: wget asp pages recursively?
Matthew Weier O'Phinney helpfully answered my question about how one could
download the pages in an asp application with:
> wget -m -L -t 5 -w 5 http://someplace.com/some.asp&page=1
This does indeed have the desired effect. Unfortunately, it yields an
embarrassment of riches - I seem to be getting all the pages on the site, which
is quite large. While I could just download the whole site, I'm only on a 56k
modem. I actually may not have enough disk space on my box to store all the
files I might download.
Is there some way I can limit the URLs wget will follow? Suppose all of my
pages have a URL like:
http://someplace.com/some.asp&ThisIsFixed=1&page=1
I would like to download only the URLs that have "ThisIsFixed=1" in them.
What would be really cool is if one could get wget to test a URL with a regular
expression before downloading it, but I don't see a way to do that. (I have
been studying the man page.)
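To make the idea concrete, here is a rough Python sketch of the kind of test I
have in mind. It is not anything wget does, as far as I can tell; the names
URL_FILTER and should_download and the exact pattern are just made up for
illustration, and other.asp is an invented example of a page I'd want skipped.

```python
import re

# Hypothetical filter: accept a URL only if it carries the parameter
# ThisIsFixed=1 (and not, say, ThisIsFixed=10).
URL_FILTER = re.compile(r"[?&]ThisIsFixed=1(&|$)")


def should_download(url):
    """Return True if the URL passes the filter and should be fetched."""
    return URL_FILTER.search(url) is not None


# Example candidate links, as a crawler might find them on a page.
candidates = [
    "http://someplace.com/some.asp&ThisIsFixed=1&page=1",
    "http://someplace.com/some.asp&ThisIsFixed=1&page=2",
    "http://someplace.com/other.asp&page=1",
]

# Only the links that pass the test would be queued for download.
wanted = [url for url in candidates if should_download(url)]
```

The point is only that the check would run on each link before it is fetched,
so pages that fail the test never cost any modem time or disk space.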
I could hack on wget's source code if necessary. If I have to do that, maybe
somebody could give me a tip on where to look in the source. Maybe I could
contribute a useful patch.
Thanks,
Mike
--
Michael D. Crawford
GoingWare Inc. - Expert Software Development and Consulting
http://www.goingware.com/
crawford@goingware.com
Tilting at Windmills for a Better Tomorrow.