Automated ad-hoc URL extraction
I'm looking for a program or some code to help extract URLs from
arbitrary file types. I imagine I could write such a program using Bison,
but I'd rather use an existing program, to reduce the amount of research
I would have to do to figure out what is and isn't a valid URL.
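As a rough illustration of the extraction step, here is a minimal Python
sketch that scans the raw bytes of any file for URL-like strings. The
pattern is a deliberately loose approximation, not a full RFC 3986 grammar,
and the function name extract_urls and the choice of schemes (http, https,
ftp) are assumptions made for the example.

```python
import re

# Loose pattern: a known scheme followed by characters that may appear in a URL.
# This is an approximation and will accept some strings that are not valid URLs.
URL_RE = re.compile(rb"(?:https?|ftp)://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+")


def extract_urls(data: bytes) -> list[str]:
    """Return URL-like strings found in raw bytes, in order, without duplicates."""
    seen = set()
    urls = []
    for match in URL_RE.finditer(data):
        # Trailing punctuation is usually sentence punctuation, not part of the URL.
        url = match.group().rstrip(b".,;:!?)'").decode("ascii")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Because it works on bytes, the same function can be pointed at text files,
binaries, or anything else read with open(path, "rb").read(). It only finds
URLs that carry an explicit scheme, so relative links are missed.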
I'm also looking for something to convert relative URLs to
absolute URLs.
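For the relative-to-absolute conversion, one possible approach is the
urljoin function from Python's standard urllib.parse module. The sketch
below assumes the URL of the document the link came from is known; the
wrapper name to_absolute and the example.com addresses are made up for
illustration.

```python
from urllib.parse import urljoin


def to_absolute(base_url: str, link: str) -> str:
    """Resolve a possibly relative link against the URL of the containing document."""
    # urljoin leaves links that are already absolute unchanged.
    return urljoin(base_url, link)


base = "http://example.com/docs/index.html"
print(to_absolute(base, "page2.html"))
print(to_absolute(base, "../img/logo.png"))
print(to_absolute(base, "https://other.example/x"))
```

Without a base URL there is nothing to resolve against, so a link found in
a file of unknown origin can only be reported as-is.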
It would also be useful to have type-specific parsers for files whose
type I can identify. E.g., if I see an HTML file, I'd run it through
an HTML parser, and likewise for XML, MS Office documents, and so on.
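For the HTML case, a type-specific extractor could be sketched with
Python's standard html.parser module, as below. The class name
LinkCollector, the helper extract_html_links, and the set of attributes
treated as URL-bearing (href, src, action) are assumptions for the
example; real pages carry URLs in other places too, and XML or MS Office
files would need their own parsers.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect URL-bearing attribute values from HTML start tags."""

    # Assumed set of attributes that hold URLs; extend as needed.
    URL_ATTRS = {"href", "src", "action"}

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URL_ATTRS and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_html_links(html_text: str, base_url: str) -> list[str]:
    """Return absolute URLs found in href/src/action attributes, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Compared with scanning raw bytes, a parser of this kind only reports
strings that the markup actually uses as links, and it picks up relative
links that carry no scheme.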
Drew Daniels