Re: Automated ad-hoc url extracting



Drew Scott Daniels wrote:
> I'm looking for a program or some code to help extract URLs from
> arbitrary file types. I imagine I could write such a program using bison,
> but I'd like to use an existing program to reduce the amount of research
> that I would have to do to figure out what is and isn't a valid URL.

urlview extracts URLs from plain text files and then starts an interactive
menu so you can choose which one to view.  It finds them with a single
regexp, which is set in /etc/urlview/system.urlview and looks like this
(sorry about the >80-character line):

(((http|https|ftp|gopher)|mailto):(//)?[^ <>"\t]*|www\.[-a-z0-9.]+)[^ .,;\t<">\):]

This might be a good starting point, even if you don't want to use the whole
of urlview.
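
If you only want the regexp and not the menu, something like the following
Python sketch may be enough.  It is not how urlview itself is implemented:
the function name extract_urls is made up for illustration, the
case-insensitive flag is my assumption about how the pattern is meant to be
applied, and the text is scanned one line at a time because the negated
character classes in the pattern would otherwise run across newlines in
Python.

```python
import re

# The urlview regexp quoted above, copied unchanged into a raw string.
# Assumption: matching is meant to be case-insensitive.
URL_PATTERN = re.compile(
    r'(((http|https|ftp|gopher)|mailto):(//)?[^ <>"\t]*|www\.[-a-z0-9.]+)[^ .,;\t<">\):]',
    re.IGNORECASE,
)


def extract_urls(text):
    """Return every substring of text that the pattern matches, in order.

    Hypothetical helper, not part of urlview.
    """
    urls = []
    # Scan line by line: in Python a negated class such as [^ <>"\t]
    # also matches "\n", so a match could otherwise spill onto the next line.
    for line in text.splitlines():
        for match in URL_PATTERN.finditer(line):
            urls.append(match.group(0))
    return urls


if __name__ == "__main__":
    sample = (
        "See http://example.com/page, or www.debian.org.\n"
        "Mail <mailto:someone@example.org> today"
    )
    for url in extract_urls(sample):
        print(url)
```

The final character class in the regexp is what drops trailing punctuation
such as the comma and full stop in the sample text.  For arbitrary (binary)
file types you would still need to pull out the printable text first, for
example with strings(1), before feeding it to something like this.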

Glyn

-- 
And the face on every coin engraved
The anarchists are all enslaved
My own flag is forever waved
By the grateful people I have saved.


