
Re: Convert HTML document to use relative links ?



2008/9/3 Andre Majorel <aym-naibed@teaser.fr>:
> Is there a program to make all links relative in HTML documents
> saved in wget -x fashion? (http://foo.com/a/b.html saved as
> ./foo.com/a/b.html.)
>
> For example,
>
> - if ./foo.com/a/b.html contains <img src="/images/d.jpg">
>  and                             ./foo.com/images/d.jpg
>  exists, replace that tag by     <img src="../images/d.jpg">
>
> - if ./foo.com/a/b.html contains <a href="http://bar.org/c.html">
>  and                                          ./bar.org/c.html
>  exists, replace that tag by     <a href="../../bar.org/c.html">
>
> I know about wget -k and it doesn't do what I need. My goal is to use
> wget or some such to have an exact mirror of the web site and then
> make a _copy_ of the mirror that can be navigated off-line.

One way to do this without downloading from the real site twice might
be something like this:

1) wget from foo.com to bar.local as an exact mirror
2) set up an Apache virtual host on bar.local that serves the exact
mirror as foo.com
3) add a temporary hosts line/DNS entry, either on bar.local or on your
workstation, aliasing foo.com to bar.local
4) wget -k foo.com would then pull from the local exact copy and
produce a second copy with relative links.
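
If you would rather rewrite a copy of the mirror in place than fetch it
a second time, the path arithmetic from your two examples is simple
enough to script. Below is a rough Python sketch of just that step,
not a finished tool. The function name relative_link and its arguments
are made up for illustration, it assumes the wget -x layout of one
directory per host under a common root, it ignores query strings and
fragments, and it leaves out the HTML parsing needed to find the links
in the first place.

```python
import os
from urllib.parse import urlsplit


def relative_link(mirror_root, page_path, link):
    """Return a relative replacement for `link` as seen from `page_path`.

    Returns None when the link is not absolute or when its target was
    not saved in the mirror, so the caller can leave the tag untouched.
    """
    parts = urlsplit(link)
    # The first path component under the mirror root is the page's host,
    # e.g. "foo.com" for <root>/foo.com/a/b.html.
    page_rel = os.path.relpath(page_path, mirror_root)
    page_host = page_rel.split(os.sep)[0]

    if parts.scheme in ("http", "https") and parts.netloc:
        host = parts.netloc          # http://bar.org/c.html
    elif not parts.scheme and not parts.netloc and parts.path.startswith("/"):
        host = page_host             # /images/d.jpg on the same host
    else:
        return None                  # already relative, mailto:, etc.

    # Map the URL path onto the on-disk layout: <root>/<host>/<path>.
    target = os.path.join(mirror_root, host, *parts.path.strip("/").split("/"))
    if not os.path.isfile(target):
        return None                  # target was not mirrored

    rel = os.path.relpath(target, os.path.dirname(page_path))
    return rel.replace(os.sep, "/")  # URLs always use forward slashes
```

With the files from your examples present under the mirror root,
relative_link(root, root + "/foo.com/a/b.html", "/images/d.jpg") gives
"../images/d.jpg", and the http://bar.org/c.html link gives
"../../bar.org/c.html".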

cheers,
Owen.

