Re: lynx and google.com
I think my previous reply barfed and didn't go, apologies if this
doubles.
On Thu, May 06, 2004 at 02:14:56PM -0400, David P James wrote:
> Maybe they caught on to your little trick and put an end to it? :) Or at
> least tried to, since you seem to be spoofing the UA.
I'm sure they have anti-screen-scraping technologies, but Occam's Razor
suggests they aren't responding to me in particular. Maybe they're
upgrading Apache? Does lynx use HTTP/1.0, while links, wget and moz
use HTTP/1.1?
> > -e 's/href=\//href=http:\/\/google.com\//g' \
> ^
> Does this line actually work? To me it looks like you're missing an
> escape before the second '/' before the second 'href'.
Yes, it actually works. The unescaped / characters are the delimiters
for the s///; the escaped \/ ones are literal slashes in the pattern and
the replacement. It's trashy write-only code, but it works.
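For anyone else squinting at it, here is a small sketch of the same substitution written two ways: once as posted, and once with | as the s delimiter so the slashes need no escaping. The sample input line is made up for illustration, and the variable names are just placeholders; I'm assuming a sed that accepts an alternate delimiter after s, which as far as I know both POSIX and GNU sed do.

```shell
# Made-up sample input: a root-relative link like the ones being rewritten.
input='<a href=/search?q=lynx>'

# As posted: / is the delimiter, so each literal slash is written as \/.
with_slashes=$(printf '%s\n' "$input" | sed -e 's/href=\//href=http:\/\/google.com\//g')

# Same substitution with | as the delimiter: no escaping of / needed.
with_pipes=$(printf '%s\n' "$input" | sed -e 's|href=/|href=http://google.com/|g')

# Print both results; they should be identical.
printf '%s\n' "$with_slashes" "$with_pipes"
```

Both forms should turn the root-relative href into an absolute one; the second is just easier on the eyes.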
I wish the Moz AdBlock extension would get aggressive about putting
HTML-morphing technology in there to properly extract real content from
pages filled with text ads. Mine is a grotesque hack. Shh, don't tell
Google; they don't want to be Evil!