
"pre-treating" documents from certain remote URLs before a web browser renders them



I have a problem: The more frequently I browse a web site, the more I
notice all the things I hate about its web pages.

And I seem to have a partial solution to this problem: I can make XSLT
stylesheets[1] that will transform a web page A, as received from a
remote site, into an XHTML document B that better suits my purposes.[2]
I find it entertaining to make these, so I would like to figure out
how to incorporate them into a solution to the problem above.

But at the moment I do not know a good solution to the rest of the
problem, namely how to work the application of stylesheets (and the
preprocessing they require) into the act of browsing itself.

I would like to launch a web browser[3], browse pages at domain X, and
know that when I go to http://X/page, or https://X/page, etc., the
browser will render not the page as served by the remote site, but
that page as transformed by the stylesheet tailored for pages from X.
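
To make the per-domain rule concrete, here is a minimal sketch in
Python of the kind of lookup this implies. The domain names and
stylesheet paths are placeholders, and stylesheet_for is a
hypothetical helper, not part of any existing tool. Matching on parent
domains is an assumption on my part; exact-host matching would work
just as well.

```python
from urllib.parse import urlsplit

# Hypothetical table: domain -> stylesheet path (placeholder names).
STYLESHEETS = {
    "example.org": "stylesheets/example-org.xsl",
    "news.example.com": "stylesheets/news.xsl",
}


def stylesheet_for(url, table=STYLESHEETS):
    """Return the stylesheet path for the URL's host, or None."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Try the full host first, then each parent domain in turn
    # (www.example.org, then example.org, then org).
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in table:
            return table[candidate]
    return None
```

Pages from hosts with no entry in the table would be passed through
untouched.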

Do I want a local proxy server that I can instruct to apply
appropriate transformations to documents received from certain
domains? This seems sensible, but I haven't examined too deeply what
is available along these lines, and I would rather not spend time
digging into a dead end.
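
As a rough sketch of what such a proxy might look like, the following
uses only the Python standard library. It is an illustration under
stated assumptions, not a recommendation: it handles plain-HTTP GET
requests only, does not forward the browser's request headers or
cookies, and always answers with status 200. The names rewrite,
TransformingProxy and serve are hypothetical. Note that https:// pages
reach a proxy as opaque CONNECT tunnels, so a proxy can only rewrite
them if it intercepts TLS with a locally trusted certificate, which
this sketch does not attempt.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlsplit

# Fetch upstream directly, ignoring any *_proxy environment variables,
# so the proxy never ends up calling itself.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def rewrite(url, content_type, body, rules):
    """Apply the rule for the URL's host to an HTML body.

    rules maps a host name to a callable taking and returning bytes.
    Non-HTML responses and hosts without a rule pass through unchanged.
    """
    host = urlsplit(url).hostname or ""
    rule = rules.get(host)
    if rule is None or not content_type.lower().startswith("text/html"):
        return body
    return rule(body)


class TransformingProxy(BaseHTTPRequestHandler):
    rules = {}  # filled in by serve()

    def do_GET(self):
        # A browser talking to an HTTP proxy sends the absolute URL
        # as the request target, so self.path is the full URL.
        if urlsplit(self.path).scheme != "http":
            self.send_error(400, "absolute http:// URL expected")
            return
        try:
            with OPENER.open(self.path, timeout=30) as upstream:
                content_type = upstream.headers.get("Content-Type", "")
                body = upstream.read()
        except Exception:
            self.send_error(502, "upstream fetch failed")
            return
        body = rewrite(self.path, content_type, body, self.rules)
        self.send_response(200)
        self.send_header("Content-Type",
                         content_type or "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(rules, port=8080):
    """Run the proxy on localhost until interrupted."""
    TransformingProxy.rules = rules
    server = ThreadingHTTPServer(("127.0.0.1", port), TransformingProxy)
    server.serve_forever()
```

Calling serve({"example.org": some_transform}) and pointing a browser
that honors the http_proxy environment variable at
http://127.0.0.1:8080/ would then route plain-HTTP pages through
rewrite.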

I would appreciate any suggestions, experiences (bad or good) along
these lines, etc.

NOTES

1. "Stylesheet" *really* feels like the wrong term here, but I'm not
aware of a better conventional one. I'm not, as a rule, imposing style
so much as pruning document structure. I mostly just censor elements
that I consider obstacles, and rearrange the remainder so that I can
more reliably/simply locate whatever I'm looking for.

2. In order to transform page A into valid XML before applying such
transformations, I first run A through a woodchipper called tidy:
http://www.html-tidy.org/

3. And ideally an arbitrary web browser. If for some reason I have to
pick just one, lynx is a strong preference, all else being equal. But, of
course, I'm interested in hearing about alternative web clients if
they provide some advantage here.
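
The preprocessing described in note 2, followed by the transformation
itself, could be sketched in Python as below. This assumes the tidy
and xsltproc command-line tools are installed and on the PATH; the
choice of xsltproc as the XSLT processor is an assumption for
illustration, and the function names are hypothetical.

```python
import subprocess


def tidy_command():
    """Command line that makes tidy emit XHTML on standard output."""
    # -asxhtml: convert to well-formed XHTML
    # -numeric: write numeric rather than named character entities
    # -quiet, --show-warnings no: keep diagnostics to a minimum
    return ["tidy", "-quiet", "-asxhtml", "-numeric",
            "--show-warnings", "no"]


def xslt_command(stylesheet):
    """Command line that applies a stylesheet to standard input."""
    # A file name of "-" tells xsltproc to read from standard input.
    return ["xsltproc", stylesheet, "-"]


def transform(html, stylesheet):
    """Run page A (bytes) through tidy, then through the stylesheet."""
    tidied = subprocess.run(tidy_command(), input=html,
                            stdout=subprocess.PIPE)
    # tidy exits with 1 when it only had warnings and 2 on errors.
    if tidied.returncode >= 2:
        raise RuntimeError("tidy could not repair the document")
    result = subprocess.run(xslt_command(stylesheet), input=tidied.stdout,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout
```

A call such as transform(page_bytes, "site.xsl") would then return
document B as bytes, where site.xsl stands in for whichever stylesheet
applies.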

