Re: Bug#203498: ITP: decss -- utility for stripping CSS tags from an HTML page.
On Thursday 31 July 2003 11:27, Sam Hocevar wrote:
> And HTML makes it even harder since very few pages are valid, but
> that DeCSS utility uses only regexes anyway.
Technically, using regexps for CSS will not only become a maintenance hell, but
would also limit the usability of such a script for e.g. network
If at all, the way to go would be to use a decent HTML parser library (khtml and
gecko come to mind; even Python's htmlparser is not mature enough yet), which
gives you not only the (internal and external) stylesheets but every component
of the DOM. You would then use scripting facilities to modify this object and
dump the modified result to e.g. stdout.
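
To make the parse, modify, dump idea concrete, here is a rough sketch. It
assumes Python's standard html.parser as a stand-in for a real DOM library
such as khtml or gecko, which is a simplification: it is an event-based parser,
builds no DOM, and never fetches external stylesheets. The names CSSStripper
and strip_css are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser


class CSSStripper(HTMLParser):
    """Re-emit HTML, dropping <style> blocks, stylesheet <link>s and style= attributes."""

    def __init__(self):
        # convert_charrefs=False: entities in text arrive via
        # handle_entityref/handle_charref and are written back unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_style = 0  # > 0 while inside a <style> element

    @staticmethod
    def _is_stylesheet_link(tag, attrs):
        rel = dict(attrs).get("rel") or ""
        return tag == "link" and "stylesheet" in rel.lower().split()

    @staticmethod
    def _render(tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if name == "style":
                continue  # drop inline CSS
            if value is None:
                parts.append(name)
            else:
                # The parser unescapes attribute values, so escape them again.
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        return "<" + " ".join(parts) + closing + ">"

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style += 1
        elif not self.in_style and not self._is_stylesheet_link(tag, attrs):
            self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag == "style" or self.in_style or self._is_stylesheet_link(tag, attrs):
            return
        self.out.append(self._render(tag, attrs, " /"))

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = max(0, self.in_style - 1)
        elif not self.in_style:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.in_style:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def strip_css(html_text):
    """Return html_text with embedded and linked CSS removed."""
    parser = CSSStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Used as a filter, something like sys.stdout.write(strip_css(sys.stdin.read()))
would do the "dump to stdout" part. Note that the output is re-serialized, so
attribute quoting and tag case may differ from the input.
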
'HTML' and 'lightweight' will hardly fit together.
Play for fun, win for freedom.
Hurd^H^H^H^HLinux-Info-Tag Dresden 2003: http://www.linux-dresden.de