
Re: remove an HTML tag and all its children from commandline



Zhang Weiwu wrote:
> Sure. libwww and sgrep are tools, while xpath is a language. I believe I
> should try xpath because I might use it in other places too, but
> what tool to use for xpath?
Now I think I can answer my own question, at least partly. There is a
good command-line tool for XPath, itself named xpath. In Debian it is
provided by this package:
$ apt-file search /usr/bin/xpath
libxml-xpath-perl: /usr/bin/xpath

An example of using the tool, printing the div with class "advertisement":

$ tidy -q -asxml -utf8 page_07_zh.html | xpath -e '//div[@class="advertisement"]'
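
The command above only prints the matching div. To actually remove it
together with its children, which is what the subject asks for, one
possible sketch uses Python's standard xml.etree.ElementTree on the XHTML
that tidy produces. This is not something tested in this thread: the
function name remove_by_class and the sample markup are made up for
illustration, and it assumes the input is well-formed XML without named
entities such as &nbsp;, which ElementTree's parser would likely reject.

```python
import xml.etree.ElementTree as ET

# Namespace that tidy -asxml puts on the <html> element.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize XHTML elements with a plain xmlns="..." instead of an ns0: prefix.
ET.register_namespace("", XHTML_NS)


def remove_by_class(xhtml_text, tag="div", css_class="advertisement"):
    """Hypothetical helper: drop every <tag class=css_class> and its children."""
    root = ET.fromstring(xhtml_text)
    qualified = "{%s}%s" % (XHTML_NS, tag)

    # ElementTree elements have no parent pointer, so visit every element
    # as a potential parent and remove its matching direct children.
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if child.tag in (tag, qualified) and child.get("class") == css_class:
                # Keep the text that follows the removed element.
                if child.tail:
                    if previous is not None:
                        previous.tail = (previous.tail or "") + child.tail
                    else:
                        parent.text = (parent.text or "") + child.tail
                parent.remove(child)
            else:
                previous = child

    return ET.tostring(root, encoding="unicode")


# Made-up sample standing in for the output of tidy -asxml.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="advertisement"><p>Buy now</p></div>'
    '<p>Article text</p>'
    '</body></html>'
)
print(remove_by_class(sample))
```

The class test here is an exact string comparison, like the
@class="advertisement" predicate in the XPath expression, so an element
with class="advertisement wide" would not be removed.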

