Re: remove an HTML tag and all its children from commandline
On Sun, 31 Jan 2010 10:54:46 +0800, Zhang Weiwu wrote:
> I want to remove all advertisements in my 100 html files. They are
> pretty neatly classed, like the following:
>
> <div class="advertisement">
> ...
> </div>
>
> However I could not simply do this:
> s/<div class="advertisement">.*</div>//
>
> Because it is too greedy.
For not-so-simple tasks, you need not-so-simple tools. Depending on how
much time you'd like to invest in learning such tools, take a look at
libwww (?), sgrep, or the XPath language.
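
To give an idea of what a parser-based approach looks like, here is a
rough sketch. It does not use any of the tools named above; it assumes
Python 3 and its standard html.parser module instead. The names
AdStripper and strip_ads are made up for this example. The idea is to
count nested <div> tags while inside the advertisement block, so that
the correct closing </div> is found, which a greedy (or non-greedy)
regex cannot do reliably.

```python
from html.parser import HTMLParser


class AdStripper(HTMLParser):
    """Copy HTML through, dropping <div class="advertisement"> subtrees.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        # Keep entities such as &amp; as they are instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.depth = 0  # > 0 while inside an advertisement div

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Inside the ad: only nested divs change the depth.
            if tag == "div":
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "advertisement" in classes:
            self.depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/> never change the div depth.
        if not self.depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth:
            if tag == "div":
                self.depth -= 1
            return
        self.out.append(f"</{tag}>")

    def _emit(self, text):
        if not self.depth:
            self.out.append(text)

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")

    def handle_pi(self, data):
        self._emit(f"<?{data}>")


def strip_ads(html):
    """Return html with every advertisement div and its children removed."""
    parser = AdStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)
```

To use it on your 100 files you would read each file, pass its text to
strip_ads(), and write the result back out. Try it on copies first: the
output is re-serialized from parser events, so end tags come out
lowercased, and badly malformed HTML may not survive intact.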
HTH
--
Tong (remove underscore(s) to reply)
http://xpt.sourceforge.net/techdocs/
http://xpt.sourceforge.net/tools/