
Re: remove an HTML tag and all its children from commandline

On Sun, 31 Jan 2010 10:54:46 +0800, Zhang Weiwu wrote:

> I want to remove all advertisements in my 100 html files. They are
> pretty neatly classed, like the following:
> <div class="advertisement">
> ...
> </div>
> However I could not simply do this:
> s/<div class="advertisement">.*</div>//
> Because it is too greedy

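For not-so-simple tasks, you need not-so-simple tools. A regex cannot
match a <div> with its own closing </div> once divs are nested, so this
job needs something that actually parses the markup. Depending on how
much time you'd like to invest in such tools, take a look at libwww,
sgrep or the XPath language.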


Tong (remove underscore(s) to reply)
