
Re: remove an HTML tag and all its children from commandline



On Sun, 31 Jan 2010 20:05:46 +0800, Zhang Weiwu wrote:

> $ tidy -q -asxml -utf8 page_07_zh.html | xpath -e '//div[@class="advertisement"]'

Exactly. Glad that you found both tidy and libxml-xpath-perl, and solved the
problem yourself.
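
As far as I know, the quoted pipeline prints the nodes that match the XPath
expression. If the goal is the page with those nodes gone, one possible
follow-up is a small filter over tidy's XHTML output. The following is only a
sketch, not part of the original solution: the function name
remove_divs_by_class is hypothetical, it uses Python's standard
xml.etree.ElementTree instead of the xpath tool, and it assumes the input is
well-formed XHTML with no named entities such as &nbsp; (which the parser
would reject). The DOCTYPE line is not preserved in the output.

```python
import xml.etree.ElementTree as ET

# Namespace that tidy -asxml puts on the <html> element (assumed here).
XHTML_NS = "http://www.w3.org/1999/xhtml"


def remove_divs_by_class(xhtml_text, class_name):
    """Return xhtml_text with every <div class="class_name"> subtree removed.

    The class attribute must equal class_name exactly, mirroring the
    XPath test @class="advertisement" in the quoted command.
    """
    # Serialize XHTML elements without an "ns0:" prefix.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xhtml_text)
    div_tags = ("div", "{%s}div" % XHTML_NS)

    # ElementTree has no parent pointers, so treat every element as a
    # possible parent and inspect a snapshot of its direct children.
    for parent in list(root.iter()):
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag in div_tags and child.get("class") == class_name:
                # Text that follows the removed div belongs to the parent,
                # not to the div, so move it before dropping the subtree.
                if child.tail:
                    if index == 0:
                        parent.text = (parent.text or "") + child.tail
                    else:
                        previous = children[index - 1]
                        previous.tail = (previous.tail or "") + child.tail
                parent.remove(child)

    return ET.tostring(root, encoding="unicode")
```

It could be fed from the same tidy call, for example by passing
sys.stdin.read() and "advertisement" to the function and printing the result.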

-- 
Tong (remove underscore(s) to reply)
  http://xpt.sourceforge.net/techdocs/
  http://xpt.sourceforge.net/tools/

