
Re: remove an HTML tag and all its children from commandline



T o n g wrote:
> For not-so-simple tasks, you need not-so-simple tools. Depending on how
> much time you'd like to invest in such not-so-simple tools, take a
> look at libwww, sgrep or the XPath language.
>   
Sure. libwww and sgrep are tools, while XPath is a language. I believe I
should try XPath because I might use it in other places too, but
what tool should I use for XPath? Is there a handy command-line tool for
it? The thing that worries me a bit about XPath is this: if the tool
normalizes or corrects HTML errors, or lays out the output differently,
after I have done the removal, it would be a big problem for me. I am one
link in a corporate workflow chain where others rely on poorly made tools
and incorrect, messy HTML to do their daily work, and I must not break
them by improving the HTML, unless I no longer want to keep my current
peaceful and lazy life and save time for more valuable, sane projects.
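
To make the "must not touch anything else" requirement concrete, here is a
minimal sketch of one possible approach, not a recommendation of a specific
tool. It assumes Python's standard html.parser module is acceptable, uses the
parser only to locate where the element starts and ends, and then cuts that
range out of the original text, so nothing is re-serialized. The names
SpanFinder and remove_elements are hypothetical, and the sketch assumes the
target tag is given in lowercase and has an explicit closing tag.

```python
from html.parser import HTMLParser


class SpanFinder(HTMLParser):
    """Record (start, end) character offsets of each outermost <tag>...</tag>."""

    def __init__(self, text, tag):
        super().__init__(convert_charrefs=False)
        self.text = text
        self.tag = tag          # lowercase: the parser lowercases tag names
        self.depth = 0          # nesting depth of the target tag
        self.start = None
        self.spans = []
        # Start offset of every line, to turn getpos() into an absolute index.
        self.line_starts = [0]
        for i, ch in enumerate(text):
            if ch == "\n":
                self.line_starts.append(i + 1)

    def _offset(self):
        line, col = self.getpos()   # line is 1-based, col is 0-based
        return self.line_starts[line - 1] + col

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self.start = self._offset()
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The span ends just after the '>' that closes the end tag.
                end = self.text.index(">", self._offset()) + 1
                self.spans.append((self.start, end))


def remove_elements(text, tag):
    """Return text with every outermost <tag> element and its children cut out.

    Everything outside the removed ranges is copied verbatim.
    """
    finder = SpanFinder(text, tag)
    finder.feed(text)
    finder.close()
    pieces, pos = [], 0
    for start, end in finder.spans:
        pieces.append(text[pos:start])
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces)
```

When reading from a file, opening it with newline="" keeps the original line
endings, e.g. remove_elements(open(path, newline="").read(), "div"). An
element that is never closed is left in place rather than guessed at.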

I am pretty sure sgrep can solve my problem, though, after glancing at
its manual.

