Re: remove an HTML tag and all its children from commandline



Zhang Weiwu wrote:
Sure. libwww and sgrep are tools, while XPath is a language. I believe I
should try XPath because I might use it in other places too, but
what tool should I use for XPath?
Now I think I can answer my own question, at least partly. There is a
good command-line tool for XPath, itself named xpath. In Debian it is
shipped in this package:
$ apt-file search /usr/bin/xpath
libxml-xpath-perl: /usr/bin/xpath

An example of using the tool: tidy first converts the page to well-formed
XHTML, then xpath prints the div whose class is "advertisement":

$ tidy -q -asxml -utf8 page_07_zh.html | xpath -e '//div[@class="advertisement"]'
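
The command above only prints the matching element; it does not remove
it. As a rough sketch of the removal step the subject line asks about,
here is one way it might be done with Python's standard library. This is
not the xpath tool's own behaviour. The function name
remove_divs_by_class is hypothetical, and the input is assumed to be
well-formed XHTML in the XHTML namespace (as tidy -asxml is expected to
produce) with no named entities that the XML parser cannot resolve.

```python
import xml.etree.ElementTree as ET

# Namespace assumed to be on the root element of XHTML produced by tidy -asxml.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def _drop(parent, child):
    """Remove child from parent, keeping the text that followed it."""
    tail = child.tail or ""
    index = list(parent).index(child)
    if index > 0:
        # Attach the trailing text to the previous sibling.
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + tail
    else:
        # No previous sibling: the text now directly follows the parent's start tag.
        parent.text = (parent.text or "") + tail
    parent.remove(child)


def remove_divs_by_class(xhtml_text, class_name):
    """Return xhtml_text with every <div class="class_name"> subtree removed.

    The class attribute must match exactly, like @class="..." in XPath.
    """
    # Serialize XHTML elements without an "ns0:" prefix.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xhtml_text)
    div_tag = "{%s}div" % XHTML_NS
    # ElementTree has no parent pointers, so visit each element as a parent.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == div_tag and child.get("class") == class_name:
                _drop(parent, child)
    return ET.tostring(root, encoding="unicode")
```

The function takes the XHTML text as a string (for example, the output
of the tidy command above) and returns the document with the matching
div elements and all of their children removed.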

