Re: remove an HTML tag and all its children from commandline
- From: Zhang Weiwu <zhangweiwu@xxxxxxxxxx>
- Date: Sun, 31 Jan 2010 20:05:46 +0800
Zhang Weiwu wrote:
> Sure. libwww and sgrep are tools, while xpath is a language. I believe I
> should try xpath because I might use it in other places too, but
> what tool to use for xpath?

Now I think I can answer my own question, partly at least. There is a
good tool for xpath that is named xpath. In Debian it is in this package:
$ apt-file search /usr/bin/xpath
libxml-xpath-perl: /usr/bin/xpath
An example of using the tool, printing every div whose class is "advertisement":
$ tidy -q -asxml -utf8 page_07_zh.html | xpath -e '//div[@class="advertisement"]'
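
The xpath command above only prints the matching element; it does not remove it. For the removal itself, here is a minimal sketch using Python's standard xml.etree.ElementTree module instead of the xpath tool. It is not something I have run on the page above. It assumes the input is well-formed XML (as tidy -asxml should produce) with no named entities that the parser does not know, and it matches the class attribute by exact equality, like the XPath expression does. The function names and the script name strip_tag.py are made up for illustration.

```python
import sys
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", XHTML_NS)


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def remove_by_class(root, tag, class_name):
    """Remove every <tag class="class_name"> element together with its children."""
    for parent in list(root.iter()):
        previous = None  # last sibling that was kept
        for child in list(parent):
            if local_name(child.tag) == tag and child.get("class") == class_name:
                # ElementTree stores the text that follows an element in its
                # .tail; move it so that only the element itself disappears.
                if child.tail:
                    if previous is None:
                        parent.text = (parent.text or "") + child.tail
                    else:
                        previous.tail = (previous.tail or "") + child.tail
                parent.remove(child)
            else:
                previous = child


# Hypothetical command-line use: strip_tag.py TAG CLASS < input > output
if __name__ == "__main__" and len(sys.argv) == 3:
    tree = ET.parse(sys.stdin.buffer)
    remove_by_class(tree.getroot(), sys.argv[1], sys.argv[2])
    tree.write(sys.stdout, encoding="unicode")
```

Saved as strip_tag.py, it could probably be placed at the end of the same pipeline: tidy -q -asxml -utf8 page_07_zh.html | python3 strip_tag.py div advertisement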