Vilistextum 2.6.5 - fault tolerant HTML to text converter
From: Patric Mueller (bhaak_at_bigfoot.com)
Date: 06/01/04
Date: 1 Jun 2004 10:50:00 GMT
Announcing the version 2.6.5 release of Vilistextum.
Vilistextum is a small and fast HTML to text converter.
It is quite fault-tolerant and deals well with badly-formed HTML.
It has full support for different character sets (e.g. Unicode).
Some features:
==============
* understands HTML 3.2 up to 4.01 and XHTML 1.0
* supports various multibyte encodings (Unicode, Shift_JIS, ...)
* output can be optimized for ebook reading
* converts characters and entities between 128 and 159 from the
Windows-1252 charset to meaningful strings in ISO-8859-1
* GUI frontend using Kaptain
* creates footnotes for links
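
The Windows-1252 conversion listed above can be illustrated with a rough
Python sketch. It is not Vilistextum's code (Vilistextum itself is not
written in Python), and the replacement table and the function name
cp1252_to_latin1_text are assumptions made up for this example. The idea is
that bytes 128 to 159 are control codes in ISO-8859-1 but printable
characters in Windows-1252, so each one is swapped for a string that
survives in ISO-8859-1.

```python
# Hypothetical replacement table for illustration only;
# this is NOT the table Vilistextum actually uses.
CP1252_FALLBACKS = {
    0x80: "EUR",   # euro sign
    0x85: "...",   # horizontal ellipsis
    0x91: "'",     # left single quotation mark
    0x92: "'",     # right single quotation mark
    0x93: '"',     # left double quotation mark
    0x94: '"',     # right double quotation mark
    0x96: "-",     # en dash
    0x97: "--",    # em dash
    0x99: "(TM)",  # trade mark sign
}


def cp1252_to_latin1_text(data: bytes) -> str:
    """Replace Windows-1252 bytes 128..159 with ISO-8859-1-safe strings."""
    out = []
    for byte in data:
        if 0x80 <= byte <= 0x9F:
            # Unknown bytes in the range fall back to a question mark.
            out.append(CP1252_FALLBACKS.get(byte, "?"))
        else:
            # Every other byte means the same thing in both charsets.
            out.append(bytes([byte]).decode("latin-1"))
    return "".join(out)


print(cp1252_to_latin1_text(b"\x93quoted\x94 text costs 5 \x80"))
```

Every string in the table uses only characters that exist in ISO-8859-1, so
the result can be written out in that charset without loss.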
Changes:
========
* vilistextum recognizes the character set, even if it is declared with
<META http-equiv="charset" content="utf-8">
* BUGFIX: --no-title in combination with --shrink-lines didn't work
* BUGFIX: ignore HTML tags inside script elements
* BUGFIX: sometimes the last word in the document was not output
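
To make the first change above concrete, here is a small Python sketch of
charset detection that accepts both the common Content-Type declaration and
the unusual http-equiv="charset" form quoted in the changelog. It uses
Python's standard html.parser module and does not reflect how Vilistextum
implements this. The names CharsetSniffer and sniff_charset are
hypothetical.

```python
from html.parser import HTMLParser


class CharsetSniffer(HTMLParser):
    """Record the first character set declared in a META tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.charset is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        http_equiv = attr.get("http-equiv", "").lower()
        content = attr.get("content", "")
        if http_equiv == "charset":
            # Unusual form: <META http-equiv="charset" content="utf-8">
            self.charset = content.strip().lower()
        elif http_equiv == "content-type" and "charset=" in content.lower():
            # Common form: content="text/html; charset=utf-8"
            start = content.lower().index("charset=") + len("charset=")
            self.charset = content[start:].split(";")[0].strip().lower()


def sniff_charset(html: str):
    """Return the declared charset in lowercase, or None if none is found."""
    sniffer = CharsetSniffer()
    sniffer.feed(html)
    return sniffer.charset


print(sniff_charset('<META http-equiv="charset" content="utf-8">'))
```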
Download:
=========
http://homepage.sunrise.ch/mysunrise/bhaak/vilistextum/vilistextum-2.6.5.tar.gz
http://homepage.sunrise.ch/mysunrise/bhaak/vilistextum/vilistextum-2.6.5.tar.bz2
Homepage:
=========
http://homepage.sunrise.ch/mysunrise/bhaak/vilistextum/
##########################################################################
# Send submissions for comp.os.linux.announce to: cola@stump.algebra.com #
# PLEASE remember a short description of the software and the LOCATION. #
# This group is archived at http://stump.algebra.com/~cola/ #
##########################################################################