Bogofilter-0.14.0.1 - New Current Release
From: David Relson (relson_at_osagesoftware.com)
Date: 07/24/03
Date: 24 Jul 2003 03:15:00 GMT
Bogofilter is a mail filter that classifies mail as spam or ham (non-spam)
by a statistical analysis of the message's header and content (body). The
program is able to learn from the user's classifications and corrections.
The statistical technique is known as the Bayesian technique, and its use
for spam filtering was described by Paul Graham in his article "A Plan For
Spam". Gary Robinson, in his weblog Rants, suggests some refinements for
improved discrimination between spam and ham. Bogofilter's primary
algorithm uses the f(w) parameter and the Fisher inverse chi-square
technique that he describes.
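
As a rough illustration of the scoring described above, here is a minimal
Python sketch of Robinson's f(w) smoothing combined with a Fisher-style
inverse chi-square score. It is not bogofilter's C code. The function names,
the default values s=1.0 and x=0.5, and the final (S - H + 1) / 2 combination
are assumptions chosen for the example, following the commonly published form
of the method.

```python
import math

def f_w(p, n, s=1.0, x=0.5):
    """Robinson's smoothed spam probability for one token.

    p: raw spam probability of the token, n: times the token was seen,
    s: strength given to the prior, x: assumed probability for unseen tokens.
    The defaults for s and x are illustrative assumptions.
    """
    return (s * x + n * p) / (s + n)

def chi2q(x2, dof):
    """Upper tail of the chi-square distribution for an even number of
    degrees of freedom (closed-form series)."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def fisher_score(probs):
    """Combine per-token probabilities into one score in [0, 1].
    Values near 1 suggest spam, values near 0 suggest ham."""
    # Clamp so that log() never sees 0.
    probs = [min(max(p, 1e-6), 1.0 - 1e-6) for p in probs]
    n = len(probs)
    spam_evidence = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    ham_evidence = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spam_evidence - ham_evidence + 1.0) / 2.0
```

A token never seen before (n = 0) gets the prior x, and a message whose
tokens all score 0.5 ends up at exactly 0.5, that is, undecided.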
Bogofilter is run by an MDA script to classify an incoming message as spam
or ham (using wordlists stored by BerkeleyDB). Bogofilter provides
processing for plain text and HTML, supports multi-part MIME messages with
decoding of base64, quoted-printable, and uuencoded text, and ignores
attachments such as images.
Bogofilter is written in C. Supported platforms: Linux, FreeBSD, Solaris,
OS X, HP-UX, AIX, ...
******* ******* ******* ******* *******
Bogofilter version 0.14.0.1 has been released on SourceForge,
http://sourceforge.net/projects/bogofilter.
0.14.0.1 - Fixes several minor bugs in 0.14.0.
0.14.0 - Provides support for a single, combined wordlist (named
wordlist.db) which holds both spam and ham tokens and which provides
improved performance (speed). This release also adds support for the Trivial
DB package (tdb) as an alternative to BerkeleyDB.
Parsing improvements include decoding of encoded tokens in message headers
(cf. RFC-2047).
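
To show what decoding an RFC 2047 encoded header token involves, here is a
small Python sketch using the standard library's email.header module. It only
illustrates the idea and is not bogofilter's C parser. The helper name
decode_rfc2047 is hypothetical.

```python
from email.header import decode_header

def decode_rfc2047(value):
    """Return a header value with RFC 2047 encoded words decoded to text.

    Hypothetical helper for illustration only.
    """
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Encoded words come back as bytes plus their declared charset.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # Headers without encoded words come back as plain str.
            parts.append(chunk)
    return "".join(parts)
```

Once decoded, a word hidden as =?utf-8?b?dmlhZ3Jh?= is seen by the tokenizer
as ordinary text rather than as an opaque base64 string.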
See files NEWS-0.14 and RELEASE.NOTES-0.14 for the details.
=================
BOGOFILTER NEWS
=================
0.14.0.1 2003-07-23
* Fix problem with encoded text.
* Fix handling of absolute paths.
* Fix defect in base64 decoding that can cause segfaults.
* Bogoutil now complains before exiting when it can't open a
file.
* Updated bogominitrain.pl to work with combined wordlists.
0.14.0 2003-07-22
* Initial release of code allowing bogofilter to use a single,
combined BerkeleyDB database for storing both ham and spam tokens.
The file is named wordlist.db.
* Default wordlist mode is single, combined wordlist.
File wordlist.db contains all spam and ham tokens.
* Bogofilter and bogoutil detect whether one or two wordlists are in
BOGOFILTER_DIR and use the appropriate wordlist mode (combined or
separate).
* Added tdb (trivial database) support.
* Decode encoded text in header lines.
* Updated contrib/bogominitrain.pl, which now prints more info and can save
messages used in training.
* Bogofilter's -V output now includes algorithm and database info.
* Miscellaneous documentation updates.
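
The wordlist mode detection mentioned in the 0.14.0 notes can be pictured
roughly as follows. This Python sketch is not bogofilter's actual logic. The
separate-mode file names spamlist.db and goodlist.db, the precedence given to
the combined file, and the function name are all assumptions made for the
example; only wordlist.db and the combined default come from the notes above.

```python
import os

def detect_wordlist_mode(directory,
                         combined="wordlist.db",
                         separate=("spamlist.db", "goodlist.db")):
    """Guess the wordlist mode from the files present in a directory.

    The 'separate' file names and the precedence order are assumptions.
    """
    names = set(os.listdir(directory))
    if combined in names:
        return "combined"
    if all(name in names for name in separate):
        return "separate"
    # Nothing found yet: fall back to the default mode named in the NEWS.
    return "combined"
```

A caller would pass the directory that BOGOFILTER_DIR points to.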
##########################################################################
# Send submissions for comp.os.linux.announce to: cola@stump.algebra.com #
# PLEASE remember a short description of the software and the LOCATION. #
# This group is archived at http://stump.algebra.com/~cola/ #
##########################################################################