Bogofilter-0.14.4 - New Current Release

From: David Relson (relson_at_osagesoftware.com)
Date: 08/10/03


Date: 10 Aug 2003 14:00:06 GMT

Bogofilter is a mail filter that classifies mail as spam or ham (non-spam)
by a statistical analysis of the message's header and content (body). The
program is able to learn from the user's classifications and corrections.

The statistical technique is known as the Bayesian technique and its use
for spam was described by Paul Graham in his article "A Plan For
Spam". Gary Robinson, in his weblog Rants, suggests some refinements for
improved discrimination between spam and ham. Bogofilter's primary
algorithm uses the f(w) parameter and the Fisher inverse chi-square
technique that he describes.
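
As a rough illustration of the two ideas named above, the sketch below shows one common formulation of Robinson's f(w) and of the Fisher inverse chi-square combination. It is not taken from bogofilter's source; the function names and the defaults s=1.0 and x=0.5 are assumptions chosen for the example, and every f(w) value is assumed to lie strictly between 0 and 1.

```python
import math

def f_w(p_w, n, s=1.0, x=0.5):
    """Robinson's f(w): pull the raw spam estimate p_w toward a prior x.

    n is how often the token has been seen; s is the weight given to the
    prior. The defaults here are illustrative assumptions only.
    """
    return (s * x + n * p_w) / (s + n)

def chi2q(x2, dof):
    """Chance that a chi-square variable with an even dof exceeds x2."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even number")
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def fisher_score(fw_values):
    """Combine per-token f(w) values into one score in [0, 1].

    Values near 1 suggest spam, values near 0 suggest ham, and values
    near 0.5 mean the evidence is weak or contradictory.
    """
    n = len(fw_values)
    if n == 0:
        return 0.5  # no evidence either way
    spam_stat = -2.0 * sum(math.log(1.0 - f) for f in fw_values)
    ham_stat = -2.0 * sum(math.log(f) for f in fw_values)
    spamminess = 1.0 - chi2q(spam_stat, 2 * n)
    hamminess = 1.0 - chi2q(ham_stat, 2 * n)
    return (1.0 + spamminess - hamminess) / 2.0
```

A token that has never been seen (n = 0) gets exactly the prior x, which is why rare tokens cannot dominate the combined score.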

Bogofilter is run by an MDA script to classify an incoming message as spam
or ham (using wordlists stored by BerkeleyDB). Bogofilter provides
processing for plain text and HTML, supports multipart MIME messages with
decoding of base64, quoted-printable, and uuencoded text, and ignores
attachments, such as images.
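
The decoding step described above can be sketched with Python's standard email package. This is only an illustration of the idea (decode the text parts, skip everything else), not bogofilter's own C implementation; the helper names text_parts and build_sample are hypothetical.

```python
from email import message_from_bytes
from email.message import EmailMessage

def text_parts(raw):
    """Return the decoded text/* parts of a raw message, skipping attachments."""
    msg = message_from_bytes(raw)
    parts = []
    for part in msg.walk():
        # Multipart containers and non-text parts (images etc.) are ignored.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64, quoted-printable and uuencode.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        parts.append(payload.decode(charset, errors="replace"))
    return parts

def build_sample():
    """Build a small multipart message: base64 text plus an image attachment."""
    msg = EmailMessage()
    msg["Subject"] = "sample"
    msg.set_content("Buy cheap pills", cte="base64")
    msg.add_attachment(b"\x89PNG\r\n", maintype="image", subtype="png",
                       filename="x.png")
    return msg.as_bytes()
```

Calling text_parts(build_sample()) yields only the decoded text body; the image part never reaches the tokenizer.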

Bogofilter is written in C. Supported platforms: Linux, FreeBSD, Solaris,
OS X, HP-UX, AIX, ...

******* ******* ******* ******* *******

Bogofilter version 0.14.4 has been released on SourceForge,
http://sourceforge.net/projects/bogofilter.

Major cleanup/revision to datastore/database code layers.
Database update and locking fixes.
Exit code cleanup.
Documentation updates.

See files NEWS-0.14 and RELEASE.NOTES-0.14 for the details.

                               =================
                                BOGOFILTER NEWS
                               =================

0.14.4 2003-08-10

* Revised database API so that there are 3 distinct layers
   (program, datastore, and database) with a clean interface
   between them.
* Corrected exit codes in bogoutil by using EX_ERROR.
* Updated FAQ.
* Fixed token registration bug in 0.14.x versions.
* Fixed seg fault caused by database lock contention.

0.14.3 2003-08-05

* Fixed critical locking bug introduced into bogofilter 0.14.0 with
   the combined-wordlist code: when working with separate wordlists,
   bogofilter would lock only the first one opened, rather than all.
* Documentation updates.
* %g formatting is now supported by bogofilter's formatting functions.
* Merged trio 1.10 (http://ctrio.sourceforge.net/) to support
   compilation on ancient systems (Solaris 2.5) that do not have
   [v]snprintf functions.
   Trio is Copyright (C) 1998-2000 Bjorn Reese and Daniel Stenberg.
* Various documentation updates, including the FAQ.
* The test suite was adjusted for older grep variants (Solaris 2.5)
   that don't cope with long lines.
* Print database version in print_version().
* Postfix integration instructions have been upgraded.
* Debug output for wordlists and databases was enhanced.

0.14.2 2003-08-02

* Replaced use of memcpy() by memmove() in an input routine. The
   overlapping copy might cause data corruption on some systems.
* Fixed "make check" failures for bogoutil introduced with the
   "combined wordlist" feature in 0.14.0. There has been a buffer
   overflow. All users of bogofilter with combined wordlist prior to
   0.14.2 are advised to upgrade.
* Fixed bogus "t.valgrind" test FAILures.
* Fixed uninitialized data in db_get_dbvalue(), for split word lists.
* New file, contrib/vm-bogofilter.el, provides an interface
   between the VM mail reader and bogofilter.
* Revised lexer_v3.l for compatibility with flex-2.5.31
* Break up long line in regression test input for Solaris 2.5
   compatibility.

0.14.1.1 2003-08-01

* Fixed check for adding spam_subject_tag to Subject: line.
* Correct problem with t.degen regression test.
* Updated French version of FAQ.

0.14.1 2003-07-31

* Implemented named exit codes: Unsure now has its own value (2),
   and the value for error has changed from 2 to 3.
* Initial release of token degeneration code.
* Revised lexer pattern to better recognize encoded tokens.
* Updated English version of FAQ.
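
A calling script might map the exit codes described above to labels roughly as follows. The values for Unsure (2) and error (3) come from the entry above; the values for spam (0) and ham (1) are not stated in this announcement and are assumed here from bogofilter's usual convention. The function names are hypothetical, and classify() assumes a bogofilter binary on the PATH that reads the message from standard input.

```python
import subprocess

# 2 and 3 are taken from the 0.14.1 notes; 0 and 1 are an assumption.
EXIT_LABELS = {0: "spam", 1: "ham", 2: "unsure", 3: "error"}

def label_for_exit_code(code):
    """Translate an exit code into a label; unknown codes count as errors."""
    return EXIT_LABELS.get(code, "error")

def classify(message_bytes, program="bogofilter"):
    """Pipe one message to the filter and return its label."""
    result = subprocess.run([program], input=message_bytes)
    return label_for_exit_code(result.returncode)
```

Scripts written for earlier releases that treated exit code 2 as an error would need the same kind of adjustment.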

0.14.0.1 2003-07-23

* Fix problem with encoded text.
* Fix handling of absolute paths.
* Fix defect in base64 decoding that can cause segfaults.
* Bogoutil now complains before exiting when it can't open a
   file.
* Updated bogominitrain.pl to work with combined wordlists.

0.14.0 2003-07-22

* Initial release of code allowing bogofilter to use a single,
   combined BerkeleyDB database for storing both ham and spam tokens.
   The file is named wordlist.db.
* Default wordlist mode is single, combined wordlist.
   File wordlist.db contains all spam and ham tokens.
* Bogofilter and bogoutil detect whether one or two wordlists are in
   BOGOFILTER_DIR and use the appropriate wordlist mode (combined or
   separate).
* Added tdb (trivial database) support.

* Decode encoded text in header lines.

* Updated contrib/bogominitrain.pl, which now prints more info and
   can save messages used in training.

* Bogofilter's -V output now includes algorithm and database info.
* Miscellaneous documentation updates.
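
The wordlist mode detection mentioned in the 0.14.0 notes could work along these lines. This is a guess at the logic, not bogofilter's code: wordlist.db is named above, but the separate file names goodlist.db and spamlist.db, the function name, and the rule of preferring the combined file when both kinds are present are assumptions.

```python
import os

COMBINED = "wordlist.db"
# Assumed names for the separate ham and spam wordlists.
SEPARATE = {"goodlist.db", "spamlist.db"}

def wordlist_mode(directory):
    """Return 'combined', 'separate', or 'none' for a wordlist directory."""
    names = set(os.listdir(directory))
    if COMBINED in names:
        return "combined"
    if SEPARATE <= names:
        return "separate"
    return "none"
```

A caller would typically pass the value of BOGOFILTER_DIR as the directory argument.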

##########################################################################
# Send submissions for comp.os.linux.announce to: cola@stump.algebra.com #
# PLEASE remember a short description of the software and the LOCATION. #
# This group is archived at http://stump.algebra.com/~cola/ #
##########################################################################


