bogofilter-0.15.5 - New Current Release

From: David Relson (relson_at_osagesoftware.com)
Date: 09/30/03

    Date: Mon, 29 Sep 2003 20:54:47 CST
    
    

    Bogofilter is a mail filter that classifies mail as spam or ham
    (non-spam) by a statistical analysis of the message's header and content
    (body). The program is able to learn from the user's classifications
    and corrections.

    The statistical technique is known as the Bayesian technique and its use
    for spam was described by Paul Graham in his article "A Plan For
    Spam". Gary Robinson, in his weblog Rants, suggests some refinements
    for improved discrimination between spam and ham. Bogofilter's primary
    algorithm uses the f(w) parameter and the Fisher inverse chi-square
    technique that he describes.

    Bogofilter is run by an MDA script to classify an incoming message as
    spam or ham (using wordlists stored by BerkeleyDB). Bogofilter processes
    plain text and HTML, supports multi-part MIME messages, decodes base64,
    quoted-printable, and uuencoded text, and ignores attachments such as
    images.

    Bogofilter is written in C. Supported platforms: Linux, FreeBSD,
    Solaris, OS X, HP-UX, AIX, RISC-OS, OS/2, ...

    ******* ******* ******* ******* *******

    Bogofilter-0.15.5 is available on SourceForge as the Current Release.
    The newly expanded tagging of header line tokens continues the effort to
    improve the quality of bogofilter's ham and spam classifications. It is
    also very important that you read the release notes (file
    RELEASE.NOTES-0.15) and then either rebuild your wordlist(s) or start
    using bogofilter's "-H" (header-degen) option.

                                   =================
                                    BOGOFILTER NEWS
                                   =================

    0.15.5 2003-09-29

    * Added '-H' (header-degen) option to aid transition to new
      parsing. See RELEASE.NOTES-0.15 for more info.

    * GNU GSL 1.4 has replaced DCDFLIB.
    * VERPs (Variable Envelope Return Paths) now have their sequence
      numbers replaced by a '#' for scoring.
    * Fixed problem that caused auto-update ("-u") to not update
      separate wordlists.
    * Fixed processing of rmail files.
    * Transaction code added for wordlist maintenance.
    * End-of-header code revised to ensure that passthrough ("-p")
      properly places the X-Bogosity line.
    * Fixed logging behavior when scoring mailboxes, maildirs, etc.
    * Timestamp code refactored and moved from maint.c to datastore.c
    * Added support for OS/2's file system.
    * Minor revisions of RISC-OS compatibility code.

    0.15.4 2003-09-20

    * Additional header line tagging as suggested by Michael O'Reilly.
    * No longer discarding message separators.
    * Revise parsing pattern for "encoded text" and regression
      test for folded text.
    * Added BOGOTEST environment variable to enable flex debugging.
    * Report if database file permissions are wrong.
    * No longer including pid in syslog error messages.
    * Fixed bogoutil problem with '-w' and '-p'.
    * Use GSL (the GNU Scientific Library) when it's available.
    * Minor revision of bogotune.
    * Minor revision of bogominitrain.pl

    0.15.3 2003-09-10

    * Fix auto-update ('-u') bug that double registers ham and spam.
    * Revised parsing to discard additional headers, i.e.
      Resent-Message-ID, In-Reply-To, and References.
    * Fixed maintenance mode (broken during database API rewrite).
    * Added regression test for maintenance mode.
    * Re-organized test framework to put all scripts in src/tests,
      all input files in src/tests/inputs, and reference outputs
      in src/tests/outputs.
    * Correct QDBM optimization problems arising from API change.

    0.15.2 2003-09-07

    * Header line unfolding now handled by flex rules.
      Special thanks to Michael O'Reilly for his help!
    * Fatal flex errors are now caught and bogofilter exits
      gracefully after closing its database(s).

    * Initial release of RISC-OS support, including qdbm and tdb.
    * QDBM is now supported.
    * The database configuration has changed. --with-tdb is gone,
      use --with-database=db, --with-database=tdb or
      --with-database=qdbm instead.

    * Updated bogowordfreq to work with bogoreader.

    0.15.1 2003-09-03

    * Check for xmlto during configuration.
    * Fix problem in empty line parsing rule.
    * Fix string termination problem for bulk mode paths.
    * Limit size of unfolded header lines.

    * Allow -I to be used with file or directory.
    * Revise flex rule for encoded text to reduce program size.

    * Revise flex grammar:
      - to reduce size of generated rules
      - to simplify handling of header tags and mime parts

    * Clean-up message header processing:
      - Don't tokenize message separator lines.
      - Merge whitespace separated encoded words.
      - Unfold header lines.

    0.15.0 2003-08-30

    * Implemented a new, more robust, mail reading module that knows how
      to split a mbox into messages and read Maildirs.
    * Implement support for MH directory (such as used by Sylpheed).

    * Change mime boundary line to operate on raw input,
      i.e. before decoding it.
    * Revise mime processing to cure "fatal flex scanner internal
      error--end of buffer missed".
    * Restore parsing rule for ending a "loose" html comment.
    * Add charset map for windows-1251 to KOI8-R (Cyrillic).

    * 64-bit printf fixes for %*s string formatting.

    0.14.5.4 2003-08-30 - Current Stable Release


