bogofilter 0.15.10 available

From: David Relson (relson_at_osagesoftware.com)
Date: 12/09/03

  • Next message: Frederick Noronha (FN): "OpenSource TelephonyConference (fwd)"
    Date: Mon, 8 Dec 2003 20:30:14 CST
    
    

    Bogofilter is a mail filter that classifies mail as spam or ham
    (non-spam) by a statistical analysis of the message's header and content
    (body). The program is able to learn from the user's classifications
    and corrections.

    The statistical technique is known as the Bayesian technique and its use
    for spam was described by Paul Graham in his article "A Plan For
    Spam". Gary Robinson, in his weblog Rants, suggests some refinements
    for improved discrimination between spam and ham. Bogofilter's primary
    algorithm uses the f(w) parameter and the Fisher inverse chi-square
    technique that he describes.
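
    As a rough illustration of the scoring described above, here is a
    minimal Python sketch of Robinson's f(w) smoothing and the Fisher
    inverse chi-square combination. It is not bogofilter's C code. The
    function names, the defaults s=1.0 and x=0.5, and the count-based
    definition of p(w) are assumptions made for this example.

```python
import math

def f_w(spam_count, ham_count, n_spam_msgs, n_ham_msgs, s=1.0, x=0.5):
    """Robinson's smoothed spam probability for one token.

    s (weight of the prior) and x (probability assumed for an unseen
    token) are illustrative defaults, not bogofilter's settings.
    """
    n = spam_count + ham_count
    if n == 0:
        return x  # never seen: fall back to the prior
    spam_freq = spam_count / n_spam_msgs
    ham_freq = ham_count / n_ham_msgs
    p = spam_freq / (spam_freq + ham_freq)  # raw estimate p(w)
    return (s * x + n * p) / (s + n)

def chi2q(x2, dof):
    """Probability that a chi-square variable with an even number of
    degrees of freedom exceeds x2 (the "inverse chi-square")."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def fisher_score(probs):
    """Combine per-token f(w) values (each strictly between 0 and 1)
    into one score: near 1.0 for spam, near 0.0 for ham."""
    n = len(probs)
    if n == 0:
        return 0.5
    ln_spam = sum(math.log(p) for p in probs)
    ln_ham = sum(math.log(1.0 - p) for p in probs)
    spammy = chi2q(-2.0 * ln_spam, 2 * n)  # near 1 when tokens look spammy
    hammy = chi2q(-2.0 * ln_ham, 2 * n)    # near 1 when tokens look hammy
    return (1.0 + spammy - hammy) / 2.0
```

    A message whose tokens are all neutral (f(w) = 0.5) scores 0.5, which
    is the kind of result a filter might report as "unsure".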

    Bogofilter is run by an MDA script to classify an incoming message as
    spam or ham (using wordlists stored by BerkeleyDB). Bogofilter
    processes plain text and HTML, supports multi-part MIME messages,
    decodes base64, quoted-printable, and uuencoded text, and ignores
    attachments such as images.
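
    The MIME handling described above can be pictured with a short Python
    sketch built on the standard email package. It only illustrates the
    idea (decode the text parts, skip everything else); bogofilter does
    this in its own C lexer, and the function name text_parts is
    hypothetical.

```python
import email

def text_parts(raw_message):
    """Return the decoded text of every text/* part in a message.

    Non-text parts (images, applications, ...) are skipped, in the
    spirit of ignoring attachments. Illustrative sketch only.
    """
    msg = email.message_from_string(raw_message)
    texts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container part, its children are visited by walk()
        if part.get_content_maintype() != "text":
            continue  # skip images and other attachments
        # decode=True undoes base64 / quoted-printable / uuencode
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "ascii"
        texts.append(payload.decode(charset, errors="replace"))
    return texts
```

    Only the decoded text would then be handed to a tokenizer.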

    Bogofilter is written in C. Supported platforms: Linux, FreeBSD,
    Solaris, OS X, HP-UX, AIX, RISC-OS, OS/2, ...

    ******* ******* ******* ******* *******

    Bogofilter 0.15.10 is available on SourceForge. The download URL is
    http://sourceforge.net/project/showfiles.php?group_id=62265

    This release has bogotune bug fixes and enhancements, configure fixes
    for statically linked executables, portability fixes for DGUX and HPUX,
    and the usual variety of other minor changes.

    If upgrading to 0.15.10 from a version older than 0.15.4, you should
    rebuild your token database to take advantage of header line tagging,
    which was expanded in 0.15.4.

                                   =================
                                    BOGOFILTER NEWS
                                   =================

    Be sure to read RELEASE.NOTES-0.15, with attention to the section
    about retraining and the '-H' flag...

    0.15.10 2003-12-08

    * Multiple fixes, revisions, and changes to bogotune.
    * Added -M option to bogotune for creating message count files.
    * Fixed bug in header degeneration.
    * Added degeneration options to config file.
    * Added subject line tagging for Unsures.
    * Formatting and portability fixes for DGUX.
    * Fixed "configure --enable-static" for building statically linked
      executables.
    * The test suite now uses static executables when "configure
      --enable-static" has been used.
    * The test suite no longer depends on procmail for t.MH and t.maildir
      or formail for t.bulkmode.
    * Moved robx calculation code to new file for sharing by bogoutil and
      bogotune.
    * Fix segfault when using '-H' (header_degen) option.

    0.15.9 2003-11-23

    * Configure now finds a POSIX compliant shell for running version.sh
    * Removed --disable-* options for algorithms. They were never well
      supported and serve no useful purpose; the algorithm code is
      insignificant compared to the lexer and other code.
    * Fixed a memory leak in bogoutil.
    * Cleaned up help message in bogoutil.
    * Bogotune now checks for incorrectly classified messages in the test
      data and exits if it finds any.
    * Reduced bogotune's memory requirements.
    * Fixed timestamp config option.
    * Exclude apostrophes and backticks at the end of a token.
    * Lexer changes reduce size of bogofilter executable by approx 90%.
    * Lexer.c no longer discards X-Bogosity lines in rfc822 attachments.
    * Removed repetition counts in lexer for TOKEN and MIME_BOUNDARY
      patterns to reduce executable size.
    * "<!DOCTYPE HTML PUBLIC...>" is now recognized as starting html text.
    * Several minor lexer bugs fixed.
    * Updated bogominitrain.pl to v1.4.2
    * TDB passes all checks again.
    * QDBM support fixed.
    * Minor documentation fixes.
    * Minor error message cleanups.
    * Refactored passthrough.c.
    * Test suite bugfixes for TDB/QDBM.
    * BerkeleyDB support warns if data base size approaches file size
      resource limit.

    0.15.8 2003-10-29

    * Modified handling of mime attachments to decode rfc822 and to
      ignore applications and images.
    * Added decoding of percent escaped characters in URLs.
    * Script tuning/bogotune rewritten as C program.
    * Added man page for bogotune.
    * Print "X-Bogosity" line when "-t" is used alone.
    * Change bogoupgrade back to using 2 arg open for perl-5.6
      compatibility.
    * Initialize wordhash storage.
    * Fix initialization problem that prevents reading more than
      one msg-count file.
    * Configure script modified to better detect BerkeleyDB libs.
    * Makefile modified to build bogolexer and bogoutil with fewer
      shared libs.
    * Fix build problem in doc directory.
    * English and French versions of bogofilter-faq.html revised.
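
    To illustrate the percent-escape decoding listed above: spammers can
    hide words in URLs by writing characters as %XX, and decoding them
    first lets the tokenizer see the real text. The Python sketch below
    shows the idea only; the function name is hypothetical and this is
    not bogofilter's lexer code.

```python
import re

# A percent escape is '%' followed by exactly two hex digits.
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")

def decode_percent_escapes(text):
    """Replace each %XX escape with the character it encodes.

    Malformed escapes (for example "%zz" or a trailing "%") are left
    untouched. Illustrative sketch only.
    """
    return _PCT.sub(lambda m: chr(int(m.group(1), 16)), text)
```

    For example, "http://example.com/%76%69agra" decodes to
    "http://example.com/viagra".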

    0.15.7 2003-10-13 - Stable Release

    * Added decoding of escaped characters in html.
    * Disable header line tagging when processing msg-count files.
    * Revised mailbox processing so type recognition is now table driven.
    * Include all tokens in bogoutil dump output (unless in maintenance
      mode).
    * Added support for ANT mailboxes.
    * Made portability changes for OS/2 and RISC-OS

    0.15.6 2003-10-02

    * Fixed problem in bogoupgrade.
    * Don't allow whitespace in SMTP and ESMTP tokens.
    * Revised reference outputs for SMTP/ESMTP change.
    * Revised configure.ac to remove GLIBC-2.3 dependency and include info
      for DOS-ish file in config.h.
    * Bogofilter can now use GSL 1.0 to 1.3 as well as 1.4.
      If your distribution splits GSL into a library and a developer
      package (Mandrake and Debian Linux), remember to install both!
    * Rebuild i586.rpm with GSL dynamically linked; source rpms
      and bogofilter-static rpm not affected.

    0.15.5.2 2003-10-01

    * Rebuild of i586.rpm with GSL-1.4 dynamically linked.
      Source rpms and bogofilter-static rpm not affected.

    0.15.5.1 2003-09-30

    * Added GSL-1.4 as requirement for binary rpm.
    * Fixed up t.separate reference test and cleaned up t.degen,
      t.split, and t.regtest.
    * Man page and French FAQ revised.

    0.15.5 2003-09-29

    * Added '-H' (header-degen) option to aid transition to new
      parsing. See RELEASE.NOTES-0.15 for more info.

    * GNU GSL 1.4 has replaced DCDFLIB.
    * VERPs (Variable Envelope Return Paths) now have their sequence
      numbers replaced by a '#' for scoring.
    * Fixed problem that caused auto-update ("-u") to not update
      separate wordlists.
    * Fixed processing of rmail files.
    * Transaction code added for wordlist maintenance.
    * End-of-header code revised to ensure that passthrough ("-p")
      properly places the X-Bogosity line.
    * Fixed logging behavior when scoring mailboxes, maildirs, etc.
    * Timestamp code refactored and moved from maint.c to datastore.c
    * Added support for OS/2's file system.
    * Minor revisions of RISC-OS compatibility code.
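
    To illustrate the VERP change listed above: a VERP address carries a
    per-message sequence number, so every message would otherwise produce
    a brand-new token. The Python sketch below shows one way such numbers
    could be collapsed to '#'. The address layout (digits set off by '-'
    in the local part) and the function name are assumptions made for
    this example; bogofilter's actual pattern may differ.

```python
import re

def normalize_verp(address):
    """Replace '-'-delimited digit runs in the local part with '#'.

    Assumes addresses shaped like "list-bounces-12345@example.org".
    Illustrative sketch only.
    """
    local, at, domain = address.rpartition("@")
    if not at:
        return address  # not an address, leave unchanged
    # A digit run counts as a sequence number when it follows '-' and
    # is followed by another '-' or the end of the local part.
    local = re.sub(r"(?<=-)\d+(?=-|$)", "#", local)
    return local + at + domain
```

    With this normalization, "list-bounces-12345@example.org" and
    "list-bounces-12346@example.org" both become
    "list-bounces-#@example.org" and are scored as the same token.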

    0.15.4 2003-09-20

    * Additional header line tagging as suggested by Michael O'Reilly.
    * No longer ignoring message separators.
    * Revise parsing pattern for "encoded text" and regression
      test for folded text.
    * Added BOGOTEST environment variable to enable flex debugging.
    * Report if database file permissions wrong.
    * No longer including pid in syslog error messages.
    * Fixed bogoutil problem with '-w' and '-p'.
    * Use GSL (the Gnu Scientific Library) when it's available.
    * Minor revision of bogotune.
    * Minor revision of bogominitrain.pl

    0.15.3 2003-09-10

    * Fix auto-update ('-u') bug that double registers ham and spam.
    * Revised parsing to ignore additional headers, i.e.
      Resent-Message-ID, In-Reply-To, and References.
    * Fixed maintenance mode (broken during database API rewrite).
    * Added regression test for maintenance mode.
    * Re-organized test framework to put all scripts in src/tests,
      all input files in src/tests/inputs, and reference outputs
      in src/tests/outputs.
    * Correct QDBM optimization problems arising from API change.

    0.15.2 2003-09-07

    * Header line unfolding now handled by flex rules.
      Special thanks to Michael O'Reilly for his help!
    * Fatal flex errors are now caught and bogofilter exits
      gracefully after closing its database(s).

    * Initial release of RISC-OS support, including qdbm and tdb.
    * QDBM is now supported.
    * The data base configuration has changed. --with-tdb is gone,
      use --with-database=db, --with-database=tdb or
      --with-database=qdbm instead.

    * Updated bogowordfreq to work with bogoreader.

    0.15.1 2003-09-03

    * Check for xmlto during configuration.
    * Fix problem in empty line parsing rule.
    * Fix string termination problem for bulk mode paths.
    * Limit size of unfolded header lines.

    * Allow -I to be used with file or directory.
    * Revise flex rule for encoded text to reduce program size.

    * Revise flex grammar:
      - to reduce size of generated rules
      - to simplify handling of header tags and mime parts

    * Clean-up message header processing:
      - Don't tokenize message separator lines.
      - Merge whitespace separated encoded words.
      - Unfold header lines.
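
    The header clean-up steps above can be sketched in a few lines of
    Python. Unfolding joins a continuation line (one that starts with a
    space or tab) to the line before it, and whitespace between two
    adjacent encoded words ("=?charset?encoding?text?=") is dropped so
    they decode as one piece. The function names and regular expressions
    are assumptions made for this example; bogofilter implements these
    steps with flex rules.

```python
import re

def unfold_header(raw):
    """Join folded header lines: drop a line break that is followed
    by a space or tab, keeping that whitespace."""
    return re.sub(r"\r?\n(?=[ \t])", "", raw)

# One encoded word: =?charset?B-or-Q?encoded-text?=
_WORD = r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?="
_ADJACENT = re.compile(r"(" + _WORD + r")\s+(?=" + _WORD + r")")

def merge_encoded_words(value):
    """Remove whitespace that separates two adjacent encoded words."""
    return _ADJACENT.sub(r"\1", value)
```

    Unfolding should run first, since a long encoded subject is usually
    split across folded lines.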

    0.15.0 2003-08-30

    * Implemented a new, more robust, mail reading module that knows how
      to split a mbox into messages and read Maildirs.
    * Implement support for MH directory (such as used by Sylpheed).

    * Change mime boundary line to operate on raw input,
      i.e. before decoding it.
    * Revise mime processing to cure "fatal flex scanner internal
      error--end of buffer missed".
    * Restore parsing rule for ending a "loose" html comment.
    * Add charset map for windows-1251 to KOI8-R (Cyrillic).

    * 64-bit printf fixes for %*s string formatting.

    -- 
    David Relson                   Osage Software Systems, Inc.
    relson@osagesoftware.com       Ann Arbor, MI 48103
    www.osagesoftware.com          tel:  734.821.8800
    ##########################################################################
    # Send submissions for comp.os.linux.announce to: cola@stump.algebra.com #
    # PLEASE remember a short description of the software and the LOCATION.  #
    # This group is archived at http://stump.algebra.com/~cola/              #
    ##########################################################################
    
