Re: bayesian filter training question
From: Roberto C. Sanchez (roberto_at_familiasanchez.net)
Date: 09/30/05
- Previous message: Simo Kauppi: "Re: permissions below /dev/ across reboots"
- In reply to: Kjetil Kjernsmo: "Re: bayesian filter training question"
Date: Fri, 30 Sep 2005 07:36:05 -0400
To: debian-user@lists.debian.org
On Fri, Sep 30, 2005 at 09:14:53AM +0200, Kjetil Kjernsmo wrote:
> On torsdag 29 september 2005, 21:51, Roberto C. Sanchez wrote:
> > So, I finally decided to get with the 20th century and install
> > spamassassin (actually spampd hooked through postfix) to do site-wide
> > spam filtering for my server.
>
> Yiiihaaa!
>
> > My question is this. As I am training
> > it with sa-learn, is it (good|bad|indifferent) to train it on spam
> > that has already been flagged as spam? That is, will this reinforce
> > spamassassin's notion of spam or ruin it?
>
> No, that's fine. In fact, SA has this autowhitelist concept that does
> exactly that (it's not really a whitelist, though, more an "evening out
> weird things that may happen", I'm not using it).
>
> You should have a good look at bayes_ignore_header, so that it won't
> train on things that are obviously in spam. SA is pretty good at this
> itself, but if you see spam that has been filtered elsewhere a lot, be
> sure to use it.
>
> I'm guessing that you, like me, are doing this for your family. In that
> case, I have found that it is quite sufficient to train a single
> database with the spam and ham of the entire family. If you have more
> diverse users, you would probably need to have a per-user
> configuration. For example, a friend of mine has an uncle who is a
> psychiatrist working with people with gambling obsessions, and SA was
> pretty catastrophic for him until he got a per-user config.
>
> Finally, I found that SA, in its default 3.0 form, was much too
> conservative about the assigned scores, so I have a bunch of rules that
> I have adjusted the score of. You'll get some experience about that in
> time, I guess. Also note that SA 3.1 has been released upstream.
>
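For reference, the two settings mentioned above (bayes_ignore_header and per-rule score adjustments) might look roughly like the following in a site-wide local.cf. This is only a sketch and is not taken from either poster's setup: the header name X-Upstream-Spamfilter and the score value 4.5 are made-up placeholders, and which rules deserve a different score depends entirely on your own mail.

```
# Hypothetical local.cf fragment; the header name and score below are
# illustrative assumptions, not recommended values.
use_bayes 1

# Do not let the Bayes tokenizer learn from a header stamped by some
# other (upstream) filter. "X-Upstream-Spamfilter" is a placeholder.
bayes_ignore_header X-Upstream-Spamfilter

# Raise the score of a single rule. BAYES_99 is a stock rule name;
# 4.5 is an arbitrary example value.
score BAYES_99 4.5
```

After changing local.cf, spamassassin --lint is a quick way to check that the file still parses.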
Cool. Thanks for the quick informative reply.
-Roberto
-- Roberto C. Sanchez http://familiasanchez.net/~roberto