Setup:
procmail v 3.22
SpamAssassin v 3.3.2
Debian Linux (testing)
I'm working in a procmail sandbox, trying to teach SpamAssassin to do a
better job of recognizing spam. $sandbox/.procmailrc calls spamc like so:
,----
| :0fw
| | /usr/bin/spamc
|
| :0:
| * ^X-Spam-Status: Yes
| spama_spam_.in
|
| :0
| post_sa.in
`----
I'm running procmail like this:
cat mixedmail_3000m | formail -e -s procmail -m ${sandbox}/.procmailrc
I've accumulated a large pile of mixed spam and ham, some 60,000
messages, leaning heavily toward spam.
I'd like a little coaching on running a SpamAssassin learning session.
First, does it matter whether I have autolearn disabled in the SA
config file for the duration? I want to feed SA the spam and ham
manually, so I'm assuming I'd want autolearn off.
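For reference, autolearn is governed by the bayes_auto_learn option. A
minimal sketch of switching it off for the training period, assuming the
site-wide configuration lives in /etc/spamassassin/local.cf (the path is
an assumption, it is not stated above) and that spamd is restarted
afterwards so that spamc picks up the change:
,----
| # assumed location: /etc/spamassassin/local.cf
| # keep the Bayes classifier itself enabled
| use_bayes 1
| # do not learn automatically from scanned mail
| bayes_auto_learn 0
`----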
Another thing I wonder:
I planned to feed 3,000 messages to SA, then pick out by hand the spam
that was wrongly filed as ham and feed it back with an
`sa-learn --mbox --spam falseham' command.
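The manual correction step described above could be wrapped in a small
helper. This is only a sketch: `train' and the SA_LEARN override are
hypothetical names introduced for illustration, and the only sa-learn
options used are --mbox, --spam and --ham.

```shell
# Sketch of a manual training helper (names are hypothetical).
# SA_LEARN can be overridden, e.g. SA_LEARN=echo, to preview the command
# line instead of actually training.
SA_LEARN=${SA_LEARN:-sa-learn}

# train CLASS MBOX: learn every message in MBOX as CLASS ("spam" or "ham").
train() {
    case "$1" in
        spam|ham) ;;
        *) echo "usage: train spam|ham MBOX" >&2; return 2 ;;
    esac
    "$SA_LEARN" --mbox "--$1" "$2"
}
```

With that in place, `train spam falseham' runs the command quoted above,
and `train ham FILE' would be the counterpart for ham that was wrongly
filed as spam. Running `sa-learn --dump magic' afterwards prints the
Bayes database counters, including how many spam and ham messages have
been learned so far; with default settings SA does not use Bayes scores
until it has learned at least 200 of each.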
Now if I run the same several thousand messages through SA as incoming
mail a second time, will SA do a better job of separating the ham
and spam? Or do I need to use different, unprocessed mail for the
second run?
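One way to see whether a second pass sorts the same mail differently is
to count the messages in each output mbox after every run and compare
the numbers. A minimal sketch, assuming the two mbox names from the
recipe above and that every message starts with an mbox "From "
separator line; `count_mbox' is a hypothetical helper name:

```shell
# count_mbox FILE: print the number of messages in an mbox file by
# counting its "From " separator lines. This overcounts if a message
# body contains an unescaped line that starts with "From ".
count_mbox() {
    grep -c '^From ' "$1"
}
```

For example, `count_mbox spama_spam_.in' and `count_mbox post_sa.in'
after each run give the spam/ham split for that run. The output mboxes
need to be emptied or renamed between runs, since procmail appends to
them.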
How many messages would make an effective learning session?