Skip Montanaro wrote:
[...]
In the Spambayes project this stuff is called "word salad". (I doubt the
term originated with us.) The one conclusion we've reached so far about it
is that it generally doesn't bother the accuracy of our classifier.
I've been following similar discussions on the spamassassin and
bogofilter lists. The general consensus is that the use of random words,
or even strings of text from classical literature, actually helps mark
the mail as spam. After all, I'm not likely to have strings of classical
literature in my mail, while a lit major will probably not have as much
technical jargon. Once trained, the Bayes tools do a good job of
recognizing what's out of place, rather than what's "wrong".
A fixed filter isn't necessarily impossible, but it will be tricky to
avoid false positives.
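Here's a toy sketch of why the word salad backfires. It assumes a plain
naive Bayes scorer with add-one smoothing, which is not the actual
Spambayes, spamassassin, or bogofilter algorithm. The function names
(train, spam_log_odds) and the sample messages are made up for
illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count how many messages each word appears in."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts, len(messages)

def spam_log_odds(text, ham, spam):
    """Sum per-word log-likelihood ratios; a positive total leans spam."""
    ham_counts, n_ham = ham
    spam_counts, n_spam = spam
    score = 0.0
    for word in set(text.lower().split()):
        # Add-one smoothing: a word seen in neither corpus contributes 0.
        p_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_ham = (ham_counts[word] + 1) / (n_ham + 2)
        score += math.log(p_spam / p_ham)
    return score

# Hypothetical training data: my own (technical) mail vs. spam padded
# with literary filler.
ham = train([
    "procmail recipe fails on regex match",
    "kernel patch breaks the mail server build",
    "regex for header match in procmail recipe",
])
spam = train([
    "buy pills now whale ahab harpoon sea",
    "cheap pills tonight ahab whale voyage",
    "pills discount harpoon sea captain whale",
])

salad = "whale harpoon ahab sea pills"
normal = "procmail regex recipe for mail header"

print("word salad:", spam_log_odds(salad, ham, spam))
print("normal mail:", spam_log_odds(normal, ham, spam))
```

With this training data the literary filler pushes the score toward spam,
because those words only ever showed up in my spam. A lit major's
training data would presumably tilt the other way.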
- Bob
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail