procmail

Re: how to strip SA's message markup?

2004-02-22 09:15:45
On 21 Feb 2004, at 17:17, Gary Funck wrote:
I'd like to write a quick recipe for removing spamassassin's markup. ('spamassassin -d' does that, but when piped from formail on a 50,000 message corpus,

Why would you involve formail? Doesn't spamassassin handle mboxes anymore?

 it will literally take all day.)

I recently fed over 250,000 messages through SA and it took less than all day.

I could write a Perl program, and invoke SA's methods directly, but thought that procmail should be up for the task.

If all you want to do is delete the SA wrapper, you simply have to remove all the lines in the message up to (but not including) the first line matching ^Return-Path:. That removes the SA headers. Then remove the final MIME boundary (it will be the last non-blank line and will start with ------).

Once that is done, you need to recreate the "From " separator line if you are using mbox to store the mail.
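
The steps above could be sketched roughly as follows. This is only an illustration in Python, not a procmail recipe and not something SA provides: the function name strip_sa_wrapper, its from_line parameter, and the MAILER-DAEMON fallback are all made up for the example. It also assumes the wrapped original really does begin with a Return-Path: header, and that the closing boundary is the last non-blank line.

```python
import time


def strip_sa_wrapper(raw, from_line=None):
    """Undo the SA wrapper on one message (hypothetical helper).

    raw       -- the wrapped message as a single string
    from_line -- mbox "From " separator to prepend; if omitted, a
                 placeholder one is built (an assumption, adjust to taste)
    """
    lines = raw.splitlines()

    # Step 1: drop everything before the first Return-Path: line.
    start = None
    for i, line in enumerate(lines):
        if line.startswith("Return-Path:"):
            start = i
            break
    if start is None:
        # No Return-Path: header found, so leave the message untouched.
        return raw
    kept = lines[start:]

    # Step 2: remove the final MIME boundary, i.e. the last non-blank
    # line, but only if it really starts with "------".
    last = len(kept) - 1
    while last >= 0 and not kept[last].strip():
        last -= 1
    if last >= 0 and kept[last].startswith("------"):
        del kept[last]

    # Step 3: recreate the mbox "From " separator line.
    if from_line is None:
        from_line = "From MAILER-DAEMON " + time.asctime()

    return from_line + "\n" + "\n".join(kept) + "\n"
```

To run it over a whole mbox you would still need to split the file into messages first and call this once per message; that part is left out here.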

However, if you have 50,000 messages tagged as SPAM by SA that are not spam... well, that's a problem as well.

--
"Send beer, words simply can't adequately express your gratitude" - James Sedgwick


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail