spf-discuss

Re: Re: [draft-schlitt-spf-classic] Small change in Received-SPF heade

2005-01-05 07:30:43

----- Original Message -----
From: "Stephane Bortzmeyer" <bortzmeyer(_at_)nic(_dot_)fr>
To: <spf-discuss(_at_)v2(_dot_)listbox(_dot_)com>
Sent: Wednesday, January 05, 2005 8:22 AM
Subject: [spf-discuss] Re: [draft-schlitt-spf-classic] Small change in Received-SPF heade


On Wed, Jan 05, 2005 at 02:13:38PM +0100,
Julian Mehnle <bulk(_at_)mehnle(_dot_)net> wrote
a message of 29 lines which said:

You don't want to apply bayesian filtering to raw e-mail headers,

On the contrary, you must do it (see
http://www.paulgraham.com/better.html):

Markovian filtering is better. See http://crm114.sourceforge.net for details. (I used to work with the guy who wrote that; he's an old friend.)

But I think the most important difference is probably that they
ignored message headers. To anyone who has worked on spam filters,
this will seem a perverse decision. And yet in the very first filters
I tried writing, I ignored the headers too. Why? Because I wanted to
keep the problem neat. I didn't know much about mail headers then, and
they seemed to me full of random stuff. There is a lesson here for
filter writers: don't ignore data. You'd think this lesson would be
too obvious to mention, but I've had to learn it several times.

Headers are full of random crap, and they don't work well as filter input against the various email worms and networks of spam zombies out there now. But they do work well when spammers and email worms use consistent syntax and software for their misbehavior.

One was *supposed* to use that header haiku to verify that one's email was legitimate, for example, but it quickly turned into a sign that the email was actually spam. I expect exactly this to happen with SenderID: spammers will easily use it to "verify" the legitimacy of their email, or use stolen accounts with valid SenderID tools. It will then be a strong indicator that the email is in fact spam, rather than the planned indicator that the email is legitimate.
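
To make the point concrete, here is a minimal sketch of how a statistical filter that tokenizes headers could end up scoring a "verification" header as a spam sign. Everything in it is an assumption for illustration: the header name X-Verified-Sender is made up, the six-message corpus is a toy, and the probability formula is a simplified per-token ratio, not the exact formula from the Graham essay or anything CRM114 does.

```python
from collections import Counter
from email import message_from_string


def header_tokens(raw_message):
    """Yield tokens tagged with the header they came from."""
    msg = message_from_string(raw_message)
    for name, value in msg.items():
        name = name.lower()
        # The mere presence of a header counts as a token of its own.
        yield "header*" + name
        for word in str(value).lower().split():
            yield name + "*" + word


def train(messages):
    """Count, per token, how many messages in the corpus contain it."""
    counts = Counter()
    for raw in messages:
        counts.update(set(header_tokens(raw)))
    return counts


def spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Simplified estimate of P(spam | token), clamped away from 0 and 1."""
    s = spam_counts[token] / n_spam
    h = ham_counts[token] / n_ham
    if s + h == 0:
        return 0.5  # unseen token: treated as neutral (arbitrary choice)
    return max(0.01, min(0.99, s / (s + h)))


# Toy corpus. "X-Verified-Sender" is a hypothetical header standing in for
# any marker that was meant to signal legitimate mail.
spam = [
    "X-Verified-Sender: yes\nSubject: cheap pills\n\nbuy now",
    "X-Verified-Sender: yes\nSubject: win money\n\nclick here",
    "X-Verified-Sender: yes\nSubject: hot stocks\n\nact fast",
]
ham = [
    "X-Verified-Sender: yes\nSubject: meeting notes\n\nsee attached",
    "Subject: lunch tomorrow\n\nnoon works",
    "Subject: draft review\n\ncomments inline",
]

spam_counts, ham_counts = train(spam), train(ham)
p = spam_probability("header*x-verified-sender",
                     spam_counts, ham_counts, len(spam), len(ham))
print(round(p, 2))  # 0.75
```

In this toy data the header shows up in all of the spam and only a third of the legitimate mail, so the filter learns it as evidence of spam, which is the opposite of what the header was meant to say.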