Re: a header authentication scheme


On Oct 20 2004, Arnt Gulbrandsen wrote:

The idea sounds good to me.

Laird Breyer writes:

For example, if the topmost Received: line (at M3) is

   Received: from 185129182.virtua.com.br (185129182.virtua.com.br
       [200.185.129.182]) by smtpin-3211.bay.webtv.net (WebTV_Postfix+sws)
       with SMTP id 7280B11DCC; Fri,  5 Mar 2004 19:00:48 -0800 (PST)

then a Processed: header of the form

   Processed: name="SpamAssassin"; location-ip="1.2.3.4";
       version="2.63"; function="spamcheck";
       auth-received="7280B11DCC"; result-tag="spam";

is guaranteed to have been added *after* the Received: line was
inserted in the message, i.e. at or after M3. Such a Processed: line is
unforgeable, unless the auth-received value can somehow be predicted
with high probability.
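
To make the check concrete, here is a minimal Python sketch of how a reader
of the headers might compare the two lines from the example above. The
function names and the regular expressions are my own assumptions for
illustration; the proposal itself does not specify a parsing procedure, and
real Received: lines vary more than this handles.

```python
import re
from typing import Dict, Optional

# The two example headers from the text, unfolded onto single lines.
RECEIVED = ("Received: from 185129182.virtua.com.br (185129182.virtua.com.br "
            "[200.185.129.182]) by smtpin-3211.bay.webtv.net (WebTV_Postfix+sws) "
            "with SMTP id 7280B11DCC; Fri,  5 Mar 2004 19:00:48 -0800 (PST)")
PROCESSED = ('Processed: name="SpamAssassin"; location-ip="1.2.3.4"; '
             'version="2.63"; function="spamcheck"; '
             'auth-received="7280B11DCC"; result-tag="spam";')

def received_id(received: str) -> Optional[str]:
    """Return the token following 'id' in a Received: line, if any."""
    m = re.search(r"\bid\s+([^\s;]+)", received)
    return m.group(1) if m else None

def processed_params(processed: str) -> Dict[str, str]:
    """Collect the name="value" pairs of a Processed: line (hypothetical syntax)."""
    return dict(re.findall(r'([\w-]+)="([^"]*)"', processed))

def is_authenticated(processed: str, topmost_received: str) -> bool:
    """True if auth-received matches the ID in the topmost Received: line."""
    rid = received_id(topmost_received)
    return rid is not None and processed_params(processed).get("auth-received") == rid

print(is_authenticated(PROCESSED, RECEIVED))
```

A Processed: line carrying any other auth-received value would fail the
comparison, which is the whole point of the scheme.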


Actually, the spammer only needs to try harder. If he can predict it
with 1% accuracy, he only needs to deliver 150 spams to have roughly an
80% chance of passing the test. I've seen spammers try much more than
150 times.
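
The arithmetic behind that figure can be checked with a short Python sketch.
It assumes each delivery is an independent guess with the same success
probability p, which is my simplification; the function names are made up
for illustration.

```python
import math

def success_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent guesses succeeds."""
    return 1.0 - (1.0 - p) ** n

def attempts_needed(p: float, target: float) -> int:
    """Smallest n for which success_probability(p, n) reaches target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

print(round(success_probability(0.01, 150), 3))
print(attempts_needed(0.01, 0.80))
```

With p = 0.01, 150 deliveries give about a 78% chance of at least one hit,
close to the 80% quoted, and 161 deliveries are needed to actually reach 80%.
The same functions show how quickly the required effort grows as p shrinks.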


What you say is true in general, but the whole point is to see whether
the guessing game is difficult enough to be impractical, so I'll give some
arguments against your point so we can understand the issue better.

The general tactic for a spammer is to guess the ID which will be
added to that particular instance of submitted mail. For the protection
to be effective, this guessing game has to be very hard, i.e. as difficult
as a cryptographic guessing game.

Now each mail passing through the smtpin-3211.bay.webtv.net server
will get its own unique ID number when the Received: line is written,
and it won't do for the spammer to guess some other message's ID. So
if the spammer sends 150 mail messages, each one that gets through
will be assigned a different ID by the receiving SMTP server.

A typical example where spammers submit 150 or more messages is when
they are fishing for email addresses. They'll start at the top of the 
alphabet and try variations of email addresses. Since these email
addresses don't change over short time frames, once they hit a working mail
address, the spam gets through. 

With the authentication by ID (and similarly by date stamp), a spammer
needs to be able to predict the sequence of IDs used by the SMTP
server at the precise moment when the message reception is being
negotiated, which I believe is a lot more difficult. I would expect
that guessing an ID such as 7280B11DCC correctly at a busy server
would require much more than 150 tries between successes, certainly an
order of magnitude more work than guessing an email address.

And once an ID has been guessed successfully, its value is useless for the 
next message, since that will have a different ID altogether.

But I don't know how SMTP implementors generate ID numbers. If somebody
can explain this, it would be useful.


But since the receiver can make predicting arbitrarily difficult, I
think it's okay. SMTP receivers seem fairly eager to make life difficult
for spammers. Quelle surprise.
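
As an illustration of how a receiver could make prediction hard, here is a
minimal Python sketch that draws the ID from a cryptographically strong
random source. This is purely an assumption about what a receiver might do,
not a description of how any existing SMTP server generates its IDs; the
function names and the 10-hex-digit length (matching 7280B11DCC) are chosen
for the example.

```python
import secrets

def new_queue_id(hex_digits: int = 10) -> str:
    """Return a random uppercase hex ID of the requested length."""
    nbytes = (hex_digits + 1) // 2          # token_hex(n) yields 2*n hex characters
    return secrets.token_hex(nbytes)[:hex_digits].upper()

def guess_probability(hex_digits: int = 10) -> float:
    """Chance that a single blind guess hits a uniformly random ID."""
    return 16.0 ** -hex_digits

print(new_queue_id())
print(guess_probability())
```

Under that assumption a single guess succeeds with probability 16^-10,
about 9e-13, and lengthening the ID makes the guessing game harder still.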


Yes, it's possible to implement variations of the idea, but what's nice
about what I'm presenting is that it slots directly into the existing
RFC 2822 framework, so it could be implemented today and everywhere.

-- 
Laird Breyer.