Re: getting 2822 protection as well as 2821 protection

Dustin D. Trammell wrote:

So how would you differentiate the above example from this one:

Envelope: i(_dot_)am-spammer(_at_)jimramsay(_dot_)com(_dot_)spammer(_dot_)net
From: i(_dot_)am(_at_)jimramsay(_dot_)com
Reply-to: i(_dot_)am(_dot_)freshened(_dot_)on(_dot_)the(_dot_)spamlist(_at_)jimramsay(_dot_)com(_dot_)spammer(_dot_)net
Sender: i(_dot_)am(_at_)jimramsay(_dot_)com

I would consider that "not close enough" because 'jimramsay.com.spammer.net' is obviously not in the same domain hierarchy as 'jimramsay.com'. I suppose the "right way" would be to match from right-to-left a few levels (more than just 1!) instead of left-to-right:


Comparing to 'jimramsay.com':
'holmes.jimramsay.com' would match
'01.02.03.04.jimramsay.com' would match
'mail.yahoo.com' would not (need more than just '.com' to match)
'jimramsay.com.spoof.org' would not
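
A rough sketch of that right-to-left comparison, in Python. The function name and the min_labels parameter are my own inventions for illustration; min_labels is one way to express "more than just '.com'", and other readings of "a few levels" are possible:

```python
def same_domain_hierarchy(candidate, base, min_labels=2):
    """Return True if `candidate` equals `base` or is a subdomain of it.

    Hypothetical helper: `min_labels` is an assumed knob that rejects a
    base with too few labels (e.g. just 'com').
    """
    cand = candidate.lower().rstrip(".").split(".")
    ref = base.lower().rstrip(".").split(".")
    # A bare TLD is not enough to match on, and the candidate cannot be
    # shorter than the base it is compared against.
    if len(ref) < min_labels or len(cand) < len(ref):
        return False
    # Compare whole labels, starting from the right-hand end.
    return cand[-len(ref):] == ref


for name in ("holmes.jimramsay.com", "01.02.03.04.jimramsay.com",
             "mail.yahoo.com", "jimramsay.com.spoof.org"):
    print(name, same_domain_hierarchy(name, "jimramsay.com"))
```

Comparing whole labels rather than raw string suffixes means something like 'notjimramsay.com' would not be mistaken for a subdomain.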

Or something similar.  Not enough similarities?  Enough to consider it
'first-class'?  I think that if you're doing interesting things with your
envelope, reply-to, etc., then we shouldn't try to detect this and still
classify it as 'first-class', it would simply fall into the
'second-class' bucket and still be seen by the user as probably
legitimate mail.  If we're getting into the business of classifying
mail, 'first-class' should be absolutely verifiable as legitimate and
anything else would be a lesser class.  In the example of using C/R
systems, it's an unfortunate side-effect that the addresses don't match.

True, it is an unfortunate side-effect that the addresses do not match exactly, but the addresses do match within a certain well-defined pattern:


user [ -optionalextensions ] @ [ optionalhostname. ] rest.of.domain.com

I think a pretty good algorithm for deciding whether all the various addresses match would be as follows:

1 - Find the shortest user-part of all the addresses to be compared. Call this 'A'
2 - Find the shortest domain-part of all the addresses to be compared. Call this 'B'

3 - Score starts at 0

4 - If all the user-parts are exactly the same, score +1. Otherwise, if all the user-parts start with 'A', score +0.5
5 - If all the domain-parts are exactly the same, score +1. Otherwise, if all the domain-parts end with 'B', score +0.5

6 - There are three types of match, depending on personal preference:
    - Lenient match - consider 'first-class' if Score >= 1
    - Conservative match - consider 'first-class' if Score > 1
    - Strict match - consider 'first-class' if Score == 2

A user could choose Lenient match, Conservative match, or Strict match depending on what they think is 'good enough'.
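
To make the scoring concrete, here is a minimal Python sketch of the steps above. The names match_score, is_first_class and THRESHOLDS are made up for illustration, the addresses are assumed to be already extracted from the headers, and I am reading the "+1" and "+0.5" in each step as mutually exclusive:

```python
def match_score(addresses):
    """Score a list of 'user@domain' strings per the steps above."""
    users, domains = [], []
    for addr in addresses:
        # Split on the last '@'; compare case-insensitively (an assumption).
        user, _, domain = addr.lower().rpartition("@")
        users.append(user)
        domains.append(domain)

    a = min(users, key=len)    # step 1: shortest user-part, 'A'
    b = min(domains, key=len)  # step 2: shortest domain-part, 'B'
    score = 0.0                # step 3

    # Step 4: exact match scores 1, common prefix 'A' scores 0.5.
    if all(u == a for u in users):
        score += 1.0
    elif all(u.startswith(a) for u in users):
        score += 0.5

    # Step 5: exact match scores 1, common suffix 'B' scores 0.5.
    # Note: a plain suffix test would also accept 'notjimramsay.com';
    # a label-wise comparison would be stricter.
    if all(d == b for d in domains):
        score += 1.0
    elif all(d.endswith(b) for d in domains):
        score += 0.5
    return score


# Step 6: the three user-selectable thresholds.
THRESHOLDS = {
    "lenient": lambda s: s >= 1,
    "conservative": lambda s: s > 1,
    "strict": lambda s: s == 2,
}


def is_first_class(addresses, mode="conservative"):
    return THRESHOLDS[mode](match_score(addresses))


spoofed = ["i.am-spammer@jimramsay.com.spammer.net", "i.am@jimramsay.com",
           "i.am.freshened.on.the.spamlist@jimramsay.com.spammer.net"]
print(match_score(spoofed))  # 0.5: user-parts share a prefix, domains do not match
```

With this reading, the spammer example quoted at the top scores only 0.5 and fails every mode, while something like 'jim-list@holmes.jimramsay.com' paired with 'jim@jimramsay.com' scores 1.0 and passes only the Lenient match.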

I suppose my other question is: What about SRS? Won't all SRS-forwarded mail also end up as not-first-class?


--
Jim Ramsay
"Me fail English?  That's unpossible!"