Flow diagram for receiver processing


All, 

I worked out a provisional flow diagram for processing incoming mail as
follows:


[START]
    NIL -> Authenticate

Authenticate
    PASS -> Authorize
    FAIL -> Forwarder
    UNDEFINED -> Filter

Authorize
    GOOD -> [ACCEPT]
    BAD -> [REJECT]
    Insufficient -> Filter

Forwarder [*]
    PASS -> Authorize
    FAIL -> [REJECT]
    Unknown -> Filter

Filter
    GOOD -> [ACCEPT]
    BAD -> [REJECT]

[*] The Forwarder step is there to recognize the fact that a MARID (or SPF)
fail may be because the message was forwarded. This step may involve
checking headers for consistency, checking known forwarding relationships,
or checking proof of consent. Or it might just always return Unknown
because there is no good way to do this step.
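
As an illustration only, the diagram above can be written down as a lookup
table and walked mechanically. This is a minimal Python sketch, not a
proposed implementation: the run() function and the `checks` callables are
hypothetical stand-ins for the real authentication, authorization,
forwarder and filter steps, none of which are specified here.

```python
# Transition table copied from the diagram; bracketed names are
# start/terminal states.
TRANSITIONS = {
    "[START]": {"NIL": "Authenticate"},
    "Authenticate": {"PASS": "Authorize", "FAIL": "Forwarder",
                     "UNDEFINED": "Filter"},
    "Authorize": {"GOOD": "[ACCEPT]", "BAD": "[REJECT]",
                  "Insufficient": "Filter"},
    "Forwarder": {"PASS": "Authorize", "FAIL": "[REJECT]",
                  "Unknown": "Filter"},
    "Filter": {"GOOD": "[ACCEPT]", "BAD": "[REJECT]"},
}
TERMINAL = {"[ACCEPT]", "[REJECT]"}


def run(checks):
    """Walk the diagram and return the list of states visited.

    `checks` maps a step name to a zero-argument callable returning one of
    that step's outcome labels (hypothetical stand-ins for real checks).
    """
    state = "[START]"
    path = [state]
    while state not in TERMINAL:
        # [START] has a single unconditional edge labelled NIL.
        outcome = "NIL" if state == "[START]" else checks[state]()
        state = TRANSITIONS[state][outcome]
        path.append(state)
    return path


# A sender with no MARID record: authentication is UNDEFINED, so the
# message falls through to the filter.
no_marid = run({"Authenticate": lambda: "UNDEFINED",
                "Filter": lambda: "GOOD"})
print(" -> ".join(no_marid))
```

The table has no cycles (Forwarder can only move on to Authorize, Filter or
[REJECT]), so the walk always terminates.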


Without any MARID deployment the only path through the diagram would be 

[START] -> Authenticate -> Filter -> [ACCEPT] | [REJECT]

Say this results in the following error rates:

false positives: f_p
false negatives: f_n


Now say we have a deployment fraction d and we have 100% accurate
authorization and no forwarders. Our error rate is now:

false positives: (1-d) f_p
false negatives: (1-d) f_n

[START] -> Authenticate (-{1-d}-> Filter -> [ACCEPT] | [REJECT]
                        -{d-x}-> Authorize -> [ACCEPT] | [REJECT]
                        -{x}-> Forwarder -> Authorize -> [ACCEPT] | [REJECT])

As d tends to 1 the error rates tend to zero.
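
A small Python sketch of the scaling above, under the same idealized
assumptions (100% accurate authorization, no forwarders). The baseline
values chosen for f_p and f_n are made up purely for illustration and are
not measurements.

```python
def error_rates(d, f_p, f_n):
    """Return (false positive rate, false negative rate) at deployment d.

    Only the (1 - d) share of mail that still reaches the filter can be
    misclassified; the d share is assumed to be authorized perfectly.
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("deployment fraction d must be between 0 and 1")
    return (1.0 - d) * f_p, (1.0 - d) * f_n


# Hypothetical filter-only baseline rates, for illustration only.
F_P, F_N = 0.01, 0.10

for d in (0.0, 0.25, 0.5, 0.9, 1.0):
    fp, fn = error_rates(d, F_P, F_N)
    print(f"d={d:.2f}  false positives={fp:.4f}  false negatives={fn:.4f}")
```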

This approximation is actually much better for high values of d than for low
values. First we have to assume that the forwarder issue will be solved if
there is significant deployment. Second, the authorization step is likely to
become very accurate since third party accreditors will only be relied on if
they provide very high levels of accuracy and near perfect predictions that
a message is not spam.

The value to the sender of configuring for MARID is much greater: by
configuring for MARID they can ensure that their messages do not become part
of the false positive pile.


To get the actual results we have to work through all the paths in the
diagram. We also have to look at the cases where the authorization procedure
gives the wrong result or no result at all. I will do this later on in a
Word document (I have reached the plaintext limit here).


The other part of the story beyond mere accuracy is coverage. So far we have
been looking at using MARID with some form of spam filter in place. The
reason the spammers persist is that they believe that they can stay one step
ahead of the filter writers. But MARID is much harder to game than a filter,
and the actions needed to game it would require very high levels of
expertise: things like theft of actively used BGP address blocks or DNS
poisoning. These are things we know how to fix once and for all.

Imagine using MARID with a filter that always returns the result BAD. The
only way to send mail is now to use MARID. I don't think we will get there
for some years, but at some point we will. In the meantime the filter
configurations will increasingly concentrate on reducing f_n even at the
cost of what would previously have been unacceptable levels of f_p.


                        Phill