ietf-mxcomp

Sender ID and CSV are complementary (long - sorry) (was: Unified SPF overlaps with CSV)

2004-06-29 17:03:00

"Douglas" == Douglas Otis <dotis(_at_)mail-abuse(_dot_)org> writes:

    Douglas> What identity is used to assert accountability?  Due to
    Douglas> this weak from of identity, SPF/CID _MUST_ never be used
    Douglas> to assert accountability.  If mail was injected somewhere
    Douglas> inside a nebulous region devoid of SPF/CID checks, how is
    Douglas> this detected?  Are those spoofed to be soiled by a
    Douglas> machine somewhere not tied down properly?

Having been skimming this thread, and not entirely understanding what
it was about, I'll have to say I think I agree with this point,
although I don't think it's very clearly stated.

I'm starting to understand why CSV could be potentially valuable...

NB: The following discussion relates to the drafts that have been
submitted to this WG, and to the published SPF ID
(draft-mengwong-spf-01).  I'm aware of some discussion on the SPF list
and subsequently here about "Unified SPF" but no ID yet exists to my
knowledge; I'll leave my comments about that till the end...

Both draft-mengwong-spf-01 (SPF) and draft-ietf-marid-core-01
(SenderID) are to do with authenticating *messages*.  If you receive a
message that fails the SPF or SenderID checks, you know the *message*
is forged, but you know nothing about who forged it.  The fact that an
MTA sends you a message that fails SPF/SenderID checks tells you
nothing about the propriety of the MTA that sends it to you; just that
a forged message was injected into the mailstream at some point, and
that MTA (quite possibly perfectly innocently) happened to attempt to
deliver it to you.

draft-ietf-marid-csv-* authenticates the sending MTA directly (but it
doesn't authenticate the message in any way); if you get CSV
authentication failures, you know that it's the immediate upstream MTA
that's spoofing.  But the fact that the MTA is genuine doesn't prove
that the message is genuine.

What does this gain you in itself?  Probably not much.  It wouldn't be
much effort for spammers/phishers/whatever to give a valid HELO
identity in a throwaway domain registered for that purpose.

But coupled with reputation/accreditation services, it becomes more
valuable.  Doing reputation/accreditation by IP address is problematic
at best -- it interferes with things like network renumbering,
deploying additional MTAs, etc, and has problems when IP addresses are
reallocated to other entities.

Having an authenticated name for the delivering MTA is a more flexible
way of attaching reputations/accreditation to MTAs than the IP
blacklists/whitelists (largely blacklists) we currently use.
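
To make the renumbering point concrete, here is a minimal Python
sketch.  Everything in it (the host name, the addresses, the
reputation tables, the function names) is hypothetical; it is not
taken from any of the drafts and does no real DNS or CSV lookups.  It
only shows why a reputation keyed on an authenticated name survives a
renumbering while one keyed on an IP address does not.

```python
# Hypothetical reputation tables; all entries are made up for illustration.
ip_reputation = {"192.0.2.10": "good"}
name_reputation = {"mail.example.org": "good"}


def lookup_by_ip(ip):
    """Reputation keyed on the connecting IP address."""
    return ip_reputation.get(ip, "unknown")


def lookup_by_name(helo_name, name_is_authenticated):
    """Reputation keyed on the MTA's name.

    The name only counts if an MTA-level check (assumed here to be
    something like CSV) has already confirmed it; an unauthenticated
    HELO name is trivially forged and earns no reputation.
    """
    if not name_is_authenticated:
        return "unknown"
    return name_reputation.get(helo_name, "unknown")


# The operator renumbers: same MTA, same name, new address.
old_ip, new_ip = "192.0.2.10", "198.51.100.7"
helo = "mail.example.org"

print(lookup_by_ip(old_ip))          # reputation known before renumbering
print(lookup_by_ip(new_ip))          # reputation lost after renumbering
print(lookup_by_name(helo, True))    # reputation follows the name
```

The same reasoning covers adding a second MTA under an already
authenticated name, or an IP address being reallocated to someone
else: the IP-keyed table goes stale, the name-keyed one does not.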

And filtering at that level is clearly useful.  Most of us use IP
blacklists.  And we're now living in a world where open relays no
longer account for the majority of spam.  Most spam is now direct to
MX or delivered via open proxies (in both cases with compromised
machines often being used).  So establishing reputation at the
delivering MTA level would seem to be a valuable tool, and having a
mechanism for doing so without requiring that ISPs never renumber
their networks for fear of losing their reputation is clearly a good
thing.

(Obviously reputation at this level is a defense against open relays
too; it's just that they're no longer the main threat.)

It seems to me that SPF/SenderID and CSV solve very different
subproblems of the problem space that this WG is interested in, and in
particular that they may end up using very different notions of
reputation/accreditation.

In a scheme like SPF/Caller ID/Sender ID you're looking at reputations
of e-mail addresses (or at least their domains).  Does this domain
originate mail that is legitimate, or does it originate spam?  People
blacklist/whitelist now on MAIL FROM and From: but it's problematic
because it's easily forged.  If you can prove an association between
the message and the domain, such blacklisting/whitelisting becomes
more reliable.

Also, at least some of the proposals (namely Caller ID and Sender ID)
are potentially very useful components in an anti-phishing
strategy -- i.e. validating the domain that is displayed to the end
user -- even if such strategies also require MUA support.  And
anti-phishing is clearly something that many people want.

In CSV/CSA you're looking at reputations of originating hosts.  Is
this a well run MTA, or is this some random bot, proxy or open relay
that's opening an SMTP connection to me?  Granted the fact that it's a
well run MTA doesn't prove that forged messages or spam won't pass
through it, but in the world we live in a very small proportion of
unwanted mail is originated from well-run MTAs, so reputation at that
level certainly seems valuable.  Most of us do this already based on
IP blacklists.  Blacklisting/whitelisting (particularly when you get
to whitelisting) on an authenticated name of the host seems far more
flexible.

Based on the above, which I realize might involve a misunderstanding
of one or both proposals, it seems to me that Sender ID and CSV
attempt to tackle essentially unrelated subproblems within the MARID
problem space.

I think a lot of the confusion in this and other threads stems from
the fact that the two proposals tackle different subproblems, and many
WG participants are essentially focussed on just one of them.

In fact, it seems to me that the models of reputation and accreditation
services that might evolve within these two problem areas are
potentially quite different.

It also seems to me that Sender ID and CSV can be deployed
independently; both publishers and checkers can decide to implement
one, the other, or both.
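
As a rough illustration of that independence, here is a small Python
sketch of how a checker might treat the two results.  The function
name and the wording of the findings are my own invention, not
anything from the drafts; each argument is True (check passed), False
(check failed) or None (that check is not deployed or gave no result).

```python
from typing import List, Optional


def assess(message_ok: Optional[bool], mta_ok: Optional[bool]) -> List[str]:
    """Report what each check tells us, without mixing the two.

    message_ok: result of a message-level check (SPF/Sender ID style).
    mta_ok:     result of an MTA-level check (CSV style).
    """
    findings = []

    # A message-level failure says the message is forged, but nothing
    # about the MTA that happened to deliver it.
    if message_ok is False:
        findings.append("message is forged; the forger is unknown")
    elif message_ok is True:
        findings.append("sender domain can be used for reputation")

    # An MTA-level failure points at the immediate upstream MTA, but a
    # pass says nothing about whether the message itself is genuine.
    if mta_ok is False:
        findings.append("immediate upstream MTA is spoofing its name")
    elif mta_ok is True:
        findings.append("MTA name can be used for reputation")

    return findings


# A forged message relayed by a genuine, correctly identified MTA:
print(assess(False, True))
# A site that deploys only the MTA-level check:
print(assess(None, True))
```

Neither branch consults the other's result, which is the sense in
which a site could deploy one check, the other, or both.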

Given that, it's not clear to me that it's appropriate to burden CSV
with the complexity of an SPF record, since people may wish to deploy
CSV without Sender ID.

So, in answer to the question:

   - Due 2004-07-02: Decide if CSV is complimentary, parts to be 
     incorporated, or dropped.

My current gut feeling is that this WG should advance
draft-ietf-marid-core/submitter and draft-ietf-marid-csv-* to PS
without trying to unify them technically.  I think the only
unification that should happen (if feasible within the timescales)
is in terms of a unified intro/rationale document that describes the
framework in which both exist.

Part of the reason for my position is that they address different
subproblems of the MARID problem space, and that I don't think there
is any clear technical benefit in unifying the record formats, given
the simplicity of the CSA record.

There are also pragmatic reasons for this position.  This WG is
working on a very tight schedule, and it also is treading into
uncharted territory.  Despite the SPF adoption to date, nothing has
been implemented on a large scale (current SPF deployment is still,
in reality, a drop in the ocean) and none of us really knows how this
will pan out in practice.  We're not going to get this right first
time, however hard we try; we have to accept that.  Whatever this WG
proposes, I'm sure that there will be much more work to do in the
future (either by this WG or a successor) in the light of operational
experience of those proposals.

Given that the groups of people pursuing the Sender ID and CSV problem
spaces are largely disjoint, keeping the proposals largely independent
seems like the most efficient use of this WG's time; and it strikes me
that attempting to unify them technically just for the sake of it is
likely to consume much WG effort for little or no technical benefit.
Only operational experience will tell us the value of MARID in these
two problem spaces, and until we have that, I think a 'unified theory
of MARID' is premature.

Ok, now the promised brief comment on 'Unified SPF':

I'm coming to the conclusion that Sender ID shouldn't be extended to
check the HELO.  SPF checked it only for MAIL FROM <>; Sender ID, if
I'm not mistaken, doesn't check it at all.

But the semantics of checking HELO are very different from what
SPF/Sender ID set out to do.  And it's not clear to me that the
reputation/accreditation services will work in the same way.  Or that
the same reputation/accreditation services will be interested in
dealing with both problem spaces.

The only reason why SPF checked the HELO (as I see it) was to close
the gaping hole left by MAIL FROM <>; CSV seems a better solution to
that, so leave it out of Sender ID.

Sorry to have bored you all with such a long post; hope this makes at
least some sense...

      -roy