[ietf-clear] Getting CSV ready for prime time

  While I am sympathetic to the desire to mark whole domains as never
operating sending SMTP clients, there is a fundamental misunderstanding
here: CSV is about documenting authentication and authorization, not
about preventing email abuse.


The point of my suggestions is to say "the rest of the domain isn't
authorized."  If that's a misunderstanding, you'll have to explain
more clearly what the misunderstanding is.

  We SHOULD be very careful to not give the impression that CSV will
prevent abusive forgery of domains in EHLO strings.


Ah, there's the misunderstanding.  I didn't mention EHLO forgery
because that's not my major concern.  Large ISPs have large ranges of
customer IP addresses, each of which has a 100% genuine name but
shouldn't be sending mail.  AOL, for example, has about a million
addresses with names of the form AC958E67.ipt.aol.com.  Comcast has
hundreds of thousands, if not millions, with names of the form
pcp152920pcs.hamntn01.nj.comcast.net.

Those are the names we need something like a wildcard to catch.  I
realize that many of those ISPs apply port 25 blocking, but we all
know that port 25 blocking is at best a band-aid.

A less important but also useful ability would be for me to say "any
host that HELOs in abuse.net is lying," which would, at the moment,
catch many millions of spams a day, including about 200,000 that
bounce and blow back to me.  I realize that bad guys will work around
it, but any anti-spam technique merely narrows the set of ways that
they can send spam.

  The important thing here is to define the meaning of this bit. I
don't believe we _can_ enforce a particular algorithm to test for this;


Sure we can.  I just defined one.

  The algorithm John Levine proposes, though "better than SPF" is
still more than we _can_ (IMHO) expect receiving SMTP servers to do
for every email.


In private mail, I told John that I did some stats on the HELOs in the
spam that got into my mailbox last month, and the large majority of
them had three or fewer components in the HELO.  The number of possible
extra DNS lookups is small compared to the dozen DNSBL lookups I do
already.
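
To make the lookup count concrete, here is a rough sketch. It assumes
the check walks from the HELO name up through its parent domains and
stops at a two-label name. The function name candidate_names and the
min_labels cutoff are made up for illustration, and the cutoff ignores
registries such as co.uk.

```python
def candidate_names(helo_name, min_labels=2):
    """List the HELO name and its parent domains, most specific first.

    Hypothetical helper: stopping at min_labels labels is an assumption,
    not something any CSV draft specifies.
    """
    # Normalize: drop surrounding whitespace, a trailing dot, and case.
    labels = [p for p in helo_name.strip().rstrip(".").lower().split(".") if p]
    if not labels:
        return []
    # Number of names to try: the full name plus each parent above the cutoff.
    count = max(len(labels) - min_labels, 0) + 1
    return [".".join(labels[i:]) for i in range(count)]


def extra_lookups(helo_name):
    """Lookups beyond the one for the HELO name itself."""
    return max(len(candidate_names(helo_name)) - 1, 0)


if __name__ == "__main__":
    for name in ("AC958E67.ipt.aol.com",
                 "pcp152920pcs.hamntn01.nj.comcast.net",
                 "abuse.net"):
        print(name, candidate_names(name), extra_lookups(name))
```

Under those assumptions a three-component HELO costs one extra lookup,
and even the five-component Comcast name costs three.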

  There's no reason why this particular bit should change very often:
thus it could more appropriately be queried by a proxy service, say,
once a week, and a database distributed which receiving SMTP servers
could query without creating DNS traffic on the 'net.


Yet another service that has to scale and to work all the time?  Ugh.
Why would this be better than what DNS caches do now?  They do cache
negative responses, you know, and a week is a quite normal TTL.
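
As a toy illustration of negative caching, the sketch below remembers
that a name had no record and answers from memory until a TTL runs
out.  The class NegativeCache and its method names are hypothetical,
and the one-week default only mirrors the TTL mentioned above.  A real
resolver takes the negative TTL from the zone's SOA record instead.

```python
import time


class NegativeCache:
    """Remember "no such record" answers until a TTL expires.

    Illustrative only: real DNS caches derive the TTL from the zone's
    SOA record rather than from a fixed default.
    """

    def __init__(self, ttl_seconds=7 * 24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can fake time
        self._expires = {}           # lowercased name -> expiry timestamp

    def remember_missing(self, name):
        """Record that a lookup for this name came back empty."""
        self._expires[name.lower()] = self._clock() + self._ttl

    def is_known_missing(self, name):
        """True if a cached negative answer is still fresh."""
        key = name.lower()
        expiry = self._expires.get(key)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expires[key]   # stale entry: forget it and re-query
            return False
        return True
```

With a cache like this in front of the resolver, a repeat query for
the same name inside the TTL generates no DNS traffic at all.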

R's,
John

PS:

  I sometimes feel like a voice crying in the wilderness;


Fortunately, you can stop now.