
Re: [Asrg] [ASRG] SMTP pull anyone?

2009-08-18 12:50:13
Ravi shankar wrote, On 8/17/09 5:53 AM:


> DNS is used *as a medium* for various applications that are used to
> identify mail as legitimate or illegitimate by various standards of
> legitimacy, and a major reason for its use in those applications is to
> make it feasible for mail systems to do the validation synchronously
> during the SMTP session. By using a lightweight, distributed, cached
> database, mail systems are spared from deferring a message, queuing its
> validation, remembering the results, and waiting for the sender to
> offer it in an identical way again. You are suggesting that receivers
> should take on all the heavyweight management but retain using DNS for
> something unspecified. It makes no sense.

Bill,

Today's model is no different from what I have suggested: sites deploy
costly anti-spam solutions that probably use ten times the resources
this solution would. Allowing the system to cut most of the spam
through a simple pull mechanism compares very well against today's
anti-spam software model, which not everyone can afford.

I don't see how this reduces the effort required on the receiving side in comparison to currently common practices. I do see how it increases receiving system effort compared to currently common practices. I suspect that you don't understand those practices, so I'll explain at length...

It is very common for mail servers to apply multiple threshold criteria (often utilizing DNS) before the DATA command in an SMTP session to decide how to respond to the earlier commands, often making rejection decisions very early. SPF and the most common type of DNSBL can be checked that way and often are, along with rules like requiring the sender domain to have a valid MX or A record, shunning clients that use idiosyncratically invalid HELO names, etc. This does not require message data analysis, as it is done before the message data is offered.

After receiving an RCPT command, the receiver knows the IP address of the sending client, the name it used for itself in the HELO or EHLO command, the envelope sender address, one or more recipient addresses, and the reject/accept results for any previously named recipients. In some cases where extensions to SMTP are used, it may also know some message and authentication metadata. It is quite normal for a mail server to use those facts and derivative facts (like the existence and content of DNS records related to them) to decide how to respond to that RCPT command.
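As an aside, the DNSBL lookup mentioned above follows a simple convention: reverse the client's IP octets, append the list's zone, and query for an A record. A minimal Python sketch (the zone name is just the customary example; any DNSBL zone works the same way):

```python
import socket

def dnsbl_query_name(client_ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNSBL query name: the client's octets reversed,
    prepended to the list's zone (the standard DNSBL convention)."""
    octets = client_ip.split(".")
    return ".".join(reversed(octets)) + "." + zone

def client_is_listed(client_ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """A listed IP resolves to an A record; NXDOMAIN means 'not listed'."""
    try:
        socket.gethostbyname(dnsbl_query_name(client_ip, zone))
        return True
    except socket.gaierror:
        return False
```

Because the answer is a cached DNS lookup, the check costs the receiver almost nothing and completes well before any message data is offered.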

For many mail systems, anti-spam measures done before the DATA command using metadata safely reject a large majority of spam (often a large majority of all email) and whitelist a smaller stream of messages. This sidesteps high-cost approaches that parse message data. For example, from the last 10,000 connections to my own very small mail server, only 873 messages were passed to the part of my spam control system that examines the message data and 35 messages were cleared around that filtering. Obviously I can't get a perfect measurement for accuracy since I can't be sure that every error will be noticed and brought to my attention, but it has been many months and millions of messages since the last time I know that system to have rejected a legitimate message ahead of the data filters and it hasn't protected any spam from data filtering in the 5 years that I've been doing it. That performance is similar to what I've seen in the larger mail systems that I've managed for others.

The use of metadata rules (i.e. using envelope and session parameters and their derivatives) to reduce the flow of mail into message data filters is not a new or rare strategy, but rather is an evolutionary remnant of the earliest spam control tactics. For many years, spam exclusion was almost exclusively done before the DATA phase of SMTP because it worked well enough and because filtering based on message data was more resource-intensive than it could justify with results. To this day, well-run mail systems whose operators are concerned about the resource demands of spam control use the information available early in the SMTP transaction to decide whether to allow the sender to 'push' the message itself.
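To make one of those early metadata rules concrete, here is a sketch of the kind of HELO plausibility check described above. The specific rules are illustrative of what operators commonly do, not any standard:

```python
import re

def helo_plausible(helo: str) -> bool:
    """Reject idiosyncratically invalid HELO/EHLO names: a bare IP given
    without address-literal brackets, or a single dotless label.
    These particular rules are illustrative, not normative."""
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", helo):
        return False  # bare IP; RFC form would be an address literal like [192.0.2.1]
    if "." not in helo and not helo.startswith("["):
        return False  # single-label name, not a fully qualified hostname
    # finally, require only characters plausible in a hostname or literal
    return bool(re.fullmatch(r"[A-Za-z0-9.\[\]:-]+", helo))
```

A check like this runs at the first command of the session, before any envelope or message data arrives, which is exactly why the early-rejection strategy is so cheap.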

The 'pull' model you have described does not specify any way in which it can improve on the pre-data filtering that is already being done, but it does add a burden to both sides of legitimate transactions: keeping track of message offers that are pending a decision to pull and an actual pull attempt. In order to justify that added burden (in addition to the huge development and deployment costs) you would need to explain how your pull model facilitates better filtering than what sites do now. Sparing systems from message data filtering isn't enough, unless you have some case for your model doing that consistently and sustainably better than current tactics that operate during the SMTP session.


> The *most* that SPF can provide towards showing "legitimacy" is to
> confirm that the envelope sender address of a message is not forged.
> It is very rare for large senders of any sort to deploy records that
> can do that strongly. There is nothing about SPF that directly attacks
> spamming. It could in theory be used to attack sender forgery, but the
> collateral damage has proven to be too great for either sending or
> receiving systems to actually apply it strongly to that end.
> Meanwhile, a lot of spammers are sending a lot of spam with senders
> that are validated to the degree that SPF can validate anything.

Actually, SPF only validates the relationship between the sender's IP
address and domain, and I mentioned SPF just as an example.

SPF is specified as applying to the whole envelope sender. Explicit records using the %l macro are rare, but many domains assure that the hosts they affirm in SPF are using correct local parts in sender addresses. That is what would be expected with normal MTA software and configurations that could be affirmed in SPF.
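For illustration, an explicit record of the rare kind mentioned above might look like the following (the domain and the `_spf` subdomain are hypothetical; `%{l}` is SPF's local-part macro and `exists:` directs a per-local-part DNS lookup):

```
example.com.  IN TXT  "v=spf1 exists:%{l}._spf.example.com -all"
```

A domain publishing this would maintain a DNS entry under `_spf.example.com` for each valid local part, which is exactly the kind of maintenance burden that keeps such records rare.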

And if the large senders cannot implement something as simple as a TXT
record for SPF (leaving DKIM aside), then they probably do not care
about spam.

I understand that it is easy and tempting to be dismissive about the lack of care among large senders, but it is self-defeating when trying to devise and evangelize a new spam control mechanism.

It is worth noting that Microsoft (as Hotmail) has been the most important actor in getting SPF records deployed by others, even though Hotmail systems are chronic spam sources and their inbound mail systems do not use SPF records in anything like a normal way.

> SPF or DKIM are only effective when deployed by all the domains that
> send mails.

That is a ridiculously false statement. I have to assume that we are having a problem of differing idioms of English, or else I would think you a fool.


> 4. The sending server then hands over the message.
> 5. To overcome DDoS attacks, the receiving server can be made to
>    request the next 10 or so Message IDs that it will assign to
>    messages, so that if a attacker tries to give those details, it
>    will know from the next list of message IDs that it's fake
>    connection.

>>> That sentence makes no sense. What did you mean to say?


What I mean is: in order to prevent a system from being overwhelmed by
anonymous submissions, if, say, domain1.com's server knows the next 10
message IDs that will be sent by domain2.com, then it can confidently
reject submission attempts whose message IDs fall outside that range
(of course, this logic holds only if domain2.com is going to send those
10 message IDs to domain1.com only).
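For concreteness, the window check being described seems to amount to something like the following sketch (all names are illustrative; no such protocol exists):

```python
class OfferValidator:
    """Track the window of message IDs each peer domain has announced,
    per the hypothetical 'pull' scheme discussed in this thread."""

    def __init__(self):
        # peer domain -> set of message IDs it said it will send next
        self.windows: dict[str, set[str]] = {}

    def announce(self, domain: str, next_ids: list[str]) -> None:
        """Record the IDs the peer claims it will use for upcoming offers."""
        self.windows.setdefault(domain, set()).update(next_ids)

    def accept_offer(self, domain: str, msg_id: str) -> bool:
        """Accept an offer only if its ID is in the announced window,
        consuming the ID so it cannot be replayed."""
        window = self.windows.get(domain, set())
        if msg_id in window:
            window.discard(msg_id)
            return True
        return False
```

Note that even this toy version shows the added state both sides must keep per peer, which is part of the burden discussed below.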

Okay, so you are redefining "Message ID" as a new identifier defined by each MTA for each message that it handles, rather than as something related to the Message-ID mail header.

That concept is interesting, but it is not consistent with how mail systems work today. It brings into question whether you have a useful understanding of the range of ways that people use email and the range of ways that mail servers handle mail. The practices that would have to end in order to enable this facet of your idea include those which forced SPF into its arcane complexity and those which constrain its strength and deployability today.


> Nothing you have described would add to spam control as it is
> currently being done, as far as I can see. The 'model' is too vague to
> critique in detail because you aren't really providing any meaningful
> details.

> In order to bring anything truly new and useful to controlling email
> spam, a new idea has to either attack spam in a way that existing
> tactics don't, do a demonstrably better job than existing tactics, or
> overcome the negative aspects of existing tactics. You have identified
> none of those in your new idea.

I guess we are expecting a magic solution that will stop all spam in a
single go and will not require us to change our systems continuously.

Not at all, and that is part of why I am skeptical about your suggestion. It would be a radically new way of handling email, to a degree that it would not really make sense to define it as an extension to SMTP.

But unfortunately, every system has flaws and has to be corrected one
step at a time; this, I believe, is evolution.

Gradual evolutionary steps have to provide a real hope of some incremental benefit to early adopters without doing them immediate harm. Even if you had a fully detailed model for how this would work and had a deployable way to integrate it today into existing mail systems, you would need to assure that it would be harmless to offer now (i.e. no rejection of legitimate mail from non-users of the new system) and that it could provide some benefit for both senders and receivers who adopt it before it becomes widely deployed. As described, it increases the difficulty of handling mail for both sides and offers neither side any concrete benefits.


I have done my best to detail how this system applies at various steps
of a mail communication; maybe I can work on a pictorial
representation, if someone else requires it as well.

If this is what you consider "detail" then you have a major obstacle to being taken seriously. Drawing pictures wouldn't be a step forward. Defining a transaction protocol would be, but I wouldn't suggest you do that until you identify concrete ways that your model offers benefits that existing common practices cannot offer.
_______________________________________________
Asrg mailing list
Asrg(_at_)irtf(_dot_)org
http://www.irtf.org/mailman/listinfo/asrg