Re: Against Extensibility in MARID Records


Jim,  you raise a few key issues I have with the extension concept.

o Possible 3rd party or Microsoft proprietary or undocumented extensions:
o Transport vs. Application/Admin Level MARID feature/extension set:

I would like to state that I support XML for MARID.  But I hope you can
consider my professional opinion on all this. I will express both concerns
in detail.

o Possible 3rd party or Microsoft proprietary or undocumented extensions:

This one presents what I believe is an ethical and moral business dilemma.
I believe in a concept I call "CoComp" - for Cooperative Competition.  It
relates to the usage of common and standard technology that is standard at a
certain level but also allows for proprietary extensions that will not break
the standard usage yet offers a vendor a way to distinguish themselves.   I
presented this CoComp concept in a 1992/93 talk at the first (or 2nd) ONE
ISP CON trade show introducing an XML-like format (although not called XML
back then) for a common mail storage and transport format between the
hundred or so mail products and the many different formats a product had to
support between heterogeneous mail networks.  This was before the Internet
Email finally trumped all existing mail systems and formats at the time and
rapidly became the industry standard.

As a Windows shop,  we have been involved with the Windows platform since
its inception.

On the positive side, it should go without saying the history of Microsoft
has been to assist the ISV market to move forward in supporting the Windows
platform by providing "tools," in particular helper APIs.

On the negative side, there is a history of new feature sets that were
either initially proprietary or undocumented.  This is one of the main
reasons why we build most of the tools we need in-house, so that we are
not strategically dependent on specific items under Microsoft control.

Since you are involved in the development of Exchange, I don't think I am
completely off base in assuming you have a natural tendency and competitive
instinct to explore and implement new ideas that may make the Exchange
product work better than the competition. In fact, it would be expected of
you, and based on the way the Microsoft patent disclosures have grown, this
could be a concern.

But others can do the same too.  We can come up with an extension only our
servers understand.

So I am a believer in the CoComp concept, yet I can't help thinking XML will
promote this concept for MARID.  Proprietary Extensions from Microsoft or
other vendors could be used as the differential factor between products.
And of course, since Microsoft is the MCEP inventor and major promoter,   I
have a concern that this can be used against other smaller vendors.  I
sincerely hope you can understand this concern.

Please realize that I believe Microsoft and all others have a natural right
to do this, and if everyone thinks this is "ok" for MARID where proprietary
extensions promote usage, then I am all for it.

But it should be well noted that this may well happen with an extensions
concept.  MARID XML will promote extensions that may only work within
certain product lines, which brings me to my final point or concern.

o Transport vs. Application/Admin Level MARID feature/extension set:

Since the day we started our Anti-Spam research in early 2003 to finally
address the abusive spoofing of mail, I have been participating for the
first time in the IETF mail forums and discussion areas.  It was my hope to
be part of the new direction and to help and provide input where I can.
It didn't take long to see what I believe was a fundamental difference in
the philosophical positions many took on how Anti-Spam technology should be
designed and even implemented in current products.

I want to state strongly that I hope no one takes the following the wrong
way. I mean no ill intent or categorization. I just wish to highlight what I
believe is very important because this key difference in product operation
philosophy is what's driven much of the design thinking in many
mail/anti-spam forums with participants from a wide range of disciplines,
including here in my opinion.

Whether people run Dynamic or Post SMTP operations is one of the key
differences driving or molding the thinking, the debates, the discussions
and design ideas to address the problem.

If you are running a mail operation where mail analysis and/or extensions
are only possible in a POST SMTP manner, you will have a philosophically
different mindset than the person who is running a mail operation where mail
analysis and/or extensions can be dynamically performed at the SMTP
transport level.

How is this related to MARID and Extensions?

Well, MCEP has a POST SMTP dependency in order to work right because of its
high RFC 2822 requirements.   While a mail system can also support MCEP
dynamically by performing the mail analysis at the DATA stage, this has a
major design change implication at SMTP and it also can impact performance
and scalability due to increased payloads.

I view POST SMTP analysis as an Application or Admin level methodology where
non-transport or 2822 entities are just one part of the total MARID package.

Similarly, when it comes to extensions, such as this Report Generator
extension concept, I also believe this is an Application or Admin level
methodology.  Not an SMTP level concept.

So my concern is that MARID extensions can be used to force a design
requirement that may not be appropriate at one level or another.

Let's use the report extension that is commonly used for illustration.  I
believe IBM was used as an example for the desirable MARID behavior using
the report extension.

At SMTP,  a MARID client processes a transaction where the RFC 2821 envelope
information points to an obvious spoof and also a MARID policy that will
reject the transaction because the IP is not correct.  A report is
requested.

Example:

IP 1.2.3.4
HELO winserver.com
MAIL FROM: tom(_at_)ibm(_dot_)com

In this example, the HELO domain is spoofed.  winserver.com is our domain.
The IP is not ours.  This is an immediate transport level rejection.
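
As a rough sketch of this kind of transport level check (not any product's
actual implementation), the receiving server can compare the HELO domain
against its own domains and addresses.  The LOCAL_DOMAINS table, the
192.0.2.0/24 range, the function names and the reply text are all
assumptions made for illustration.

```python
from ipaddress import ip_address, ip_network

# Hypothetical table: domains this site owns and the networks it sends from.
LOCAL_DOMAINS = {"winserver.com": [ip_network("192.0.2.0/24")]}


def helo_is_spoofed(client_ip: str, helo_domain: str) -> bool:
    """True when the client claims one of our own domains from a foreign IP."""
    networks = LOCAL_DOMAINS.get(helo_domain.lower().rstrip("."))
    if networks is None:
        return False  # not one of our domains; this check has nothing to say
    addr = ip_address(client_ip)
    return not any(addr in net for net in networks)


def smtp_reply_for_helo(client_ip: str, helo_domain: str) -> str:
    """Pick a reply at the transport level, before any message data is seen."""
    if helo_is_spoofed(client_ip, helo_domain):
        return "550 HELO domain does not match connecting IP"  # illustrative text
    return "250 OK"


print(smtp_reply_for_helo("1.2.3.4", "winserver.com"))
```

The decision needs only the connecting IP and the HELO argument, which is
why no further processing is required to reject.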

However, to follow MARID, IBM.COM has a MARID domain policy that requests
a report be sent to IBM to help IBM track the abused email addresses.

This level of reporting is outside the realm of SMTP requirements.  The
application level report generation is now imposed on the SMTP client.
Besides the fact that there is more overhead to obtain the report type or
format or extension that provides a report template, in this case SMTP has
an obvious transaction rejection that should not require any more
processing.

My main point here is whatever extensions are invented we need to make sure
that the functionality of the extension is well placed within or separated
from SMTP design requirements.

Whether you agree with this or not depends largely on your philosophy or
current mode of operation that I was describing above.

If you are running a POST SMTP system, then this isn't so much of an issue
and in this case, I will 100% agree because the report functionality is now
appropriately located in the right system component - outside the
transport.  A very nice report can be generated with all the time in the
world to produce it.

But if you are running a system that is performing dynamic SMTP validations,
this particular report extension is in conflict with SMTP product design and
operation.

I can understand that Microsoft's design is to use POST operations for total
validation.  So extensions are more affordable - outside the SMTP transport.
Your point of view is well understood - from an Application Design
standpoint.

But for a product like ours,  it is a system design dilemma.

Don't get me wrong.  I am looking for a solution to all this.  But
extensions need to be clearly defined and separated per system vs
application component.   You use EHLO, for example.  All extended features
are for SMTP.  Not outside of SMTP.   You use HTTP request commands.  It's
all within the HTTP protocol.  It is all clearly defined where the
extensions are used.  That is not to say an extension obtained at the
transport level cannot be used to trigger post protocol operations.
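
One possible reading of that separation, as a sketch only: the transport
makes its accept/reject decision immediately and merely records the facts,
while a separate post protocol component formats and sends any report
later.  The policy dictionary, the queue, the function names and the reply
text are hypothetical and not part of any MARID proposal.

```python
from collections import deque

# Hypothetical hand-off point between the SMTP transport and a later stage.
report_queue = deque()


def handle_mail_from(client_ip, helo, mail_from, policy):
    """Decide at the transport level; defer all report work."""
    if not policy["ip_allowed"](client_ip):
        if policy.get("wants_report"):
            # Record the facts only; no report template is fetched here.
            report_queue.append(
                {"ip": client_ip, "helo": helo, "mail_from": mail_from}
            )
        return "550 sender not authorized"  # illustrative reply text
    return "250 OK"


def run_post_smtp_reports(send):
    """Drain the queue outside the SMTP session; returns the number sent."""
    sent = 0
    while report_queue:
        send(report_queue.popleft())
        sent += 1
    return sent
```

With this split the SMTP session never waits on report generation, and a
site that does not want to support reporting can simply never run the
second function.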

But for this need, where in my view it is highly desirable to block the
high rate of spammers at the transport level,  extensions such as
"reporting" may not be possible or may be less well supported.

-- Hector



----- Original Message ----- 
From: "Jim Lyon" <jimlyon(_at_)exchange(_dot_)microsoft(_dot_)com>
To: "John T Levine" <johnl(_at_)iecc(_dot_)com>; "IETF-MXCOMP" 
<ietf-mxcomp(_at_)imc(_dot_)org>
Sent: Thursday, June 17, 2004 8:36 PM
Subject: RE: Against Extensibility in MARID Records


In arguing against extensibility, John Levine argues that it's a bad
thing (he used the word "chaotic") to have information in a MARID record
that is not understood by everyone.

To rebut this, I note that substantially every successful data format
and protocol contains buckets for information that isn't globally
understood.  For example, consider headers in RFC 2822 mail messages.
The general ethos is that if you see a message that contains a header
you don't understand, you just ignore it.  Without this ethos, many
useful extensions (MIME, for example) would have been impossible.

A similar story applies to the headers in HTTP. Without the ethos of
ignoring what you don't understand, many commonly used features would
have been impossible.

Similarly, in RFC 2821, responses to EHLO contain a list of extensions,
many of which the client doesn't understand.  Again, the ethos is that
you completely ignore anything you don't understand.  Doing so is
exactly what enables SMTP extensions.
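
A minimal sketch of that ethos applied to an EHLO reply, assuming a
hypothetical client that implements only three extensions.  The keyword
X-VENDOR-MAGIC, the host name and the function name are made up for
illustration.

```python
# Extensions this hypothetical client implements.
KNOWN_EXTENSIONS = {"SIZE", "8BITMIME", "PIPELINING"}


def usable_extensions(ehlo_reply_lines):
    """Map each understood extension keyword to its parameter list."""
    usable = {}
    # The first line is the server greeting; the rest name extensions.
    for line in ehlo_reply_lines[1:]:
        text = line[4:].strip()  # drop the leading "250-" or "250 "
        if not text:
            continue
        keyword, *params = text.split()
        if keyword.upper() in KNOWN_EXTENSIONS:
            usable[keyword.upper()] = params
        # Anything else is skipped silently rather than treated as an error.
    return usable


reply = [
    "250-mail.example.com Hello",
    "250-SIZE 10240000",
    "250-X-VENDOR-MAGIC foo",
    "250-8BITMIME",
    "250 PIPELINING",
]
print(usable_extensions(reply))
# prints {'SIZE': ['10240000'], '8BITMIME': [], 'PIPELINING': []}
```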

Or look at DNS.  In a DNS packet, there's only a single unassigned bit,
and some deployed DNS software gets pissed off if it's not zero.  This
fact, more than any other, has hampered attempts to expand the DNS
protocol.

Or look at the difference between ASN.1 and XML. ASN.1 can describe
anything for which the entire schema is known in advance, but it's hard
to extend a schema after the fact.  XML, on the other hand, has the
clear ability to carry goo whose semantics are not known by older code.
Guess which has been more successful.

Given that we *know* that we'll need more information in the future
(most of us are here to reduce spam, not just authenticate MTAs), it
behooves us to plan the extensibility now.  SPF's current modifiers are
a step in the right direction (they have the ignore what you don't
understand ethos), but as I've argued elsewhere, they aren't sufficient.

The reason they're not sufficient has to do with cases where you need to
attach new pieces of information to particular SPF mechanisms; SPF
merely gives you a way to attach new information to the whole record.
For this same reason, John's suggestion of just having a pointer to a
separate XML document isn't sufficient, either.

-- Jim Lyon

PS: John also continues the misconception that anything that uses XML
will be big enough to require DNS TCP. This just isn't true.  It looks
like most XML-encoded stuff runs about 20% bigger than SPF-encoded
stuff.  The size concern might be persuasive if typical SPF records were
pushing 400 bytes, but they're not.  They're usually around 50 bytes.
The number of characters in the XML framing is just not that big an
issue.  When you're spending 18 bytes to represent an address range, it
just doesn't matter that much whether you frame it with "+a:" and space,
or with "<r>" and "</r>".
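
To make the arithmetic concrete, here is a small sketch that takes the two
framings quoted above literally.  The address range is just an 18-character
placeholder, and neither string is meant to be a complete or valid record
in either syntax.

```python
def framed_sizes(payload: str):
    """Return (SPF-style bytes, XML-style bytes) for one framed item."""
    spf_style = f"+a:{payload} "     # "+a:" prefix plus a trailing space
    xml_style = f"<r>{payload}</r>"  # opening and closing tag
    return len(spf_style.encode("ascii")), len(xml_style.encode("ascii"))


address_range = "192.168.100.128/25"  # placeholder, 18 bytes long
print(framed_sizes(address_range))
# prints (22, 25)
```

Under these assumptions the framing costs 4 bytes in one style and 7 in the
other, a difference of 3 bytes per item.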