RE: The Case For XML in "Caller-ID for Email"

Meng Weng Wong [mengwong(_at_)dumbo(_dot_)pobox(_dot_)com] wrote:

The Mystery Stakeholder's name for their answer to SPF is "Caller-ID
for Email". 
[...]
This message comes from the system architect who designed Caller-ID.

[...]
Here's a summary of some of the reasons we used XML in Caller-ID for
Email and continue to think that that is important. In no particular
order: 

1.  We believe it's critical to have an architecture for
uncoordinated extensibility of the information published about a
domain's email policies. Once deployed, we expect (and indeed hope)
that others will build on top of what is initially defined with new
ideas and functionality. They need to be able to do this without the
need to act through some all-powerful central coordinating authority
yet still be assured that their extensions both won't conflict with
those of others and also won't disturb the operation of existing
non-extension-aware interpreters of the data. XML already has a
fleshed-out and mature architecture for doing this (its namespace
support and the wildcard infrastructure in XML Schema are critical
pieces of it), one that was developed through a significant learning
curve that would be both arduous and error-prone to repeat.


To reiterate myself:  What's the difference between

| v=spf1 mx -all
|
|   becomes
| 
| v=spf2 mx newfeature:foo -all

and

| <spf xmlns='http://spf.info/1'><mx/></spf>
| 
|   becomes
| 
| <spf xmlns='http://spf.info/2'><mx/><newfeature>foo</newfeature></spf>

with regard to extensibility?

I just don't get it.
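
To make the comparison concrete, here is a minimal Python sketch that reads
both forms and reports the terms an older reader would not recognise. The
element names (spf, mx, newfeature) and the namespace URL are the hypothetical
ones from the example above, not Caller-ID's actual schema, and KNOWN is an
assumed set of terms chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the terms an "old" reader understands (illustrative only).
KNOWN = {"mx", "all"}

def terms_from_text(record):
    """Split a text record into (qualifier, name, value), skipping the version tag."""
    terms = []
    for token in record.split()[1:]:
        if token[0] in "+-~?":
            qualifier, body = token[0], token[1:]
        else:
            qualifier, body = "+", token
        name, _, value = body.partition(":")
        terms.append((qualifier, name, value))
    return terms

def terms_from_xml(document):
    """Read the child elements of the hypothetical <spf> root in the same shape."""
    terms = []
    for child in ET.fromstring(document):
        name = child.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
        terms.append(("+", name, child.text or ""))
    return terms

def unknown_terms(terms):
    """Names this reader does not understand and must skip or reject."""
    return [name for _, name, _ in terms if name not in KNOWN]

text_record = "v=spf2 mx newfeature:foo -all"
xml_record = "<spf xmlns='http://spf.info/2'><mx/><newfeature>foo</newfeature></spf>"
print(unknown_terms(terms_from_text(text_record)))
print(unknown_terms(terms_from_xml(xml_record)))
```

In this sketch, spotting the unknown term takes the same one-line check in
both cases; what to do with it is a policy question either way.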

2.  There already exists an incredible variety of deployed XML
parsing tools available in a wide array of languages on virtually
every platform anyone might care to want one for. These help raise the
interoperability bar, in that by using these tools applications can
avoid introducing inadvertent lexical and scanning problems due to
bugs or specification ambiguities: the XML tree model semantically
projected by these tools (its so-called "document object model")
makes it more difficult for these problems to creep in. Among other
issues, for example, complications caused by issues of character set
encodings are already handled. This even helps avoid things like the
buffer-overrun errors that have led to so many security alerts in
recent years :-).


There already exists an incredible variety of deployed ASCII parsing tools 
available in a wide array of languages on virtually every platform anyone might 
care to want one for.

To reiterate myself:  It's easier to *write* a working SPF parser than it is to 
even *learn* XML.

3.  These deployed parsing tools build upon the quite mature and
polished XML syntax and lexical specification
<http://www.w3.org/TR/REC-xml> , again helping to assure
interoperability. The API of these tools is often based on the equally
well established document object model <http://www.w3.org/DOM/> ,
helping to assure portability of client code.


I consider the current SPF syntax to be quite mature and polished as well.  
Since it is an order of magnitude simpler than XML, it's only natural that it 
took an order of magnitude less time to reach that point.

Also, although I'm usually all for visionary ideas, I'm beginning to lose what 
respect I still have for the engineering and design departments of 
Mystery Stakeholder.  Who the hell needs DOM to access DNS records?  This is 
just insane.

4.  The XML Schema <http://www.w3.org/TR/xmlschema-0/>  definition
language exists and is mature. For applications trying to use XML,
like Caller-ID, this provides value by giving a means to denote
application-specific syntactic intent in such a way that generic
schema validation tools can verify that the structure of a given XML
document is well-formed and syntactically valid from an application
point of view. The ability to provide this sort of formal syntactic
specification is also crucial to being able to support uncoordinated
extensibility in a robust way: as an interpreter of data, you need to
formally know where someone might put in some new datum you need to
robustly be prepared for and skip over vs. other places where you can
assume more tightly you understand what the data looks like.


It's easier to *write* a working SPF parser than it is to even *learn* XML.  If 
you want, I'll write you some Perl regular expressions that validate the 
current SPF syntax *in less than 10 lines*.  Try to do that with SPF+XML.
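
As a rough illustration of how small such a validator can be, here is a
sketch in Python rather than Perl. It covers only a simplified subset of the
syntax: the mechanism and modifier names are the ones I understand the SPF
drafts to define, and the patterns for domains and addresses are deliberately
loose assumptions, not the draft's full grammar.

```python
import re

# Simplified building blocks (assumptions, not the complete draft grammar).
QUALIFIER = r"[+\-~?]?"
MECHANISM = (
    r"(?:all"
    r"|(?:a|mx|ptr)(?::[^\s/]+)?(?:/\d{1,3})?"  # optional domain and prefix length
    r"|(?:ip4|ip6|include|exists):\S+)"         # these always take an argument
)
MODIFIER = r"(?:redirect|exp)=\S+"
RECORD = re.compile(rf"v=spf1(?: +(?:{QUALIFIER}{MECHANISM}|{MODIFIER}))* *")

def is_valid_spf(record):
    """True if the whole record matches the simplified grammar above."""
    return RECORD.fullmatch(record) is not None

print(is_valid_spf("v=spf1 mx -all"))
print(is_valid_spf("v=spf1 a:example.com/24 ip4:192.0.2.0/24 ~all"))
print(is_valid_spf("v=spf1 mx newfeature:foo -all"))
```

The point is the size, not the exact patterns: tightening the address and
domain checks adds a few lines, but nothing that needs a schema language.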

5.  Validating parsing engines are also broadly deployed. Beyond
just the syntactic parsing and verification performed by the lower
level engines, applications using validating engines can be assured
that data they are about to interpret conforms to the structural
syntax that the application expects. As a result, error checking
and validation code in applications is reduced, and greater
interoperability results.


See 2 and 4.

6.  A number of rich auxiliary architectures have already been
defined for XML, notable among them XML Encryption and XML Signature.
Being able to leverage these infrastructures provides powerful
possibilities for future enhancements to Caller-ID. For example, it is
entirely trivial to create a signed Caller-ID Email Policy Document:
one merely uses the XML Signature's "enveloped signature" mechanism
in one of the document's wildcardable extensibility points. This
works with existing tools and existing credential management
mechanisms, and all that using only a handful of lines of new code to
be able to put it all together. Being able to selectively encrypt
parts of a policy document while keeping other parts open is also
quite likely of significant interest to some publishers. All this
just comes architecturally for free; duplicating the designs for a
custom data model would be a huge amount of work. Similar synergies
also exist with other XML architectures, such as XML Query and its
relation to databases, though the utility there is less viscerally
obvious.


Why would anyone want to *publish* encrypted DNS records?  Use DNSSEC for 
signing DNS records, if you really need it.

"All this just comes architecturally for free" -- yeah, right.  WTF?  I'm 
starting to get really, really angry about these massive amounts of ignorance.  
So using the horridly complex (compared to what we want to achieve) XML 
standard instead of a simple regular grammar comes "for free".  Sure.  Go away, 
Mystery Stakeholder.

7.  There is a huge vibrant industry out there building a large
variety of XML tools to address various needs. A product like XML Spy
<http://www.xmlspy.com/> , for example, is a powerful XML development
environment (XML Spy is what I used to produce the XML syntactic
diagrams in the Caller-ID specification), and it has at least a dozen
significant competitors. Other authoring environments like Visual
Studio and other text editors already have XML text coloring and
optimized keyboard navigation built-in. Internet Explorer has nice
hierarchical XML rendering and browsing available just by loading a
.xml file. The list goes on. Also, various transformation tools and
architectures like the XSLT <http://www.w3.org/Style/XSL/> 
stylesheet language provide rich declarative means by which raw XML
data can be automatically transformed into other forms such as
human-presentable HTML.


Oh, the XML buzzword factor.  And did I mention the hordes of ASCII viewing and 
editing tools out there?

8.  There exists a large extant body of technical professionals
already educated and trained in XML.


See ASCII.

9.  There exists a dedicated industry (books, seminars, etc) of
companies and people working to educate same and grow this pool. The
investment put into educating oneself in XML is leveraged knowledge
beyond just administering a spam-deterrence infrastructure that can
help one's career grow and expand outward.


Who needs XML books and seminars if one can just read one page of "SPF syntax 
description" and then write pure ASCII?

All these points made by Mystery Stakeholder sound as if copy'n'pasted from 
http://www.w3.org/XML -- but using these arguments, I could justify expressing 
*everything* in XML, even ASCII (as Dan Boresjo said), or XML 
(<opentag>xml</opentag> <opentag>foo</opentag> <attribute>bar 
<value>zippo</value> </attribute> <closetag>foo</closetag> 
<closetag>xml</closetag>).

What's really missing here is an objective consideration of the *disadvantages* 
of using XML in DNS, plus an objective consideration of the *advantages* of 
using a far simpler pure-ASCII syntax.

As long as Mystery Stakeholder refuses to weigh these considerations (e.g. by 
taking part in active discussions), I think there is no more use in my (or anyone 
else's) arguing against their obviously inconsiderate arguments.

-------
Sender Permitted From: http://spf.pobox.com/
Archives at http://archives.listbox.com/spf-discuss/current/
Latest draft at http://spf.pobox.com/draft-mengwong-spf-02.9.4.txt