Why not XML



Ok folks, I'm way over the 3-post/day suggested limit, so this is my
last post for a while.


In 
<C6DDA43B91BFDA49AA2F1E473732113E5DBE22(_at_)mou1wnexm05(_dot_)vcorp(_dot_)ad(_dot_)vrsn(_dot_)com>
 "Hallam-Baker, Phillip" <pbaker(_at_)verisign(_dot_)com> writes:

[quoting out of order:]


> What you are trying to do here is to constrain the problem to the size of
> your tool. That is not a very good approach when you are trying to deal with
> a problem with a legacy system.


Yes, I am trying to constrain the problem to the size of the tools I'm
using.  In particular, I'm using DNS.

Once upon a time, the "exp=" modifier was part of the main SPF
record. (exp= provides text that explains why an email was rejected by
SPF.)  Since the explanation text is often very long and since not
everyone would want to see it, I argued that it should be put into a
separate location.

Likewise, most of these extensions are not going to be wanted by
everyone and some are quite large.  Adding short pointers to this info
in the main SPF record is, IMHO, the way to go.
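
To make the "short pointer" idea concrete, here is a rough sketch in Python.  The domain names, the zone contents, and the split_record helper are all made up for illustration; this is not a real SPF implementation, and it skips macros, redirects, and actual DNS lookups.  It only shows the main record carrying a short exp= pointer while the long text sits at a separate name.

```python
import re

# A modifier looks like "name=value"; any other term is treated as a mechanism.
MODIFIER_RE = re.compile(r"^([A-Za-z][A-Za-z0-9_.-]*)=(.*)$")


def split_record(record):
    """Split an SPF-style record into (mechanisms, modifiers)."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    mechanisms, modifiers = [], {}
    for term in terms[1:]:
        match = MODIFIER_RE.match(term)
        if match:
            modifiers[match.group(1).lower()] = match.group(2)
        else:
            mechanisms.append(term)
    return mechanisms, modifiers


# Hypothetical zone data standing in for DNS TXT lookups.
# The main record stays short; the long explanation lives at another name.
ZONE = {
    "example.com":
        "v=spf1 mx ip4:192.0.2.0/24 -all exp=explain._spf.example.com",
    "explain._spf.example.com":
        "Mail from this host is not authorized by example.com; "
        "see the postmaster page for details.",
}

mechanisms, modifiers = split_record(ZONE["example.com"])

# Only a receiver that actually wants the explanation follows the pointer.
explanation = ZONE[modifiers["exp"]]
print(mechanisms)
print(explanation)
```

Receivers that don't care about the explanation never fetch the second record, which is the whole point of keeping it out of the main one.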

[long list of "use the modifier" snipped]


> Which illustrates my point, you need structures and so you have cobbled
> together an ad-hoc internal syntax to meet this need.


I never said that the syntax had to be different.  You did not give
enough details about what you wanted for me to show an exact
translation.



There are lots of very solid technical reasons to not want XML:

* The records *are* slightly larger, especially with the original
  Caller-ID version of the spec.  Since the number of bytes is so
  limited, this is a problem.  Not a killer problem in and of itself,
  but you can't dismiss it either.

* In order to save bytes, the XML has to be scrunched together, making
  it very hard to read and modify.  Whitespace makes things much more
  readable, and SPF uses whitespace.

* In order to save bytes, most of the parsing isn't done via XML, so
  you still have to have a complete SPF parser in order to do
  anything.  You still have all the problems of creating an ad-hoc
  parser, but you have added all the problems of using an XML parser.

* XML solves the syntax extension problem, but not the semantic
  extension problem.

* The XML parsers are often very big compared with the MTAs.  When
  you double the amount of code, as in the case of qmail and libxml,
  you more than double the chances of a security hole, and you open
  the door to some sort of abusive XML document.

* Two syntaxes means two different code paths that have to be tested.

* Two syntaxes means that people who want to publish something have to
  make a confusing decision right off the bat.  Is one better than the
  other?  Why both?  Which should I choose?

* Two syntaxes means that anyone who needs to diagnose a problem with
  the MARID records has to understand both.
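
To put rough numbers on the size and readability points, here is a small Python sketch.  The XML below is something I made up purely for illustration; it is NOT the actual Caller-ID schema, and the real numbers will depend on the schema and on the policy being published.  The only hard fact used is that a single DNS TXT character-string holds at most 255 bytes.

```python
# The SPF-style text form, as it would be published in a TXT record.
spf_text = "v=spf1 mx ip4:192.0.2.0/24 -all"

# Made-up XML carrying roughly the same policy.
# This is an illustration only, NOT the real Caller-ID schema.
xml_readable = """\
<policy>
  <allow>
    <mx/>
    <ip4>192.0.2.0/24</ip4>
  </allow>
  <default>fail</default>
</policy>
"""

# "Scrunched" form: drop indentation and newlines to save bytes.
xml_scrunched = "".join(line.strip() for line in xml_readable.splitlines())

TXT_STRING_LIMIT = 255  # max bytes in one DNS TXT character-string


def size(text):
    """Size of the record on the wire, in bytes (ASCII only here)."""
    return len(text.encode("ascii"))


for label, text in [
    ("spf", spf_text),
    ("xml scrunched", xml_scrunched),
    ("xml readable", xml_readable),
]:
    fits = "fits" if size(text) <= TXT_STRING_LIMIT else "too big"
    print(f"{label:14} {size(text):4d} bytes  ({fits} in one TXT string)")
```

Even in a toy case like this the readable XML costs the most bytes, and the scrunched form gets some of them back only by giving up the whitespace that makes a record easy to read and edit.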


There are also some not very technical/rational reasons:

* Many mail admins don't like/understand XML.

* XML is promoted by MS a lot, and many mail admins don't like MS.


These latter two reasons have somewhat overshadowed the more
technical reasons.  I think some people assume that when technical
points are raised, they are really a cover for the irrational
reasons, and get pissed off by this.  This, in turn, pisses off the
people who feel they have raised valid technical concerns and are
being dismissed as "irrational."


XML is a great tool, but it is not a great tool for all jobs.


-wayne