mail-ng

Re: Use of XML as a basis for e-mail

2004-02-03 02:21:25

Chuq Von Rospach <chuqui(_at_)plaidworks(_dot_)com> writes:

>> This example clearly shows one disadvantage with XML, it
>> requires approximately twice as much space for the same
>> information as ABNF as used in e-mail.
>
> and given that we're going to be attaching a 20K html part and a 250K
> jpeg graphic of the kid's birthday -- who cares? Keep it in
> perspective with what's being sent via email these days and where
> email is going. Rich content. In 1975, headers were 40-50% of an
> average message. Now, if they're 2% I'd be amazed (I haven't analyzed
> it in a while).

This runs contrary to your earlier post discussing whether IM should
replace mail.  I think it will always be a minority of emails that
carry rich content - for one thing, at least half will say things like
"thanks for the photo of the kid's birthday!".

I think it's still *way* too early to be discussing this sort of
detail, but for what it's worth... I understand the temptation to say
that the world should settle on ONE generalised tree format and use it
for everything, to bring some control to file format hell.  I have a
lot of sympathy for that view, but one big point of disagreement.

The world needs TWO generalised tree formats.  It needs one very rich,
expressive one that's good at handling text and whose fundamental unit
is the character, and one very simple, sparse one based on octets that
easily and efficiently handles arbitrary binary content and which is
incredibly easy to parse.  I'd argue against proposals to use a format
like S-expressions for an application like DocBook, but I don't want
something as complex as XML to be the basis of email.
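
To make "very simple, sparse, based on octets" concrete, here is a
minimal sketch of one possible format of that kind.  It assumes
length-prefixed atoms ("5:hello") and parenthesised lists, roughly in
the spirit of canonical S-expressions.  The names encode and decode are
hypothetical and this is not a format anyone in this thread has
proposed; it only shows how small the parser can be.

```python
# Hypothetical sketch: atoms are length-prefixed octet strings,
# lists are wrapped in parentheses.  Atoms may hold any octets,
# including "(", ")" and ":", because the length says where they end.

def encode(node):
    """Serialise bytes, or a (nested) list of bytes, to octets."""
    if isinstance(node, bytes):
        return str(len(node)).encode("ascii") + b":" + node
    return b"(" + b"".join(encode(child) for child in node) + b")"


def decode(data, pos=0):
    """Parse one node starting at pos; return (node, next_pos)."""
    if data[pos:pos + 1] == b"(":
        pos += 1
        children = []
        while data[pos:pos + 1] != b")":
            if pos >= len(data):
                raise ValueError("unterminated list")
            child, pos = decode(data, pos)
            children.append(child)
        return children, pos + 1      # skip the closing ")"
    colon = data.index(b":", pos)     # raises ValueError if missing
    length = int(data[pos:colon])
    start = colon + 1
    end = start + length
    if end > len(data):
        raise ValueError("truncated atom")
    return data[start:end], end
```

For example, encode([b"hi", [b"x"]]) gives b"(2:hi(1:x))", and decoding
that returns the same nested list.  Binary content needs no escaping.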

If you make XML the format you'll have to work out how to contain that
250K JPEG.  Hopefully you won't decide to BASE64 encode it - that
really would be madness.  So you'll have to work out a format like
DIME [1] to be the outermost format.  If we're preserving the ability
to forward each other entire email messages in such a way that our
MUAs can understand what's going on, our MUAs will have to be able to
handle the tree structure implied by these outer encapsulations - so
we still end up embedding a parser for a non-XML, binary-oriented tree
structure in our mail programs, which is the whole thing this "XML to
rule them all" creed is supposed to avoid.

[1] http://msdn.microsoft.com/msdnmag/issues/02/12/DIME/
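
To put a number on the BASE64 cost: the encoding emits four characters
for every three octets, so the attachment grows by about a third
before any line breaks are added.  A quick check, with random bytes
standing in for the JPEG (the variable names are illustrative only):

```python
import base64
import os

payload = os.urandom(250 * 1024)        # stand-in for the 250K JPEG
encoded = base64.b64encode(payload)     # no line wrapping applied here
overhead = len(encoded) / len(payload) - 1

print(f"{len(payload)} octets -> {len(encoded)} characters "
      f"({overhead:.0%} larger)")
# 256000 octets -> 341336 characters (33% larger)
```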

Also, canonicalisation in XML is a big pain, but a necessary part of
being able to use cryptographic signatures.  I want a format that
makes it convenient to have one subtree be a signature for
another subtree.
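
A rough sketch of why an octet-based tree makes this convenient: if a
subtree is just a run of bytes, the signer can sign those bytes as they
stand and the verifier can check them as received, with no
canonicalisation step.  The helpers atom, node, sign and verify are
hypothetical, the length-prefixed layout is an assumption, and HMAC
from the Python standard library stands in for a real public-key
signature.

```python
import hashlib
import hmac


def atom(octets):
    """Length-prefixed octet string, e.g. b"2:hi"."""
    return str(len(octets)).encode("ascii") + b":" + octets


def node(*children):
    """A list node wrapping already-serialised children."""
    return b"(" + b"".join(children) + b")"


def sign(key, subtree):
    """Wrap a serialised subtree together with a tag over its exact bytes."""
    tag = hmac.new(key, subtree, hashlib.sha256).digest()
    return node(atom(b"signed"), subtree,
                node(atom(b"hmac-sha256"), atom(tag)))


def verify(key, subtree, tag):
    """Check the tag against the subtree bytes exactly as received."""
    expected = hmac.new(key, subtree, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

The signed subtree appears unmodified inside the wrapper, so a
forwarded message can carry it along without disturbing the signature.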

Incidentally, I'd go so far as to say that the textual payloads of
emails *should* be an XML-based format.  That's the sort of job for
which XML is well suited.  But I want this payload to be an entire XML
document, not a subtree of a larger XML document describing the email.
-- 
  __  Paul Crowley
\/ o\ sig(_at_)paul(_dot_)ciphergoth(_dot_)org
/\__/ http://www.ciphergoth.org/