spf-discuss

Re: The case for XML

2004-01-22 01:27:53
On Wed, Jan 21, 2004 at 01:00:00PM -0800, Hallam-Baker, Phillip wrote:

| Since I am probably the person with greatest XML experience here I should
| probably explain the pros here. I was the editor of the XKMS and SAML specs
| and I am currently working on WS-Security. My first reaction to this idea
| was pretty much what you see on the list from everyone else. It is like
| adding a go faster stripe to a Lada and calling it a sports car. But...
| 
| The cons are fairly obvious
| 
|       * It is a change
|       * The result is more verbose
|       * Network administrators are not familiar (Human readability issues)
|       * It is not the way DNS does things
| 
| The pros need a bit of explanation
| 
|       * Better extensibility model

SPF is not where extensibility has great value.  We have to deal with a finite
set of semantics in order to ensure that those semantics are implemented and
deployed on a nearly universal basis.  Sure, you can extend XML to add new
things which can be (by definition in the XML world) ignored if unrecognized.
But ignoring an SPF mechanism can cause problems that we don't want to happen.
Extensions in SPF need to be properly vetted in a standards process, and if
decided in favor, scheduled for careful deployment, perhaps as a new version.
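
To make the "ignored if unrecognized" point concrete, here is a rough sketch
in Python.  The namespace URIs and element names are made up purely for
illustration; they do not come from any actual proposal.  It shows a consumer
that only understands one namespace quietly dropping an extension element it
does not know about.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for a sender policy; not from any real specification.
POLICY_NS = "urn:example:sender-policy"

# Hypothetical policy document that carries one extension element (x:signed-by).
DOC = """<policy xmlns="urn:example:sender-policy"
        xmlns:x="urn:example:extension">
  <ip4>192.0.2.0/24</ip4>
  <x:signed-by>keys.example.com</x:signed-by>
  <default>fail</default>
</policy>"""


def understood_elements(xml_text):
    """Return the local names of the child elements this consumer understands.

    Anything outside POLICY_NS is skipped silently, which is the usual XML
    convention for unrecognized extensions.
    """
    root = ET.fromstring(xml_text)
    prefix = "{" + POLICY_NS + "}"
    return [child.tag[len(prefix):] for child in root
            if child.tag.startswith(prefix)]


if __name__ == "__main__":
    # The extension element never reaches the policy logic.
    print(understood_elements(DOC))
```

The publisher of that record presumably meant the extension element to
matter.  A consumer written this way never even sees it, and nobody is told.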


|       * Re-use of existing parser infrastructure

While XML parsers are present in many implementations of many languages,
some are incredibly weak.  I suspect the best XML parsers exist for Java
and C#/.NET.  But core facilities like MTAs are not routinely done in
those systems.  While Java, for example, is getting faster, it still cannot
perform the way programs compiled directly to machine language can (for
example, JIT effectively destroys the ability to share code pages between
separate processes, forcing programs into the more volatile threaded
model).

I'll still write code in C because I can achieve peak performance there.
I have projects on the plate right now for a high performing SMTP/LMTP
daemon, and a high performing DNS server with instantaneous real time
dynamic updates at the same time.  At present there is no implementation
of XML parsing for C that is reasonably bug free, and is designed to be
suitable for high performance.


|       * Programmers are more familiar (Human readability issues)

There is an incredible difference in the work level required of programming
and of coding a designated sender policy.  For every SPF TXT record coded,
there will probably be millions of lines of application code written.  Yes,
programmers need tools to make their work more efficient.  SPF falls more
into trivial configuration.


|       * Avoid emergence of competing spec with significant, well funded PR
| effort
|       * Well funded PR effort instead supports SPF.
|       * It is the way any DNS replacement will do things.

This part isn't clear.


| It would be really useful if lurkers from the big filtering companies could
| chime in here. Yes we know you are out there. 
| 
| And in response to one point about folk coming out of the woodwork to
| suggest alternative architectures. That is the consequences of success.
| People are starting to look at what SPF has achieved and are trying to
| decide if it is the right solution in the big picture. I am somewhat
| fortunate to work for a company that is small enough to let me loose in this
| type of forum without minders, there are few people in the companies I work
| with to develop specs who have that freedom. You can look at the specs I
| have written and see the companies I have worked with over the past 5 years
| - believe me the major players are engaged here.
| 
| The question at issue here is whether SPF will be the anti-spam solution the
| major players adopt or whether it merely becomes the catalyst to action,
| forcing the various players to put their proposals on the table and come to
| a consensus. That is one possible outcome and from a pure architecture
| standpoint there are a lot of people who would prefer that route. The cost
| there is that it would take a minimum of nine months and we cannot afford
| that.
| 
| That is why I defected to the SPF camp. Nine months off deployment time is a
| big issue for me. My credit card merchants are the ones who hold the bag
| when an identity theft Joe job (sorry impersonation spam) gets sent.
| 
| 
| The Extensibility Issue
| -----------------------
| 
| The big issue we are facing with SPF is the difficulty of extending
| protocols with very poor support for extensibility - SMTP and DNS. One thing
| that XML has done right is to have a good model for extension.

I would agree that SMTP needs some extensibility.  I would agree that XML
can do extensibility.  But I also believe extensibility is overrated.  Of
course some won't agree, and the fact that they happen to be "major powers"
does not make them any more right.  OTOH, "major powers" often get their way
whether they are right or wrong.  Sadly, they are too often wrong because
they are "major powers" that are too hard to sway.

DNS needs some extensibility, too.  Adding new RRs to DNS is way too hard
for what impact they can be expected to cause.  New RRs can be ignored in
a query type of ANY, and won't be seen unless asked for otherwise.  So in
that case, extensibility doesn't hurt, and would be a good thing.

SPF, however, is different.  Extensibility can be problematic because it
has to be coupled with semantics, or it gets in the way (logic will have to
abort, rather than ignore, when dealing with an unknown mechanism).  Then
the results will not be as expected at each MTA, and inconsistencies will
be all over the net.  SPF will break.
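
Here is a rough sketch in Python of what I mean by abort versus ignore.  This
is not the algorithm from the draft.  The mechanism list, the exception name,
and the matches() callback are my own assumptions for illustration, and it
leaves out modifiers and CIDR suffixes on "a" and "mx" entirely.

```python
# Mechanism names this sketch treats as known (an assumption for illustration).
KNOWN_MECHANISMS = {"all", "ip4", "ip6", "a", "mx", "ptr", "exists", "include"}

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


class UnknownMechanism(Exception):
    """Hypothetical error raised when a record uses an unrecognized mechanism."""


def split_term(term):
    """Split '[qualifier]name[:arg]' into (qualifier, name, arg)."""
    qualifier = "+"
    if term[0] in "+-~?":
        qualifier, term = term[0], term[1:]
    name, _, arg = term.partition(":")
    return qualifier, name.lower(), arg


def evaluate(record, matches, strict=True):
    """Walk the mechanisms left to right and return the first matching result.

    matches(name, arg) is a caller-supplied (hypothetical) callback that says
    whether the connecting client satisfies a known mechanism.
    strict=True aborts on an unknown mechanism; strict=False skips it.
    """
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "none"
    for term in terms[1:]:
        qualifier, name, arg = split_term(term)
        if name not in KNOWN_MECHANISMS:
            if strict:
                raise UnknownMechanism(name)
            continue  # lenient mode: pretend the mechanism was never there
        if matches(name, arg):
            return RESULTS[qualifier]
    return "neutral"


if __name__ == "__main__":
    record = "v=spf1 ip4:192.0.2.0/24 newmech:foo -all"
    only_all = lambda name, arg: name == "all"  # client is outside the ip4 range
    print(evaluate(record, only_all, strict=False))
```

With the record above, an MTA that ignores "newmech" falls through to "-all"
and fails mail the publisher may have meant to authorize, while an MTA that
implements "newmech" could pass it.  Same record, different answers.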


| The key to XML extensibility is namespaces. Each XML document states what
| namespace the markup is declared in using the xmlns="--some uri--"
| mechanism. Namespaces can have prefixes so that you can have one markup
| reference another and not have to worry about collisions between tag values.
| 
| What this means is a change in the extensibility model. Today it is possible
| to extend SPF, that is new tags can be defined for the SPF framework. In the
| research document I propose a mechanism that extends SPF to embrace the
| Yahoo Domain Keys scheme.
| 
| There are two difficulties with this approach. First there is barely a
| process for developing SPF. Who gets to decide what goes into SPF 1.1? What
| happens if someone makes a proposal for an extension and the group does not
| think it is a good idea? There is no way we can force them to drop it, they
| can go ahead and publish anyway.

SPF is already extensible at the server side via the "exists" mechanism and
a custom/hacked DNS server designed to implement whatever semantics the new
extension needs.  Every network could have its very own logic because it is
performed on their own server.  SPF would merely pass on the data, let the
DNS server figure it out, and wait for the boolean answer.
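
As a rough sketch of how that hand-off could look on the checking side, in
Python: the macro letters follow my reading of the SPF macro syntax (i, s, l,
o, d, with an optional "r" to reverse the dotted parts), but only this small
subset is handled and only for IPv4.  The record, the "_spf" label, and the
resolve_a() callback are assumptions for illustration.

```python
import re


def expand_macros(spec, ip, sender, domain):
    """Expand a small subset of SPF-style macros in a domain spec.

    Supported letters: i (client IP), s (sender), l (local part),
    o (sender domain), d (current domain).  A trailing 'r' reverses the
    dot-separated parts.  IPv4 only; everything else is out of scope here.
    """
    local, _, sender_domain = sender.partition("@")
    values = {"i": ip, "s": sender, "l": local, "o": sender_domain, "d": domain}

    def substitute(match):
        value = values[match.group(1)]
        if match.group(2):
            value = ".".join(reversed(value.split(".")))
        return value

    return re.sub(r"%\{([isdlo])(r?)\}", substitute, spec)


def exists_matches(spec, ip, sender, domain, resolve_a):
    """Return True if the expanded name has at least one A record.

    resolve_a(name) is a caller-supplied (hypothetical) lookup that returns a
    list of addresses, empty when the name does not exist.
    """
    return bool(resolve_a(expand_macros(spec, ip, sender, domain)))


if __name__ == "__main__":
    # Hypothetical record: v=spf1 exists:%{ir}.%{l}._spf.%{d} -all
    print(expand_macros("%{ir}.%{l}._spf.%{d}",
                        "192.0.2.1", "alice@example.com", "example.com"))
```

All the checking MTA does is build the name and ask whether it exists.
Whatever logic decides the answer lives entirely in the domain owner's DNS
server.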


| The second problem is that we have tried this mechanism for extension in the
| IETF in the past and it has never worked in practice. If you look at the
| main IETF protocols the most glaring fact is that they have existed
| unchanged for at least a decade in most cases even when there are major
| performance, scalability, reliability issues. The current state of SMTP is a
| witness to this failure.

Many of the protocols can endure, and even genuinely need, extensibility.
SMTP is certainly one of them.  But I believe SPF is not.


| The subtler problem is that you end up centralizing change control. The
| Internet was meant to be a grassroots effort but to add a feature to a
| protocol you end up having to go through choke points such as the IETF. The
| effect is that innovation is stifled rather than encouraged.

That's one view of it.  But the internet worked very well because of the
fact that you didn't end up with so many different players all trying their
own ideas beyond merely doing some tests.  Standards are what it is about.

Imagine if every electric power company decided to supply electricity in
any voltage or any AC frequency they chose?  Well, one time that did exist.
Remnants of that still exist in a few countries, such as Japan where half
the country is running on 50 Hz and the other half on 60 Hz.

Imagine if every manufacturer of electric power receptacles decided on what
the sizes and spacing of the plug blades would be.  It might make for a big
industry in adaptors and cords, but it would be a huge mess for everyone.

Imagine if every county decided what side of the road to drive on.

We work together with standards.  As an open society, everyone should be able
to propose their ideas for the standards, openly, and have them discussed.
Eventually, some decision has to be made.  It might be argued that a popular
vote of all people in the world would be more fair than having IETF members
making the decision about the standards we call protocols.  But it is for now
far more practical to do this by a standards committee.  This is commonly
done in virtually every technology and engineering industry.  Look at all the
fine work from groups like IEEE, ANSI, ISO, NEMA, etc.  Sure, any of us could
find specific things we can show they did "wrong".  But things work _together_
because a central decision was made.  That process can't just be abandoned.


| Basically the sales pitch that is being given here is for 
| 
| 
| The Readability & Size Issue
| ----------------------------
| 
| Everyone accepts that XML is bloated size wise. Given any data structure you
| will be able to encode it more efficiently in a different encoding -
| S-Expressions, RFC822 style, you name it.
| 
| The real issue for readability though is what happens when you have a large
| data structure rather than a small one. Sure the SPF records are pretty
| simple when all we are doing is describing the IP based authentication
| scheme. What if we want to go beyond that capability? For example:
| 
|       * MTA Mark like description of IP address use 
|       * Accreditation schemes
|       * Domain key type authentication

And how is adding these things to the data format going to make them work in
SPF aware implementations?  These may be fine ideas, but they have to be
implemented AND deployed.  And SPF is not easy in that area because it is
asking multitudes of MTAs to be able to carry out a policy of someone else.
The policy space needs to be finite.  Sure, new ideas will come, and many
will be of great value and need to be used.

What I suggest instead is to go with SPF as it stands now, but develop some
new facility like it that rides along with related new XML based facilities.
If you do DNS via XML, I'd expect SPF to ride in there with it as an extension,
and then itself be extended.


| In the research note I describe how these can be brought into the SPF syntax.
| The result is something of a compromise because I have to make sure that I
| do not exceed the complexity threshold that the SPF syntax is capable of.

Have you considered how you might move the decision making process back to
the DNS server using the SPF "exists" mechanism?  Once you're on your own
DNS server then you can do anything, any way you want to.  You can have all
your data stored in XML, of course.


| It is a heck of a lot easier to provide a description of the accreditation
| policy of an accreditation provider in XML than in SPF style. The big plus
| is that you have parentheses and you can create nested structures.

But this is going well beyond what SPF is for.  This sounds like a whole
new thing.


| From a tools point of view XML wins, there are tools to parse XML available
| in Perl, Python, pretty much any modern program environment. The industry
| has adopted XML as its core mechanism for representing data structures. You
| may think it is yukky, but there is a fundamental level where being part of
| the standard is worth more than the cost of yuk.

But sadly, it still functions very poorly.  Perhaps this is because of poor
implementations.


| This has a knock on effect, if we want to get Eric Allman on board on the
| syntax issue for example. If we say SPF syntax he can kibitz, if we say its
| XML he will say 'oh sh*t, why?' then he will hold his nose and accept his
| helping of yuk.

You're just forcing it down his throat?  Is that it?


| We did it with HTML
| --------------------
| 
| With the Web we got people to use angle brackets and the whole XML gubbins.
| And that was in a day when there were no HTML tools at all, not even an
| emacs editing mode.

But HTML was using the SGML system the way it was intended, as a means to add meta
data to a document.  The document contents itself were not actually part of
the HTML; only the tags were.  We then called it all HTML, but it was a bunch
of text with HTML tags inserted to mean something related to that text.

But sending data records via XML is a whole different story.  It was not
even intended for that.  Another format, HDF, was designed for that purpose
back in the 1970's (maybe it was in the 1960's, but I first saw it around
1974) and implemented in PL/1.  But there just wasn't enough "momentum" at
the time to get it used by anything.  Data exchanges just did not exist then.


| There is no real difficulty generating XML syntax or shipping it out through
| DNS, it is as easy to edit directly or to code with a wizard. The cost is on
| the recipient side. 

HTML wasn't that complex.  XML has gotten to the point where a reference
book with every possible tag simply would be too huge to carry (and I mean
in every DTD because that is what all of XML is all about).


| Its the way DNS is going to go.
| -------------------------------
| 
| DNS is not XML based, but if there is a next generation in the next 20 years
| it will be XML based. XML is simply the 2000's version of ASCII. It may be
| flawed (missing characters) but it is going to win.

Yeah, go ahead and replace ASCII with XML.  I dare ya :-)  Now what are you
going to write your text in?

Actually, ASCII is a binary format.  It is a bunch of bits of a specific size
that have specific meanings.  And there's another good example where we need
to have a throttle on just letting anyone change the character codes any time
they want to.


| At some point Linux will ship with every configuration file in the whole
| system converted into XML. Windows already went that way.

That might explain why I have increasingly more troubles with later and
later versions of Windows.  I even had to back off of XP and go back to
2K for reasons of it constantly getting things totally confused.  It never
did understand that a .vsd file retrieved from a web site was to be opened
by Visio (though if I transferred it to Linux and back again, it worked).
Every Microsoft expert I talked to about it lost more hair over this one.

XML configuration of a running system is still problematic.  For example
there is a point before which libraries are unavailable.  You have to have
a configuration that describes how and where to find the libraries.  Also
you have to configure the network before you use it, but if the XML configs
need some remote DTD, you're screwed.

Now applications I can see being configured with XML.  But the core system
will have big problems trying to retrofit that.  The thing I observe is
that the bulk of XML promoters are applications people who could not code
a boot loader if bringing their system up depended on it (well, it does).
Are we going to have the partition table in XML next?  What about all the
BIOS data?  We might as well have XML CPUs next.



| The politics
| ------------
| 
| Unfortunately I can't explain the politics here, I am under NDA with pretty
| much every company you could imagine to be involved. Just to back up what
| Meng said, this is not something that is motivated by some engineer riding a
| hobbyhorse.

I fear what the internet would have been like had such groups been the ones
to design it.  There are perhaps many things going on behind the scenes.
Some companies want to control our lives.  Some just want to take all the
ideas and lay claim to it somehow just to get rich.  But the internet did
not happen that way, and I think that is how it fundamentally succeeded.
The very fact that there is an NDA tells me immediately there is something
wrong.  Why would secrecy be needed?  It's not the development of an open
standard going on here.


| The real issue is whether we just solve the spam issue or whether we start
| to dig ourselves out of the larger problem that the Internet protocols have
| no mechanism for extension worth a damn. We have a major major player who is
| willing to come on side with SPF if we are willing to work on the larger
| problem. If they do, they have the clout to bring onside every other player
| of any consequence at all. We can have 50% of email described by SPF records
| in weeks.

If you want to convince people that "a major major player" on board is a good
thing, I think you're going to have to convince us that the "yuck" is worth
it by saying who, and what the plan is.  It may well be worth it, or maybe
not.  But not knowing means we can do no more than speculate, and that leaves
open anything; people will just assume the worst.

If the "major player" is Microsoft, their track record itself becomes the big
yuck factor exceeding even XML itself.  I'd adopt XML if it meant keeping
Microsoft _out_ of everything.  More likely I suspect this is IBM, which
while it is doing some fantastic Linux work, is still not everyone's favorite
company (but is certainly on their side in that SCO battle).  I do know IBM
believes in "XML everything".  I bet they really would like to replace ASCII
with XML.  The one thing I do know XML can do well is sell more hardware.

Sorry, but I cannot buy the car without a test drive.  If you tell me the car
is the finest in the world, and is well worth a few extra bucks you are asking
me to pay for it, but won't even let me see it, much less drive it, then why
should I buy it?  I won't.

I realize if you're under NDA to not talk, then you're stuck.  But if you
want to be in a position to say to people that having this "major player"
on board for something is a good thing, then you'll have to "sell the car"
the good old-fashioned way; let us kick the tires.  You'll have to convince
"them" that it is better to come out of the shadows (since you can't do
that for them, given the NDA).

-- 
-----------------------------------------------------------------------------
| Phil Howard KA9WGN       | http://linuxhomepage.com/      http://ham.org/ |
| (first name) at ipal.net | http://phil.ipal.org/   http://ka9wgn.ham.org/ |
-----------------------------------------------------------------------------

-------
Sender Permitted From: http://spf.pobox.com/
Archives at http://archives.listbox.com/spf-discuss/current/
Latest draft at http://spf.pobox.com/draft-mengwong-spf-02.9.4.txt