Negotiated Content Delivery: Maximizing Information

At 09:57 AM 5/8/99 -0700, Paul Hoffman / IMC wrote:

1) You have a MIME-receiving agent (mail agent, HTTP client, etc.). That 
agent knows how to dispatch based on MIME information. It knows how to 
dispatch to programs. These dispatched-to programs might display the content
directly, or might make a decision and launch a different display program.
2) You have a generic XML-display program (XMLShow) and some programs for 
displaying particular XML types (DisplayEDI, DisplayCal, and so on).
3) Someone invents a new XML type (XMLFoo) and a special display program 
(DisplayFoo).


Good examples - except that in all cases, you omit the negotiation phase
where the two ends of the connection can decide whether transmission is
really worthwhile.  Once I've downloaded the file, sure, no problem, I can
save it as a stream and the user can poke at it manually if they want, I
can check to see if it's XML or a binary format, etc.

However, I'd really rather not be downloading material I can't use.  If I
know beforehand - from MIME type exchanges in HTTP negotiation, for
instance - that the information is in XML, I may be (depending on my
application type) more willing to take the risk.  If not, I'll be spending
a lot of time downloading junk, finding that it doesn't work, and trying to
avoid getting that type of information the next time around.
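
As a rough illustration of that pre-download decision, here is a minimal
Python sketch. It assumes the receiving agent has already seen the declared
Content-Type (for instance from HTTP response headers) and treats only the
two generic XML labels, text/xml and application/xml, as "this is XML".
The names parse_media_type, should_download, XML_MEDIA_TYPES and
handled_types are hypothetical and exist only for this example.

```python
# Assumption: the only labels this sketch recognizes as generic XML.
XML_MEDIA_TYPES = {("text", "xml"), ("application", "xml")}


def parse_media_type(content_type):
    """Return (type, subtype), lowercased, from a Content-Type value.

    Parameters such as "; charset=utf-8" are dropped.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    top, _, sub = media_type.partition("/")
    return top, sub


def should_download(content_type, handled_types, accept_generic_xml=True):
    """Decide, before transfer, whether the content is worth fetching.

    handled_types is the set of (type, subtype) pairs the agent already
    has a specific handler for.  If accept_generic_xml is true, anything
    labeled as generic XML is also accepted, on the theory that an XML
    import or search step can make some use of it.
    """
    parsed = parse_media_type(content_type)
    if parsed in handled_types:
        return True
    return accept_generic_xml and parsed in XML_MEDIA_TYPES


# A PDA-like agent with one specific handler (a made-up type) and a
# generic XML import option.
handled = {("application", "x-addressbook")}
print(should_download("text/xml; charset=utf-8", handled))
print(should_download("application/octet-stream", handled))
```

The point of the sketch is only that the decision needs nothing beyond the
label: if the label does not say "XML" in some recognizable way, the agent
has no basis for taking the risk.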

There are two fairly obvious cases where this is important.  The first is
the automated engine case - agents and search engines, for the most part.
While both of these tools can be built such that they are limited to a
particular set of MIME types, 'generic' (oops) engines and agents also make
sense, in part because things like namespaces allow document authors to do
things like embed information of one type inside of a document that is
ostensibly of another type.  XHTML, for example, seems like a likely
container for XML islands of all kinds, and I suspect we'll see other such
container formats arriving.  Even formats that aren't 'containers' per se
may have lots of such mixed content, and agents and search engines should
at least be aware that these formats are searchable, and not compressed
binaries or other non-XML information.

The second case isn't automated, but it presents as many problems.  User
intervention in MIME types is always a pain in the neck, something most
users aren't fond of.  I'll give a new case that may be more relevant than
the usual "my browser doesn't do Shockwave".  

Suppose 'HotSync' information for a PDA - say an address book - is stored
in XML.  Not everyone's software uses an identical XML format, since people
are irritating that way and I haven't seen any move to standardize.  I'd
like to download an address book in a non-native format to my PDA, which has
an option for importing from XML. (I have to do a little manual
identification work, but that's all right.)  Normally, though, my PDA
doesn't like getting files it doesn't understand, a reasonable approach
given its limited storage capacity.  Am I going to have to go through the
'ok, download it, I know it's a weird MIME type' routine every time I want
to download an address book in a different format?  Multiply by several
million users and support costs, and the need to identify XML documents as
XML in some way becomes significant.


At 10:21 AM 5/8/99 -0700, Ned Freed wrote:

Um, well, actually, in many cases the MIME type is very relevant and is
used to get the data delivered to the right place. While it may be nice to
imagine a world where there's a separate XML-level dispatch process, such a
process doesn't exist in any product I'm aware of and I don't see any moves
underway to add it.


MDSAX (see http://www.jxml.com) does provide a document router that you can
use to build such processes, but you'll definitely get exceptions if you
feed it non-XML information.  Working around that is not that difficult,
but in the general case you're right - I see no movement in that direction
yet.  This would work fine in cases where you had to figure out what
application/xml meant after you got it, but doesn't do anything for the
negotiated cases described above.
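
To make the after-the-fact case concrete, here is a small Python sketch of
a router that dispatches on the root element once the document has already
arrived, and falls back cleanly instead of raising when the data is not
XML.  It is not MDSAX's API (MDSAX is a Java tool); it uses Python's
standard xml.etree.ElementTree, and route_document, handlers and fallback
are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET


def route_document(data, handlers, fallback):
    """Dispatch already-downloaded data based on its XML root element.

    handlers maps a root tag name to a callable taking the raw data.
    Non-XML input, or a root tag with no handler, goes to fallback.
    Note: for namespaced documents ElementTree reports the root tag
    as "{namespace-uri}localname".
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        # Not well-formed XML: hand it to the fallback rather than fail.
        return fallback(data)
    handler = handlers.get(root.tag, fallback)
    return handler(data)


handlers = {"addressbook": lambda d: "address book import"}
unknown = lambda d: "unhandled"

print(route_document(b"<addressbook><entry/></addressbook>", handlers, unknown))
print(route_document(b"this is not XML", handlers, unknown))
```

Note that this kind of routing only helps after the bytes are on the
receiving side; it says nothing to the sender beforehand.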

At 01:12 AM 5/8/99 -0700, Larry Masinter wrote:

I don't believe that "xml" fits into the guidelines for new
top level types, which seem to rest, not on the argument about
"default processing" but rather "device/gateway filtering".
At least notionally, the distinction was between text/ image/
audio/ video/ and (now) model/, with "application/" being the
catch-all for everything else. I think "xml" is, for the
most part, "application/".


Maybe it's time for some new thinking, acknowledging that MIME types are
generic type identifiers rather than clinging to their historical roots.
Using 'xml' in the mix - top level identifier or somewhere else - is a
change, but it's a change that enables a heck of a lot of tools, despite
the constant dismissals.  I realize that may be difficult to take, but I
think users will thank us in the long run for dealing with this now rather
than waiting for the traffic jam to develop.

Simon St.Laurent
XML: A Primer / Building XML Applications (June)
Sharing Bandwidth / Cookies
http://www.simonstl.com