ietf-xml-mime

Proposed Suffix for XML-based media types

1999-07-19 09:55:22
[This is a more formal expansion of my earlier 'modest proposal'
(http://www.imc.org/ietf-xml-mime/mail-archive/msg00149.html), which seemed
to generate some interest.  This is an extremely rough draft and has no
official status of any kind.  All comments, revisions, improved
descriptions, better examples, outrage, etc. are welcome.]

-------------------------------------

MIME Media Type Suffixes for XML-based Document Types

1. ABSTRACT

This document describes the use of a naming convention (a suffix of '-xml')
for identifying XML-based MIME media types, whatever their particular
contents may represent.  This allows the use of generic XML processors and
technologies on a wide variety of different XML document types at a minimum
cost, using existing frameworks for media type registration.

2. INTRODUCTION

Extensible Markup Language [XML] provides a generic syntax for structured
text documents which can be used to define a wide variety of specific
document types.  RFC 2376 [XML-Media] described two MIME Media Types
(text/xml and application/xml) which can be used with 'generic' XML.  As
XML development continues, new XML document types are appearing
rapidly.  Many of these XML document types would benefit from the
identification possibilities of a more specific MIME media type than
text/xml or application/xml can provide, and it is likely that many new
media types for XML-based document types will be registered in the near
future and beyond.

While the benefits of specific MIME types for particular types of XML
documents are significant, all XML documents share common structures and
syntax that make common processing possible.  XML's supporting standards,
notably XPointer, are designed to work with any XML document, whatever
vocabulary it uses. Any XML editor should be able to read, modify, and save
any XML document.  Search engines and agents should be able to read XML
documents and extract the content and names of elements and attributes even
if they are ignorant of the particular vocabulary used for elements and
attributes.  XML-oriented storage systems, which keep XML documents
internally in a parsed form, should similarly be able to process, store,
and recreate any XML document, whatever the MIME media type of that XML
document may be.  

Combining the benefits of more specific typing of documents and generic XML
processing requires generic XML applications to adopt one or more of three
approaches:

1) Use a trial-and-error approach that involves downloading documents and
checking their contents to determine whether or not they are XML documents.

2) Keep track of all MIME media types, and know which types represent XML
documents and which do not.

3) Recognize a convention that allows MIME media types to describe specific
types of documents created with XML while still identifying that the base
syntax is XML.

The first approach wastes network and processing resources every time a new
media type is encountered.  The second is more efficient, but currently
requires manual updating and may involve falling back to the first approach.
The third approach gives applications a consistent labelling standard that
allows for the automatic addition of new media types to generic
applications without disruption or wasted connections.
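
As a rough illustration of how the three approaches differ in cost, the
Python sketch below tries the naming convention first, then a
hand-maintained table, and only downloads and parses the document as a last
resort.  The function names, the fetch_body callback, and the single table
entry are all hypothetical and are not part of this proposal.

```python
import xml.etree.ElementTree as ET

# Approach 2: a hand-maintained table of XML types that do not follow
# the naming convention. The entry below is a made-up example.
KNOWN_XML_TYPES = {"application/x-legacy-markup"}


def is_xml_by_name(media_type):
    """Approach 3: trust the naming convention (no download needed)."""
    subtype = media_type.lower().partition("/")[2]
    return subtype == "xml" or subtype.endswith("-xml")


def is_xml_by_content(body):
    """Approach 1: parse the document and see whether it is well-formed."""
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return False
    return True


def is_xml(media_type, fetch_body):
    """Try the cheap checks first; call fetch_body() only as a last resort."""
    if is_xml_by_name(media_type):
        return True
    if media_type.lower() in KNOWN_XML_TYPES:
        return True
    return is_xml_by_content(fetch_body())
```

Only the final branch costs a network connection, which is the waste the
third approach is meant to avoid.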

This proposal does not address more complex content negotiation issues that
may be necessary for applications to determine whether the recipient
understands the particular vocabulary sets used within given XML document
types.  The media type described here could be used by applications to
prepare for such negotiation, but the media type only provides a monolithic
description of a document type.

3. NEW XML MEDIA TYPES

When a new media type is introduced for an XML-based format, the name of
the media type should end with "-xml".  This will allow applications that
can process XML generically to detect that the file is supposed to be an
XML document as described in [XML] and process it accordingly.  The
registration process for these media types is described in RFC 2048
[MIME-Registration].  The registrar for the IETF tree will enforce this
rule for all XML-based media types created in the IETF tree.  Registrars
for other trees should follow this convention in order to ensure maximum
interoperability of their XML-based documents.

The optional charset parameter may be used with media types following these
conventions as described in RFC 2376 or its successors.
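
For illustration only, a receiving application might separate the media
type from its charset parameter along the following lines.  This sketch
uses the Python standard library's email.message module; the helper name
parse_content_type and the sample header value are assumptions made for the
example.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


# Sample header value using one of the hypothetical types from this draft.
media_type, charset = parse_content_type("text/memo-xml; charset=utf-8")

# The suffix test applies to the media type only, never to its parameters.
follows_convention = media_type.endswith("-xml")
```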

4. PROCESSING XML MEDIA TYPES

Two general classes of applications will operate on XML documents.  The
first class operates on specific types of documents indicated by the
complete MIME media type. For example, a display applet that renders
Scalable Vector Graphics [SVG] will only be interested in XML documents of
type 'image/svg-xml', not documents of type 'image/gif' or
'application/xpdl-xml'.  

Other ('generic') applications may work on any XML documents they
encounter, processing any media type that matches the type '*/*xml'. This
will match text/xml and application/xml as well as more specific media
types ending in -xml, allowing the application to request and process XML
documents represented by any XML-specific media type.  While these
applications may not understand the semantic meaning of particular
vocabularies, they will be able to process the information stored in the
document, determine its structure, and present it to a human or machine
consumer in some form.
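
The two classes of application can be sketched as a simple dispatch: an
exact match on the complete media type for specific handlers, and the
'*/*xml' pattern for generic ones.  The handler names and table below are
hypothetical.  Note that a literal glob match on '*/*xml' is slightly
looser than the '-xml' suffix rule, since it also accepts any subtype that
merely ends in the letters "xml"; an implementation wanting the stricter
reading would test for the subtype "xml" or the suffix "-xml" instead.

```python
from fnmatch import fnmatchcase

# Hypothetical handlers for applications tied to one document type.
SPECIFIC_HANDLERS = {
    "image/svg-xml": "svg-renderer",
}


def choose_handler(media_type):
    """Return the name of a handler for media_type, or None if none applies."""
    media_type = media_type.lower()
    # First class: an exact match on the complete media type.
    if media_type in SPECIFIC_HANDLERS:
        return SPECIFIC_HANDLERS[media_type]
    # Second class: any media type matching the generic XML pattern.
    if fnmatchcase(media_type, "*/*xml"):
        return "generic-xml"
    return None
```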

5. EXAMPLES

XML-based media types may represent information in any of the discrete
top-level MIME types described in RFC 2046 [MIME-Types]. The examples below
present possible names for XML-based media types using the notation
described above.

5.1 text types

If a standard XML-based format for memorandums emerged, an appropriate
media type might be:

text/memo-xml

5.2 image types

If the World Wide Web Consortium (W3C) were to register a media type for
its XML-based Scalable Vector Graphics [SVG], an appropriate media type
might be:

image/svg-xml

5.3 audio types

While multimedia applications are not considered a likely area for XML
development, XML could be used to provide a metadata wrapper around other
audio formats or could store some types of audio information directly.  An
XML-based format for music called MusicML could be registered as:

audio/musicml-xml

5.4 video types

While multimedia applications are not considered a likely area for XML
development, XML could be used to provide a metadata wrapper around other
video formats or could store some types of video information directly.  An
XML-based format for video called VideoML could be registered as:

video/videoml-xml

5.5 application types

If XML Processing Description Language (XPDL) were to be registered as a
media type, it could be registered as:

application/xpdl-xml

5.6 model types

XML is also suitable for a wide variety of modeling applications.  If, for
instance, a Structured Programming Modeling Markup Language were to appear,
it could be registered as:

model/spmml-xml

6. REFERENCES

XML     Bray, Tim, Paoli, Jean, and Sperberg-McQueen, C.M.  Extensible Markup
Language (XML) 1.0. (W3C Recommendation 10-February-1998)
http://www.w3.org/TR/REC-xml

XML-Media       Whitehead, E. and Murata, M.  XML Media Types. (RFC 2376, July
1998) http://www.rfc-editor.org/rfc/rfc2376.txt

MIME-Registration       Freed, N., Klensin, J., and Postel, J.  Multipurpose
Internet Mail Extensions (MIME) Part Four: Registration Procedures. (RFC
2048, November 1996.)  http://www.rfc-editor.org/rfc/rfc2048.txt

MIME-Types      Freed, N., and Borenstein, N.  Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types. (RFC 2046, November 1996.)
http://www.rfc-editor.org/rfc/rfc2046.txt

SVG     Ferraiolo, Jon, et al.  Scalable Vector Graphics. (W3C Working Draft,
06 July 1999.)  http://www.w3.org/TR/SVG/

Simon St.Laurent
XML: A Primer / Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com