Among other things, this means that the same document may be interpreted with a
different encoding when read from HTTP than when read from the local file
system. YUCK! This is a major interoperability problem.
Perhaps worse yet, since the default for text/xml is us-ascii and not utf-8,
serving an XML document that uses any non-ASCII characters over HTTP requires
the author to set the charset parameter of the MIME media type.
This is non-trivial in most environments and impossible in many. According to
RFC 3023, "US-ASCII was chosen, since it is the intersection of UTF-8 and
ISO-8859-1 and since it is already used by MIME." However, this really strikes
me as insufficient justification given the major practical problems it presents
for non-ASCII documents. Is there any chance of superseding this RFC with one
that specifies UTF-8? That still wouldn't be perfect, but it would at least
allow full use of Unicode.
Interestingly, application/xml does not have this problem, at least not all of
it. In the absence of an explicit charset parameter, application/xml falls
back to the normal heuristics for guessing the encoding of an XML document
(byte order mark, encoding declaration, and so on). There's still a problem if
the MIME charset disagrees with the document internal information, but in
practice this isn't nearly as big a problem. Maybe that's what should be done
with text/xml as well? It certainly seems to be what Mozilla is already doing.
In the meantime, I think I'm going to start recommending the use of
application/xml and deprecating the use of text/xml.
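
To make the difference concrete, here is a rough Python sketch of the two
fallback rules as I read them in RFC 3023. The function names
(effective_charset, sniff_xml_encoding) are my own and purely illustrative,
and the sniffing is deliberately simplified: it only looks at a byte order
mark and an ASCII-compatible encoding declaration, and assumes UTF-8 when it
finds neither. It is not a full implementation of the XML autodetection rules.

```python
import re

# Byte order marks, longest first so UTF-32 is not mistaken for UTF-16.
_BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# Encoding declaration in an ASCII-compatible XML declaration (simplified).
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(body: bytes) -> str:
    """Guess the encoding from the document itself (hypothetical helper)."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # Assumption: no BOM and no declaration means UTF-8.
    return "utf-8"


def effective_charset(content_type: str, body: bytes) -> str:
    """Charset a receiver would use for a given Content-Type header value."""
    media_type, _, params = content_type.partition(";")
    media_type = media_type.strip().lower()
    match = re.search(r"""charset\s*=\s*["']?([^"';\s]+)""", params, re.I)
    if match:
        # An explicit charset parameter wins for both media types.
        return match.group(1).lower()
    if media_type == "text/xml":
        # The default this post objects to.
        return "us-ascii"
    # application/xml: fall back to the document's own information.
    return sniff_xml_encoding(body)


if __name__ == "__main__":
    doc = '<?xml version="1.0" encoding="ISO-8859-1"?><p>caf\u00e9</p>'.encode(
        "iso-8859-1"
    )
    print(effective_charset("text/xml", doc))         # us-ascii
    print(effective_charset("application/xml", doc))  # iso-8859-1
```

The same bytes come out as us-ascii under text/xml and iso-8859-1 under
application/xml, and decoding this particular document as us-ascii fails on
the non-ASCII byte. That is exactly the interoperability problem described
above.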
--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo(_at_)metalab(_dot_)unc(_dot_)edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| The XML Bible, 2nd Edition (Hungry Minds, 2001) |
| http://www.ibiblio.org/xml/books/bible2/ |
| http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://www.cafeaulait.org/ |
| Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/ |
+----------------------------------+---------------------------------+