ietf-xml-mime

Re: transcoding nearly certainly wrong?

2003-09-20 12:57:53

Larry Masinter writes:

Why is transcoding nearly certain to be wrong with XML?

Can't be *that* wrong. I do it all the time...


Or, to put it another way, why not limit the use
of text/xml to XML instances for which transcoding
is certain not to be wrong, and for which US-ASCII
is acceptable (because the XML uses numeric character
references or character entities, is only used to
code a limited schema with numeric data, etc.?)

Limiting the scope is less radical than deprecating.

I agree.

Every text/foo format can be transcoded from any encoding to UTF-8.
Something I do often. Going the other way doesn't always work, of
course.
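
To illustrate the asymmetry, here is a minimal sketch in Python. The
helper name transcode is made up for this example, and it assumes the
source encoding is known and the input bytes are valid in it.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target.

    Hypothetical helper for illustration only. Raises UnicodeEncodeError
    when the target encoding cannot represent a character.
    """
    return data.decode(source).encode(target)


# Latin-1 to UTF-8: every Latin-1 character has a UTF-8 representation.
latin1_bytes = "café".encode("latin-1")
utf8_bytes = transcode(latin1_bytes, "latin-1", "utf-8")

# UTF-8 to Latin-1: fails as soon as a character is outside Latin-1.
try:
    transcode("price: 5 €".encode("utf-8"), "utf-8", "latin-1")
except UnicodeEncodeError as err:
    print("cannot transcode:", err.reason)
```

The second call fails because the euro sign has no code point in
Latin-1. That is the case where knowledge of the specific text/foo
format has to take over.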

But if it doesn't work, you can look more closely at the MIME type and
use your knowledge about foo. If foo is css or html, you can convert
from any encoding to any encoding (and I do so often). If foo is
something+xml, you can still do it, as long as all element names
and attribute names are in ASCII. Which happens to be the case with
all XML-based formats created by W3C and all formats that I work with,
fortunately. And thus my conversion program doesn't even check, but
blindly converts every non-ASCII character to &#nnn;. Very useful,
since I create non-English files by cut & paste and my Emacs only
understands Latin-1.
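
The blind conversion described above can be sketched roughly as
follows. This is not the actual conversion program, only an
illustration in Python. The function name to_ascii_ncr is an
assumption, and it relies on the standard "xmlcharrefreplace" error
handler, which emits decimal references of the &#nnn; form.

```python
def to_ascii_ncr(data: bytes, source_encoding: str) -> bytes:
    """Re-encode a document as US-ASCII, replacing every non-ASCII
    character with a decimal numeric character reference (&#nnn;).

    Hypothetical helper: like the program described above, it does not
    check whether the non-ASCII character sits in character data or in
    an element or attribute name.
    """
    text = data.decode(source_encoding)
    return text.encode("ascii", errors="xmlcharrefreplace")


doc = "<p>café</p>".encode("latin-1")
print(to_ascii_ncr(doc, "latin-1"))  # b'<p>caf&#233;</p>'
```

Because nothing is checked, a document with a non-ASCII element name
would come out broken: <é/> would become <&#233;/>, which is not
well-formed XML. The approach is only safe under the assumption stated
above, that all element and attribute names are ASCII.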

So, for many XML-based formats, including all W3C's formats, using
text/foo instead of application/foo makes a lot of sense. Reserve
application/something+xml for formats that cannot be transcoded.



Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos/                              W3C/ERCIM
  bert(_at_)w3(_dot_)org                        2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

