ietf-xml-mime

Re: Requesting a revision of RFC3023

2003-09-17 10:38:44

MURATA Makoto wrote:

First, Simon and I were asked by the W3C team not to take any action
on RFC 3023.  This is because the MIME type registration procedure was
expected to change (see [1] and [2]). So, Simon, Dan, and I can't do anything right now.

Hmm, the TAG is pretty convinced that 3023 needs to change, so maybe Dan or Chris or TimBL could take this up internally. I disagree that this should be frozen at the moment, since the TAG is quite likely to publish a document saying "RFC 3023 is wrong".

As for the charset parameter, I am still uneasy about disallowing or
deprecating it.  But I agree that we should make it "clear that nobody
sending a media-type should send a charset for an XML media-type unless
it REALLY REALLY KNOWS what it's sending," and deprecate text/xml, not
because the charset parameter is harmful but because most XML is not
text for casual users.

I think I provided a detailed explanation of why the charset is in fact actively harmful in the context of XML. If you're not convinced, it would be helpful if you could address those points. If you already have, my apologies; perhaps you could give a pointer.

I have repeatedly asked (e.g., [3]) what the position of the TAG is on
charset detection for non-XML formats.  The latest version of the TAG
finding document "Client handling of MIME headers" appears to
recommend:

I read [3] and while I agree with much of it, it's obviously far too late to change the XML encoding declaration. For the moment, I think that the architecturally-sound position is, for Web data formats, either (a) use XML, or (b) use the charset parameter. I'm generally in favor of a general-purpose encoding-detection scheme such as you propose, but I'm pessimistic about getting it widely deployed for legacy formats.

        (1) non-self-describing data formats should rely on the
            charset parameter, and
        (2) self-describing data formats should introduce their own
            mechanism for specifying charsets.

I'll review the webarch doc; I suspect we haven't thought closely enough about this.

As far as I know, the charset parameter is the only generic mechanism. I know the charset parameter is not working well, but I do not see any other generic mechanisms.

I agree, but for XML formats, I still think the charset parameter is actively harmful and should be deprecated or even forbidden. This is orthogonal to the larger question you (correctly) raise, of charset detection for non-XML formats.
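
To make the concern concrete, here is a minimal sketch of how a receiver could notice that the charset parameter and the XML encoding declaration disagree. It is an illustration only, not a summary of the argument referenced above. The function names are hypothetical, the sample header and body are made up, and the declaration sniffing assumes an ASCII-compatible encoding (it does not handle UTF-16 or a leading byte order mark).

```python
import re
from email.message import Message

# Hypothetical helpers for illustration; not taken from RFC 3023 or any library.
# Matches an XML declaration at the very start of the body and captures the
# encoding name. Assumes the bytes are ASCII-compatible (no UTF-16, no BOM).
_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # The standard library returns the parameter value lowercased.
    return msg.get_content_charset()


def declared_encoding(body):
    """Return the encoding named in the XML declaration, or None."""
    match = _DECL_RE.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def charset_conflict(content_type, body):
    """True when both labels are present and name different encodings."""
    from_header = header_charset(content_type)
    from_document = declared_encoding(body)
    return (
        from_header is not None
        and from_document is not None
        and from_header != from_document
    )


# Made-up example: the document says UTF-8, the header says ISO-8859-1.
body = b'<?xml version="1.0" encoding="UTF-8"?><doc/>'
print(charset_conflict("text/xml; charset=iso-8859-1", body))
print(charset_conflict("application/xml", body))
```

The comparison is a plain string match, so aliases for the same encoding (for example "latin1" and "iso-8859-1") would be reported as a conflict; a real check would need to normalize the names first.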

--
Cheers, Tim Bray
        (ongoing fragmented essay: http://www.tbray.org/ongoing/)