ietf-xml-mime

Re: Requesting a revision of RFC3023

2003-09-19 13:30:05

Hello Tim, others,

At 10:50 03/09/16 -0700, Tim Bray wrote:

[Some of you will get this twice, sorry; Larry Masinter pointed out that my initial choice of destinations was poor. I slightly revised the note to provide more context.]

The W3C TAG (http://www.w3.org/2001/tag/) has an open issue about proper handling of MIME headers, with a draft in progress "Client Handling of MIME Headers" (http://www.w3.org/2001/tag/doc/mime-respect.html);

Some comments on this in a separate mail.


the draft finds some fault with the contents of RFC3023.

I took an action item to ask about the chances of revising what 3023 says about the charset parameter; while I'm not sure, I suspect that there may actually be some level of consensus about the desirable changes:

1. Deprecate text/* for anything that's in XML.

Agreed, in general.


That's because it forces the provider to provide a charset header, because in its absence the receiver is required to assume either ASCII or 8859 depending on the context,

Do you mean that it is US-ASCII for email and iso-8859-1 (there are many
parts of iso 8859) for HTTP? My understanding is that RFC 3023 says
that the US-ASCII default applies for all protocols.


which has a very high probability of being wrong, which is irritating because if there were no charset header the client would be certain of either getting it right or failing deterministically.

That was part of the design of making US-ASCII the default, too,
similar to the design of XML in general: Either the charset parameter
is set (correctly), or it is missing, and then the parser has a very
easy job to detect the problem if the data is not US-ASCII.
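A minimal sketch of that design, in Python. The helper name
decode_text_xml is hypothetical, and the US-ASCII default is the
reading of RFC 3023 given above, not a quotation from it:

```python
from typing import Optional


def decode_text_xml(body: bytes, charset: Optional[str] = None) -> str:
    """Decode a text/xml body, treating a missing charset as US-ASCII."""
    # Assumption: a missing charset parameter means US-ASCII, per the
    # reading of RFC 3023 described in this mail.
    effective = charset if charset is not None else "us-ascii"
    # Strict decoding raises UnicodeDecodeError on any byte the charset
    # cannot represent, so a mismatch cannot pass silently.
    return body.decode(effective)
```

With no charset, a body containing any byte above 0x7F raises
UnicodeDecodeError instead of being silently misread.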

On the other hand, in some cases the only way to detect a mistaken
iso-8859-1 label (e.g. on data that is actually iso-8859-2) is to ask
a linguistic expert.
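To illustrate why such a mislabeling cannot be caught mechanically,
here is a small Python sketch. The helper name decode_both and the
two-byte sample are made up for illustration:

```python
def decode_both(raw: bytes) -> tuple:
    """Decode the same bytes under both labels."""
    # Both codecs are single-byte, so neither call can tell from the
    # bytes alone that the other label was the intended one.
    return raw.decode("iso-8859-1"), raw.decode("iso-8859-2")


sample = bytes([0xB1, 0xE6])  # hypothetical payload
as_part1, as_part2 = decode_both(sample)
# as_part1 is U+00B1 U+00E6 (plus-minus sign, ae ligature)
# as_part2 is U+0105 U+0107 (a with ogonek, c with acute)
```

Both results are valid strings; only a reader who knows the language
can say which one was meant.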


And forcing the server to provide a charset= is wrong; see the next point.

2. Deprecate the charset parameter for application/xml and application/*+xml. I think that Roy Fielding would like to go so far as to simply outlaw it; I'd be fine with that too.

I think moving from the current wording, which is very close to
"required", to a more optional wording is fine. However, I'm not sure
about deprecating it (which basically means that it is a bad idea but
still tolerated for some time), and I don't think outlawing it is a
good idea.


The reason is that the client is almost certain to get it right, and will fail deterministically if it doesn't. For the server, on the other hand, this is easy to get wrong, particularly with the introduction of various kinds of filters in modern web servers.

It's not about client or server. It's about author or server, or
any other producer such as some software. The client (hopefully)
never does anything other than follow the spec.
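As a rough sketch of the quoted point that a receiver relying only on
the XML itself either gets the encoding right or fails
deterministically, assuming Python's standard xml.etree.ElementTree as
a stand-in for a spec-following client (the helper name
parse_without_charset and the sample documents are hypothetical):

```python
import xml.etree.ElementTree as ET


def parse_without_charset(body: bytes) -> ET.Element:
    """Parse XML bytes using only the document's own encoding information."""
    # ET.fromstring accepts bytes; the underlying parser honors the
    # encoding declaration and assumes UTF-8 when there is none.
    return ET.fromstring(body)


# Correctly declared iso-8859-1 document: parses as intended.
declared = (
    '<?xml version="1.0" encoding="iso-8859-1"?><a>\u00e9</a>'
).encode("iso-8859-1")

# Same bytes for the content, but no declaration: not valid UTF-8,
# so the parser raises ET.ParseError instead of guessing.
undeclared = "<a>\u00e9</a>".encode("iso-8859-1")
```

Calling parse_without_charset(declared) yields the element with the
intended text, while parse_without_charset(undeclared) raises
ET.ParseError.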


And since the Web architecture and the XML spec both say that the server's claim has to be taken as authoritative, this is really highly dysfunctional. At the very least, it should be made clear that nobody sending a media-type should send a charset for an XML media-type unless it REALLY REALLY KNOWS what it's sending, and in that case should consider not sending it anyhow.

I don't disagree with this. But I wonder why this would have to be
stressed that much. It's a REALLY REALLY bad idea to create or send
non-wellformed XML, it's a REALLY REALLY bad idea to send documents
with a wrong mime type, and so on. Yet our specs don't contain
"REALLY REALLY bad" very often.


Is there any chance we could do this? It's going to be kind of embarrassing for TAG findings and the Webarch doc to be saying "don't do what this RFC says".

It would definitely be good for things to be in sync. And I'm sure
we can find an adequate compromise.


Regards,    Martin.