ietf-xml-mime

Re: proposed media type registration: application/voicexml+xml

2003-12-17 13:44:25

Hello Ben,

[Putting ietf-xml-mime(_at_)imc(_dot_)org back on the cc list, because
I think a fair amount of this discussion may make its way into
the next version of RFC 3023 in one way or another.]

At 18:17 03/12/17 +0000, ben(_at_)morrow(_dot_)me(_dot_)uk wrote:
> At 4pm on 16/12/03 Martin Duerst wrote:
> > I just by chance realized that you had removed the 'charset'
> > parameter from the registration for application/voicexml+xml,
> > and also for application/ssml+xml. I have found something similar
> > in other recent registration proposals.
>
> The usual intent when omitting the 'charset' parameter from the
> registration is that the XML *must* be encoded in UTF-8 or UTF-16.

This is an interesting idea. Can you point to any actual
registrations where this is the case?


> If this is done, then the entity will not need to include a charset
> declaration in the body either, and will be understood everywhere.

Yes, this is true for application/foo+xml. For text/foo+xml,
it is not true, but then I hope nobody is talking about that anyway.
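
To illustrate why no charset information is needed in the UTF-8/UTF-16
case: XML 1.0 requires UTF-16 entities to begin with a byte order mark,
and treats an entity with neither a BOM nor an encoding declaration as
UTF-8, so a processor can tell the two apart from the first bytes.
A minimal Python sketch (the function name is hypothetical, and UTF-32
and all other encodings are deliberately ignored):

```python
import codecs

def sniff_utf8_or_utf16(data: bytes) -> str:
    """Distinguish UTF-8 from UTF-16 using only the leading bytes.

    Assumes the entity is one of the two and carries no external
    charset information and no encoding declaration.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # No BOM: XML 1.0 mandates a BOM for UTF-16, so this must be UTF-8.
    return "utf-8"
```

For text/foo+xml this does not carry over, because RFC 3023 treats a
missing 'charset' parameter on text/* types as US-ASCII.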


> I would suggest that those registrations which do not
> specify a charset be updated to state that encodings other than UTF-8
> and UTF-16 may not be used.

I'm not really sure that this helps. It would not address any of
the points I have brought up:

- Generic XML processors would still accept 'charset' parameters,
  even if the registration forbade them. They also would still
  accept content in other encodings (with considerable variation,
  of course), whether declared with a 'charset' parameter or
  with the encoding pseudo-attribute on an XML declaration.
- Technology such as JSP and databases would still produce
  'charset' parameters, even if the registration didn't allow them.
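
The first point can be sketched roughly as follows. This is only an
illustration of how a generic processor following RFC 3023 might pick
an encoding without ever consulting the individual registration; the
function name and the simplified regular expression are assumptions,
not any particular parser's behaviour:

```python
import re
from email.message import Message

# Simplified match for the encoding pseudo-attribute of an XML declaration.
_ENC_DECL = re.compile(
    rb"<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)

def effective_encoding(content_type: str, body: bytes) -> str:
    """Pick the encoding a generic XML processor would likely use."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset:
        # A 'charset' parameter wins, whatever the registration says.
        return str(charset).lower()
    if msg.get_content_maintype() == "text":
        # RFC 3023: text/* without 'charset' means US-ASCII.
        return "us-ascii"
    if body.startswith((b"\xfe\xff", b"\xff\xfe")):
        return "utf-16"
    match = _ENC_DECL.match(body)
    if match:
        # Other encodings are accepted via the XML declaration too.
        return match.group(1).decode("ascii").lower()
    return "utf-8"
```

With this logic, "application/voicexml+xml; charset=ISO-8859-1" is
processed as ISO-8859-1 regardless of whether the registration of
application/voicexml+xml defines a 'charset' parameter.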

There may be cases where one wants a certain media type to be
restricted to UTF-8 and UTF-16 (or even in some cases only
one of them), but just saying
"no charset parameter on media type == UTF-8 or UTF-16 only"
doesn't really cut it.


Regards,    Martin.