ietf-xml-mime

Re: proposed media type registration: application/voicexml+xml

2003-12-16 14:34:20

Hello Max, others,

I noticed just by chance that the 'charset' parameter has been
removed from the registration for application/voicexml+xml,
and also from the one for application/ssml+xml. I have seen
something similar in other recent registration proposals.

Here is why I think this is really not a good idea:

First, we end up with a patchwork in which some types allow
the 'charset' parameter and others don't. Assuming that there
are generic XML processors such as parsers (which is the whole
point of XML), how should such a processor know whether the
'charset' parameter is allowed for a given type, and how should
it keep up with new registrations?
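
To make the first point concrete, here is a minimal sketch in Python
of how a generic processor might read the 'charset' parameter. The
function name charset_for_xml is hypothetical, and treating every
"+xml" subtype uniformly is an assumption for illustration, not
something the registrations require. The point is that the code has
no per-type table saying which registrations permit the parameter.

```python
from email.message import Message


def charset_for_xml(content_type_header):
    """Return the charset parameter of an XML media type, or None.

    Hypothetical helper: it applies one rule to text/xml,
    application/xml and every "+xml" subtype.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()  # lowercased type/subtype
    is_xml = (
        media_type in ("text/xml", "application/xml")
        or media_type.endswith("+xml")
    )
    if not is_xml:
        return None
    # Lowercased charset value, or None if the parameter is absent.
    return msg.get_content_charset()
```

A processor written this way would need a continuously updated list
of exceptions if some "+xml" registrations forbade the parameter.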

Second, there are various scenarios where a charset parameter in
the header is generated automatically, or where it is much
easier to generate it than to avoid it. The following are
examples:
- The classical example of transcoding (converting from one
  encoding to another). Not very frequent these days on the
  Web in general, but still used for Russian/Cyrillic encodings
  and in some mobile phone scenarios.
- Scripting technologies, for example JSP. With JSP, it is much
  more straightforward to produce output with the right encoding
  with a 'charset' parameter in the header than without. The
  reason for this is that JSP can produce any kind of
  output, not only XML, and has to know how to convert
  from the Java-internal encoding to whatever is used on the
  wire. Putting that information in the 'charset' parameter
  in the header then is straightforward; anything else has
  to be done by hand. Hopefully, excluding certain classes
  of content production technologies is not what you want.
- Databases that store content as characters rather than bytes,
  in a single encoding (in many cases UTF-8 throughout),
  and transcode on output. Again, if they use generic technology,
  getting the 'charset' into the header is much more straightforward
  than putting it into the body.
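
The three cases above share one pattern, sketched below in Python
under stated assumptions: the function render_response and the
sample document are hypothetical and do not come from JSP or any
particular database product. The content is held as characters, one
encoding is chosen at output time, and the same value drives both
the byte conversion and the header.

```python
def render_response(text, media_type, encoding):
    """Encode character content and build matching headers.

    Hypothetical generic output layer: it knows nothing about XML,
    so the only place it can record the encoding is the header.
    """
    body = text.encode(encoding)
    headers = {
        "Content-Type": f"{media_type}; charset={encoding}",
        "Content-Length": str(len(body)),
    }
    return headers, body


# Example: Cyrillic content transcoded to KOI8-R on output.
document = '<vxml version="2.0"><!-- Привет --></vxml>'
headers, body = render_response(
    document, "application/voicexml+xml", "koi8-r"
)
```

Recording the encoding in the body instead would require this layer
to parse and rewrite the XML declaration, which is the by-hand work
described above.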

Given all these cases, I think it is not at all appropriate to
remove the 'charset' parameter from the registration, because
doing so would severely limit technology that uses the parameter
in good faith and for good reasons.

In the long run, I think that an update to RFC 3023 should address
these issues in more detail to help content producers understand the
advantages and problems related to charset/encoding information.

Regards,    Martin.


At 17:15 03/08/21 +0200, Max Froumentin wrote:

Hi,

Please consider the attached Internet Draft submission: "The
application/voicexml+xml Media Type" (originating from the Voice
Browser Working Group of the W3C), for review.

Cheers,

Max Froumentin, W3C