ietf-xml-mime

RE: proposed media type registration: application/voicexml+xml

2003-12-16 14:41:32

Seconded, for all the reasons given by Martin. Please restore the charset
parameter.

Regards,

-- 
François

-----Original Message-----
From: Martin Duerst [mailto:duerst(_at_)w3(_dot_)org]
Sent: 16 December 2003 16:33
To: Max Froumentin; ietf-types(_at_)iana(_dot_)org
Cc: w3c-archive(_at_)w3(_dot_)org; ietf-xml-mime(_at_)imc(_dot_)org; Ben Kovitz;
Linus Walleij
Subject: Re: proposed media type registration: application/voicexml+xml



Hello Max, others,

I just by chance realized that you had removed the 'charset'
parameter from the registration for application/voicexml+xml,
and also for application/ssml+xml. I have found something similar
in other recent registration proposals.

Here is why I think this is really not a good idea:

First, we start to get into a patchwork where some types
allow the 'charset' parameter, and others don't. Assuming that there
is something like generic XML processors (e.g. parsers)
(which is the whole point of XML), how should such a parser
know whether the 'charset' parameter is allowed or not,
and keep up with new registrations?
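
To illustrate the point about generic processors, here is a minimal
Python sketch of a processor that treats every XML media type the same
way: it honours a 'charset' parameter whenever one is present and never
consults a per-type list of registrations. The function names and the
"+xml" suffix check are assumptions made for this illustration, not
part of any registration or of RFC 3023.

```python
from email.message import Message

# Media types treated as XML without the "+xml" suffix (assumed list).
GENERIC_XML_TYPES = {"text/xml", "application/xml"}


def parse_content_type(header_value):
    """Split a Content-Type header into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset lowercased, or None.
    return msg.get_content_type(), msg.get_content_charset()


def is_xml_media_type(media_type):
    """Hypothetical generic check: no per-registration knowledge needed."""
    return media_type in GENERIC_XML_TYPES or media_type.endswith("+xml")


def external_encoding(header_value):
    """Return the encoding announced in the header for XML types, else None.

    A real parser would fall back to the byte order mark and the XML
    declaration when this returns None; that step is omitted here.
    """
    media_type, charset = parse_content_type(header_value)
    if is_xml_media_type(media_type) and charset:
        return charset
    return None
```

With this approach, application/voicexml+xml and application/ssml+xml
are handled exactly like any other "+xml" type, which only works if the
parameter is allowed uniformly.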

Second, there are various scenarios where a charset parameter in
the header is automatically generated, or where it's much
easier to generate it than to avoid it. The following are
examples:
- The classical example of transcoding (converting from one
   encoding to another). Not very frequent these days on the
   Web in general, but still used for Russian/Cyrillic encodings
   and in some mobile phone scenarios.
- Scripting technologies, for example JSP. With JSP, it is much
   more straightforward to produce output with the right encoding
   with a 'charset' parameter in the header than without. The
   reason for this is that JSP can produce any kind of
   output, not limited to XML, and has to know how to convert
   from the Java-internal encoding to whatever is used on the
   wire. Putting that information in the 'charset' parameter
   in the header then is straightforward; anything else has
   to be done by hand. Hopefully, excluding certain classes
   of content production technologies is not what you want.
- Databases that store content as characters rather than bytes,
   in a single encoding (in many cases e.g. uniformly UTF-8),
   and transcode on output. Again, if they use generic technology,
   getting the 'charset' into the header is much more straightforward
   than putting it into the body.
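
The scenarios above share one pattern: content is held as characters,
encoded to bytes only on output, and the chosen encoding is recorded in
the header by the same generic code. A minimal Python sketch of that
pattern follows; build_response is a hypothetical helper written for
this illustration and does not correspond to JSP or to any particular
database product.

```python
def build_response(text, media_type, charset):
    """Encode character data for the wire and label it in the header.

    The output layer needs no knowledge of the body format: it works the
    same for XML, HTML, or plain text, which is why emitting 'charset'
    in the header is the easy path for generic technology.
    """
    body = text.encode(charset)
    headers = {
        "Content-Type": f"{media_type}; charset={charset}",
        "Content-Length": str(len(body)),
    }
    return headers, body


# Example: the same stored text served in a Cyrillic legacy encoding.
headers, body = build_response(
    "Привет", "application/voicexml+xml", "koi8-r"
)
```

Putting the encoding into the body instead would require the output
layer to understand and rewrite the XML declaration, which is the "by
hand" work described above.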

Given all these cases, I think it's not at all appropriate to
remove the 'charset' parameter from the registration, because
it would severely limit the use of technology that in good
faith, and with good reasons, is using it.

In the long run, I think that an update to RFC 3023 should address
these issues in more detail to help content producers understand the
advantages and problems related to charset/encoding information.

Regards,    Martin.


At 17:15 03/08/21 +0200, Max Froumentin wrote:

Hi,

Please consider the attached Internet Draft submission: "The
application/voicexml+xml Media Type" (originating from the Voice
Browser Working Group of the W3C), for review.

Cheers,

Max Froumentin, W3C