Hello Max, others,
I realized just by chance that you had removed the 'charset'
parameter from the registration for application/voicexml+xml,
and also for application/ssml+xml. I have seen something similar
in other recent registration proposals.
Here is why I think this is really not a good idea:
First, we start to get into a patchwork where some types
allow the 'charset' parameter and others don't. Assuming that
there are generic XML processors (e.g. parsers), which is the
whole point of XML, how should such a parser know whether the
'charset' parameter is allowed for a given type, and how should
it keep up with new registrations?
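To make the point concrete, here is a minimal sketch (in Python, purely
for illustration) of how a generic processor could pick the encoding
without knowing anything about the specific media type. It assumes the
RFC 3023 rule for application/xml, where a 'charset' parameter in the
header wins if present, and otherwise the document's own declaration is
used. The function names are made up for this example, and the
declaration sniffing is deliberately simplified (no BOM handling, only
ASCII-compatible encodings).

```python
import re
from email.message import Message


def charset_from_header(content_type):
    """Return the 'charset' parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased; None if absent


def encoding_from_xml_declaration(body):
    """Return the encoding named in the XML declaration, or None.

    Simplified assumption: ASCII-compatible encodings only, BOMs ignored.
    """
    m = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']',
        body,
    )
    return m.group(1).decode("ascii").lower() if m else None


def pick_encoding(content_type, body):
    """Header charset first, then the XML declaration, then UTF-8."""
    return (
        charset_from_header(content_type)
        or encoding_from_xml_declaration(body)
        or "utf-8"
    )
```

Note that nothing in pick_encoding looks at the media type itself. A
per-type rule about whether 'charset' is allowed would need a lookup
table here that has to be updated with every new registration.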
Second, there are various scenarios where a 'charset' parameter
in the header is generated automatically, or where it is much
easier to generate it than to avoid it. The following are
examples:
- The classical example of transcoding (converting from one
encoding to another). Not very frequent these days on the
Web in general, but still used for Russian/Cyrillic encodings
and in some mobile phone scenarios.
- Scripting technologies, for example JSP. With JSP, it is much
more straightforward to produce output with the right encoding
with a 'charset' parameter in the header than without. The
reason for this is that JSP can produce any kind of
output, not only XML, and has to know how to convert
from the Java-internal encoding to whatever is used on the
wire. Putting that information in the 'charset' parameter
in the header then is straightforward; anything else has
to be done by hand. Hopefully, excluding certain classes
of content production technologies is not what you want.
- Databases that store content as characters rather than bytes,
in a single encoding (in many cases e.g. uniformly UTF-8),
and transcode on output. Again, if they use generic technology,
getting the 'charset' into the header is much more straightforward
than putting it into the body.
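The three cases above share one pattern: generic code holds the content
as characters, picks a wire encoding at output time, and has no
knowledge of the document format. A rough sketch of that pattern follows
(Python, for illustration only; render_response and transcode are
hypothetical names, not part of any real framework):

```python
def render_response(text, media_type, wire_encoding):
    """Encode character data for the wire and label it in the header.

    The body is treated as opaque text: this code never inspects or
    rewrites an XML declaration inside it.
    """
    body = text.encode(wire_encoding)
    headers = {"Content-Type": f"{media_type}; charset={wire_encoding}"}
    return headers, body


def transcode(body, media_type, source_encoding, target_encoding):
    """Convert a byte body from one encoding to another, relabelling it."""
    text = body.decode(source_encoding)
    return render_response(text, media_type, target_encoding)


# Example: content stored as characters, served as KOI8-R, then
# transcoded to UTF-8 by an intermediary.
headers, body = render_response(
    "<prompt>Привет</prompt>", "application/voicexml+xml", "koi8-r"
)
headers_utf8, body_utf8 = transcode(
    body, "application/voicexml+xml", "koi8-r", "utf-8"
)
```

Updating the header is a one-line change in such code. Putting the
encoding into the body instead would require the generic layer to parse
and rewrite the XML declaration, which is exactly the format-specific
knowledge it does not have.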
Given all these cases, I think it's not at all appropriate to
remove the 'charset' parameter from the registration, because
doing so would severely limit technology that is using the
parameter in good faith and for good reasons.
In the long run, I think that an update to RFC 3023 should address
these issues in more detail to help content producers understand the
advantages and problems related to charset/encoding information.
Regards, Martin.
At 17:15 03/08/21 +0200, Max Froumentin wrote:
Hi,
Please consider the attached Internet Draft submission: "The
application/voicexml+xml Media Type" (originating from the Voice
Browser Working Group of the W3C), for review.
Cheers,
Max Froumentin, W3C