Re: Registration of media type application/xhtml-voice+xml

On Thu, 14 Jul 2005 08:49:12 +0200, Martin Duerst <duerst(_at_)it(_dot_)aoyama(_dot_)ac(_dot_)jp> wrote:

I agree with that reviewer that the type should not contain
'+' characters except before 'xml'. All other subtypes have used '-'
as a separator. The '+' separator was specifically introduced to
express the fact that the '+xml' part is something more than
a simple subtype.

Those subtypes have used '-' for an entirely different purpose, as a space (word) substitution. Examples are application/java-byte-code (Java byte code), octet-stream (octet stream), ringing-tones (ringing tones), x-nokia-9000-communicator-add-on-software (Nokia 9000 Communicator add-on software).

XHTML+Voice is not "XHTML Voice". Turning the content type for XHTML+Voice into xhtml-voice is bound to cause endless confusion. "Remember that in the MIME type the plus turns into a minus" is exactly the kind of rule that causes authors to be frustrated with Internet standards.

Although there is probably nothing in RFC 3023 explicitly
disallowing the use of '+' for "arbitrary use", I think that
the whole rationale for '+xml' in Appendix A of RFC 3023
(in particular http://www.ietf.org/rfc/rfc3023.txt, A.12)
seem to indicate that it shouldn't be done.

What they did was to examine whether their use of '+' caused problems with the existing specs or implementations. In particular they discounted the use of parameters: even though parameters were part of the spec, the existing implementations didn't handle them properly.

This examination is something we should do too. If xhtml+voice+xml breaks existing implementations (it doesn't break existing specs, including RFC 3023) we shouldn't do it.

But this content type has had considerable exposure for several years in its x-xhtml+voice+xml form and we have not encountered a single problem thus far. I am not claiming we have exposed it to more than a fraction of the complete net. But consider that no processor should automatically apply XML processing to subtypes like e.g.


ccxml
xmlcc
xml+zz
z+xml+z
z+xmlz
z-xml

where (+)xml is in the middle of the string and not at the end, it follows that any algorithm that breaks on xhtml+voice+xml is bound to break on other cases too.
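To make the point concrete, here is a minimal sketch of the kind of check I have in mind, assuming a processor that decides on XML handling purely by whether the subtype ends in '+xml'. The function name is_xml_by_suffix is made up for illustration and is not taken from any spec or implementation.

```python
def is_xml_by_suffix(media_type: str) -> bool:
    """Return True if the subtype ends with '+xml' (hypothetical helper)."""
    # Drop any parameters (e.g. '; charset=utf-8') and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    # Split 'type/subtype'; only the subtype is inspected.
    _type, _slash, subtype = essence.partition("/")
    return subtype.endswith("+xml")


# The subtypes listed above must not be treated as XML by this rule.
for subtype in ["ccxml", "xmlcc", "xml+zz", "z+xml+z", "z+xmlz", "z-xml"]:
    print(subtype, is_xml_by_suffix("application/" + subtype))

# A subtype with an extra '+' before the suffix is still recognised.
print(is_xml_by_suffix("application/xhtml+voice+xml"))
```

A check written this way only looks at the end of the subtype, so an additional '+' earlier in the string makes no difference to it.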

Furthermore, what would this hypothetical break imply? Unlike parameters (in the spec, poor implementation record), all processors can handle media types that contain one or more '+' (RFC 3023 is dependent on this too).

A failure would either be a false positive ("it has a '+' so it must be XML") or a false negative ("the character following '+' is not 'x'"). A false positive causes no harm (this format is XML); a false negative would lead to processing as application/octet-stream, which in practice is something the agent is highly likely to handle rationally too.
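The two failure modes can be sketched as a pair of deliberately naive heuristics. Both functions are hypothetical, written only to illustrate the argument; I am not claiming any real agent works this way.

```python
def any_plus_means_xml(subtype: str) -> bool:
    """Naive rule: 'it has a '+' so it must be XML' (hypothetical)."""
    return "+" in subtype


def first_plus_followed_by_x(subtype: str) -> bool:
    """Naive rule: the character after the first '+' must be 'x' (hypothetical)."""
    _head, plus, rest = subtype.partition("+")
    return plus == "+" and rest.startswith("x")


subtype = "xhtml+voice+xml"

# Says XML, which happens to be right for this format: harmless.
print(any_plus_means_xml(subtype))

# Says not XML for this format: the false negative case, where the
# agent would fall back to treating the content as opaque data.
print(first_plus_followed_by_x(subtype))
```

Neither rule is a correct reading of RFC 3023, and both would also misjudge some of the subtypes listed earlier, such as xml+zz.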


--
Jonny Axelsson, Core Technology, Opera Software ASA