On Thu, 14 Jul 2005 08:49:12 +0200, Martin Duerst <duerst(_at_)it(_dot_)aoyama(_dot_)ac(_dot_)jp>
wrote:
I agree with that reviewer that the type should not contain
'+' characters except before 'xml'. All other subtypes have used '-'
as a separator. The '+' separator was specifically introduced to
express the fact that the '+xml' part is something more than
a simple subtype.
They have used '-' for an entirely different purpose, as a space (word)
substitution. Examples are application/java-byte-code (Java byte code),
octet-stream (octet stream), ringing-tones (ringing tones),
x-nokia-9000-communicator-add-on-software (Nokia 9000 Communicator add-on
software).
XHTML+Voice is not "XHTML Voice". Turning the content type for XHTML+Voice
into xhtml-voice is bound to cause endless confusion. "Remember that in
the MIME type the plus turns into a minus" is exactly the kind of rule
that causes authors to be frustrated with Internet standards.
Although there is probably nothing in RFC 3023 explicitly
disallowing the use of '+' for "arbitrary use", I think that
the whole rationale for '+xml' in Appendix A of RFC 3023
(in particular http://www.ietf.org/rfc/rfc3023.txt, A.12)
seems to indicate that it shouldn't be done.
What they did was to examine whether their use of '+' caused problems with
the existing specs or implementations. In particular they discounted the use
of parameters: even though these were a feature of the spec, the existing
implementations didn't handle them properly.
This examination is something we should do too. If xhtml+voice+xml breaks
existing implementations (it doesn't break existing specs, including
RFC 3023), we shouldn't do it.
But this content type has had considerable exposure for several years in
its x-xhtml+voice+xml form and we have not encountered a single problem
thus far. I am not claiming we have exposed it to more than a fraction of
the complete net. But given that no processor should automatically
apply XML processing to subtypes such as
ccxml
xmlcc
xml+zz
z+xml+z
z+xmlz
z-xml
where 'xml' is not a trailing '+xml' suffix (it sits in the middle of the
string or lacks the '+'), it follows that any algorithm that breaks on
xhtml+voice+xml is bound to break on other cases too.
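
As a rough illustration of the point above, here is a sketch in Python of
a check that treats only a trailing '+xml' as a signal for XML processing.
The function name is made up for this example and is not taken from
RFC 3023 or from any existing implementation.

```python
def has_xml_suffix(subtype: str) -> bool:
    """Return True only when the subtype ends with the '+xml' suffix."""
    # Hypothetical helper: media type names are case-insensitive,
    # so the comparison is done in lowercase.
    return subtype.lower().endswith("+xml")


# The subtypes listed above must not trigger XML processing.
non_xml = ["ccxml", "xmlcc", "xml+zz", "z+xml+z", "z+xmlz", "z-xml"]
assert not any(has_xml_suffix(s) for s in non_xml)

# An extra '+' earlier in the name does not disturb the suffix check.
assert has_xml_suffix("xhtml+voice+xml")
assert has_xml_suffix("x-xhtml+voice+xml")
```

A check written this way never looks at anything before the final '+xml',
so it cannot tell xhtml+voice+xml apart from any other '+xml' subtype.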
Furthermore, what would this hypothetical break imply? Unlike parameters
(in the spec, poor implementation record), all processors can handle media
types that contain one or more '+' (RFC 3023 is dependent on this too).
A failure would be either a false positive ("it has a '+' so it must be
XML") or a false negative ("the character following '+' is not 'x'"). A
false positive causes no harm (this format is XML); a false negative would
lead to the content being processed as application/octet-stream, which in
practice is something the agent is highly likely to handle rationally too.
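
The two failure modes can be sketched in Python as follows. Both faulty
heuristics are invented here purely for illustration (I am not claiming
any real processor works this way), and suffix_check is likewise a
hypothetical name for the intended behaviour.

```python
def naive_any_plus(subtype: str) -> bool:
    # Faulty heuristic 1 (assumed): "it has a '+' so it must be XML".
    return "+" in subtype


def naive_first_plus(subtype: str) -> bool:
    # Faulty heuristic 2 (assumed): only inspects the character that
    # follows the first '+' and expects it to be 'x'.
    _, sep, rest = subtype.partition("+")
    return sep == "+" and rest[:1].lower() == "x"


def suffix_check(subtype: str) -> bool:
    # Intended behaviour: XML processing only for a trailing '+xml'.
    return subtype.lower().endswith("+xml")


subtype = "xhtml+voice+xml"
assert naive_any_plus(subtype)        # right answer, for the wrong reason
assert not naive_first_plus(subtype)  # false negative: sees 'v', not 'x'
assert suffix_check(subtype)          # the suffix check accepts it
```

For this particular type the first heuristic happens to give the right
answer, and the second merely fails to recognise the content as XML.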
--
Jonny Axelsson, Core Technology, Opera Software ASA