Regarding:
http://www.ietf.org/internet-drafts/draft-sbml-media-type-00.txt
Chris Lilley wrote:
Do the existing SBML-consuming tools honor the charset
parameter when SBML is sent over HTTP or email? And where the
charset parameter disagrees with the encoding in the XML
encoding declaration, do these tools rewrite the XML when
saving it to disk?
Note that, although charset is optional in this registration,
RFC 3023 still imposes requirements on software in the
*absence* of a charset.
If not, I suggest removing this optional parameter as follows:
There is no charset parameter. Character handling has
identical semantics to the case where the charset parameter
of the "application/xml" media type is omitted, as
described in [RFC3023].
My understanding is that most current SBML-consuming tools do
not honor the charset parameter, but that's only because most
don't get involved with MIME. Thanks for pointing out the RFC
3023 statements regarding the absence of a parameter! Indeed
us-ascii seems unnecessarily limiting.
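
To make the RFC 3023 behavior under discussion concrete, here is a
minimal sketch of how a receiver might pick the character encoding.
It assumes the rules as I read them: an explicit charset parameter is
authoritative; text/xml without one defaults to us-ascii; and
application/xml (and, by assumption, application/sbml+xml) without one
falls back to the XML rules (byte order mark, then encoding
declaration, then UTF-8). The function name and the simplified
declaration parsing are illustrative only, and the regular expression
only handles ASCII-compatible encodings.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document. Simplified: ASCII-compatible encodings only.
_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def effective_charset(media_type, charset_param, body):
    """Return the encoding a receiver would assume (hypothetical helper)."""
    # 1. An explicit charset parameter wins over the XML declaration.
    if charset_param:
        return charset_param.lower()
    # 2. text/* with no charset parameter defaults to us-ascii.
    if media_type.lower().startswith("text/"):
        return "us-ascii"
    # 3. application/* with no charset: fall back to the XML rules.
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # XML default when nothing else is specified
```

With this reading, a Latin-1 document sent as application/sbml+xml
with no charset parameter keeps its declared encoding, while the same
document sent with charset=utf-8 would be treated as UTF-8, which is
the disagreement case Chris describes.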
Andrew Finney on sbml-discuss at caltech.edu responded:
The SBML Level 2 document states in section 4.1:
"The character encoding for SBML is UTF-8. SBML documents
should include the encoding attribute with the value UTF-8
in the XML prologue."
SBML Level 1 doesn't have this text as far as I remember.
I don't know if that helps.
My vote would be to specify that SBML is always encoded in
UTF-8 and follow through with that in the MIME registration.
Would that make SBML unusual for an XML format? Do MIME
processors need the flexibility to transform UTF-8 into
US-ASCII? That is, what are the concrete implications? If we do
this, are we imposing an unrealistic restriction on
transmission that would probably not happen in practice?
Can we make the charset non-optional and restricted to UTF-8?
If so, this would avoid the issue of rewriting the XML on disk.
If we restrict the charset, does that make interpreting the
incoming stream easier, because it is in only one charset?
Any thoughts? Restricting sbml+xml to a UTF-8 encoding would
certainly be simple, imposing no new requirements on
SBML-consuming tools. But would this go against the grain of
the relevant RFCs and future evolvability?
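
For illustration only, a UTF-8-only rule could be checked by a
receiver roughly as follows. This is a sketch under the assumption
that the registration would require UTF-8 everywhere; the function
name and the messages are made up for this example, and the
declaration parsing is deliberately simplified.

```python
import re

_BOM = b"\xef\xbb\xbf"
# Simplified match for the encoding pseudo-attribute of the XML declaration.
_ENCODING = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([^"\']+)["\']')


def check_utf8_only(body, charset_param=None):
    """Return a list of violations of a hypothetical UTF-8-only rule."""
    problems = []
    # A charset parameter, if sent at all, would have to say UTF-8.
    if charset_param is not None and charset_param.lower() != "utf-8":
        problems.append("charset parameter is not UTF-8")
    # The XML declaration, if it names an encoding, would have to agree.
    text = body[len(_BOM):] if body.startswith(_BOM) else body
    match = _ENCODING.match(text)
    if match and match.group(1).lower() != b"utf-8":
        problems.append("XML declaration does not say UTF-8")
    # The bytes themselves must decode as UTF-8.
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("body is not valid UTF-8")
    return problems
```

A tool that already reads and writes only UTF-8 would pass this check
without changes, which is the sense in which the restriction imposes
no new requirements.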
Ben Kovitz
Caltech
bkovitz at caltech.edu
http://sbml.org