ietf-xml-mime

Re: Internet draft for sbml+xml media type

2003-10-14 13:50:54

Regarding: 
 
http://www.ietf.org/internet-drafts/draft-sbml-media-type-00.txt  
 
Chris Lilley wrote: 
 
Do the existing SBML-consuming tools honor the charset 
parameter when SBML is sent over HTTP or email?  And where the 
charset parameter disagrees with the encoding in the XML 
encoding declaration, do these tools rewrite the XML when 
saving it to disk? 
 
Note that, although charset is optional in this registration, 
RFC 3023 still imposes requirements on software in the 
*absence* of a charset. 
 
If not, I suggest removing this optional parameter as follows: 
 
   There is no charset parameter. Character handling has 
   identical semantics to the case where the charset parameter 
   of the "application/xml" media type is omitted, as 
   described in [RFC3023]. 
 
My understanding is that most current SBML-consuming tools do 
not honor the charset parameter, but that's only because most 
don't get involved with MIME.  Thanks for pointing out the RFC 
3023 statements regarding the absence of a parameter!  Indeed 
us-ascii seems unnecessarily limiting. 
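
To make the first question concrete, here is a minimal Python sketch 
of the disagreement check. It is not taken from any existing SBML 
tool. The function names are hypothetical, the full media type name 
"application/sbml+xml" is assumed from the draft title, and the 
regular expression only handles ASCII-compatible encodings. 

```python
import re
from email.message import Message

# Encoding declaration at the very start of an ASCII-compatible byte stream.
_DECL = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def charset_param(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def declared_encoding(body):
    """Return the lowercased encoding named in the XML declaration, or None."""
    if body.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        body = body[3:]
    match = _DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def charset_conflict(content_type, body):
    """True when both sources name an encoding and the names differ.

    Names are compared literally; aliases such as "utf8" and "utf-8"
    are not normalized in this sketch.
    """
    header = charset_param(content_type)
    declared = declared_encoding(body)
    return header is not None and declared is not None and header != declared
```

A tool that honors the charset parameter would have to resolve such a 
conflict, for example by rewriting the declaration, before saving the 
document to disk without its MIME headers. 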
 
Andrew Finney on sbml-discuss at caltech.edu responded: 
 
  The SBML Level 2 document states in section 4.1: 
   
    "The character encoding for SBML is UTF-8. SBML documents 
    should include the encoding attribute with the value UTF-8 
    in the XML prologue." 
   
  SBML Level 1 doesn't have this text as far as I remember. 
   
  I don't know if that helps. 
   
  My vote would be to specify that SBML is always encoded in 
  UTF-8 and follow through with that in the MIME registration. 
 
  Would that make SBML unusual for an XML format?  Do MIME 
  processors need the flexibility to transform UTF-8 into 
  us-ascii?  That is, what are the concrete implications?  If we 
  do this, are we imposing an unrealistic restriction on 
  transmission, one that would probably not be honored in practice? 
   
  Can we make the char set non-optional and restricted to UTF-8? 
  If yes, this would avoid the issue with rewriting to disk. 
  If we restrict the char set, does that make interpreting the 
  incoming stream simpler, because it is in only one char set? 
 
Any thoughts?  Restricting sbml+xml to a UTF-8 encoding would 
certainly be simple, imposing no new requirements on 
SBML-consuming tools.  But would this go against the grain of 
the relevant RFCs and future evolvability? 
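
As a rough illustration of what a UTF-8-only rule would ask of a 
consuming tool, here is a minimal Python sketch. The function name is 
hypothetical, and the rule it checks (the bytes decode as UTF-8, and 
any encoding declaration says UTF-8) is only one possible reading of 
the restriction, not text from the draft or from the SBML 
specification. 

```python
import re

# Encoding declaration at the start of the decoded document text.
_DECL = re.compile(
    r'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def is_utf8_sbml(body):
    """True if body decodes as UTF-8 and any encoding declaration agrees."""
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        return False  # e.g. UTF-16 input, or Latin-1 with non-ASCII bytes
    # Ignore a leading byte order mark before looking for the declaration.
    match = _DECL.match(text.lstrip("\ufeff"))
    # A missing declaration is acceptable: XML defaults to UTF-8 without one.
    return match is None or match.group(1).lower() == "utf-8"
```

Under this reading, a receiver needs no transcoding logic at all: a 
document either passes the check or is rejected. 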
 
 
Ben Kovitz 
Caltech 
bkovitz at caltech.edu 
http://sbml.org