Re: [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03

(with half a Gen-art hat on...)
Does the WG really want to revisit the anguished discussions that resulted in 
the changes to Section 8.1 of draft-ietf-json-rfc4627bis between versions 07 
and 08 back in late November 2013?
See https://www.ietf.org/mail-archive/web/json/current/msg02053.html and many, 
many messages before this.
Cheers,
Elwyn


-------- Original message --------
From: Peter Cordell <petejson(_at_)codalogic(_dot_)com>
Date: 12/03/2017  09:06  (GMT+00:00)
To: Ned Freed <ned(_dot_)freed(_at_)mrochek(_dot_)com>, Julian Reschke 
<julian(_dot_)reschke(_at_)gmx(_dot_)de>
Cc: draft-ietf-jsonbis-rfc7159bis(_dot_)all(_at_)ietf(_dot_)org, John Cowan 
<cowan(_at_)ccil(_dot_)org>, ietf(_at_)ietf(_dot_)org, 
secdir(_at_)ietf(_dot_)org, json(_at_)ietf(_dot_)org, Benjamin Kaduk 
<kaduk(_at_)mit(_dot_)edu>
Subject: Re: [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03

On 11/03/2017 15:41, Ned Freed wrote:

On 2017-03-11 03:08, John Cowan wrote:


On Thu, Mar 9, 2017 at 12:53 AM, Benjamin Kaduk <kaduk(_at_)mit(_dot_)edu
<mailto:kaduk(_at_)mit(_dot_)edu>> wrote:

     If that's what's supposed to happen, it should probably be more
     clear, yes.  (But aren't there texts that have valid interpretations
     in multiple encodings?)


Not if the content is well-formed JSON and the only possible encodings
are UTF-8, UTF-16, and UTF-32.  It suffices to examine the first four
bytes of the input.  If there are no NUL bytes in the first four bytes,
it is UTF-8; if there are two NUL bytes, it is UTF-16; if there are
three NUL bytes, it is UTF-32.  This works because the grammar requires
the first character to be in the ASCII repertoire, and the NUL
*character* (U+0000) is not allowed at all.
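
For illustration only, the NUL-counting rule described above might be sketched as follows. The function name guess_json_encoding is hypothetical, and the sketch assumes no byte order mark is present. It returns None for any NUL count the rule does not name (for example, a UTF-16 text whose second character is outside ASCII yields only one NUL in the first four bytes), and for inputs shorter than four bytes it only commits to UTF-8.

```python
from typing import Optional


def guess_json_encoding(data: bytes) -> Optional[str]:
    """Guess the encoding family of a JSON text from its first four bytes.

    Hypothetical helper: implements only the NUL-counting rule from the
    message (0 NULs -> UTF-8, 2 -> UTF-16, 3 -> UTF-32). It does not
    determine byte order and does not handle a byte order mark.
    """
    head = data[:4]
    if len(head) < 4:
        # Too short for the four-byte rule; only the no-NUL case is
        # treated as decidable here (an assumption of this sketch).
        return "utf-8" if head and 0 not in head else None
    nul_count = head.count(0)
    # Any other count (1 or 4) is outside the stated rule, so report None.
    return {0: "utf-8", 2: "utf-16", 3: "utf-32"}.get(nul_count)
```

A caller would treat a None result as "undetermined" and fall back to some other policy, such as rejecting the input.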

Good explanation. Maybe the spec should include it.


+1

This exact issue just came up in a media type review, where someone
specified a charset parameter because they weren't aware of this algorithm.

It would be very helpful to have this text in the RFC.



Although it does need slightly more detail to take into account 
endian-ness in the case of UTF-16 and -32.

The XML spec may offer some example text:

https://www.w3.org/TR/2008/REC-xml-20081126/#sec-guessing
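
As a rough sketch of the extra detail endian-ness would need, the position of the NUL bytes (not just their count) can be matched against a small table, similar in spirit to the pattern table in Section 3 of RFC 4627. The function name below is hypothetical. The sketch assumes there is no byte order mark and, for the UTF-16 rows, that the first two characters are both ASCII; anything else is reported as None.

```python
from typing import Optional

# Keys are (is_nul, is_nul, is_nul, is_nul) for the first four bytes.
# Values are Python codec names; the table itself is an illustration,
# not text proposed for the draft.
_NUL_PATTERNS = {
    (True, True, True, False): "utf-32-be",   # 00 00 00 xx
    (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
    (False, True, True, True): "utf-32-le",   # xx 00 00 00
    (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    (False, False, False, False): "utf-8",    # xx xx xx xx
}


def guess_json_encoding_with_byte_order(data: bytes) -> Optional[str]:
    """Guess encoding and byte order from where NULs fall in the first four bytes.

    Hypothetical helper: returns None for short input or for any
    pattern not listed in the table above.
    """
    if len(data) < 4:
        return None
    pattern = tuple(byte == 0 for byte in data[:4])
    return _NUL_PATTERNS.get(pattern)
```

The returned name can be passed straight to bytes.decode(), for example data.decode(guess_json_encoding_with_byte_order(data)) once the result has been checked for None.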

Pete Cordell
Codalogic Ltd
Read & write XML in C++, http://www.xml2cpp.com