(with half a Gen-art hat on...)
Does the WG really want to revisit the anguished discussions that resulted in
the changes to Section 8.1 of draft-ietf-json-rfc4627bis between versions 07
and 08 back in late November 2013?
See https://www.ietf.org/mail-archive/web/json/current/msg02053.html and many,
many messages before this.
Cheers,
Elwyn
-------- Original message --------
From: Peter Cordell <petejson(_at_)codalogic(_dot_)com>
Date: 12/03/2017 09:06 (GMT+00:00)
To: Ned Freed <ned(_dot_)freed(_at_)mrochek(_dot_)com>, Julian Reschke
<julian(_dot_)reschke(_at_)gmx(_dot_)de>
Cc: draft-ietf-jsonbis-rfc7159bis(_dot_)all(_at_)ietf(_dot_)org, John Cowan
<cowan(_at_)ccil(_dot_)org>, ietf(_at_)ietf(_dot_)org,
secdir(_at_)ietf(_dot_)org, json(_at_)ietf(_dot_)org, Benjamin Kaduk
<kaduk(_at_)mit(_dot_)edu>
Subject: Re: [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03
On 11/03/2017 15:41, Ned Freed wrote:
On 2017-03-11 03:08, John Cowan wrote:
On Thu, Mar 9, 2017 at 12:53 AM, Benjamin Kaduk <kaduk(_at_)mit(_dot_)edu
<mailto:kaduk(_at_)mit(_dot_)edu>> wrote:
If that's what's supposed to happen, it should probably be more
clear, yes. (But aren't there texts that have valid interpretations
in multiple encodings?)
Not if the content is well-formed JSON and the only possible encodings
are UTF-8, UTF-16, and UTF-32. It suffices to examine the first four
bytes of the input. If there are no NUL bytes in the first four bytes,
it is UTF-8; if there are two NUL bytes, it is UTF-16; if there are
three NUL bytes, it is UTF-32. This works because the grammar requires
the first character to be in the ASCII repertoire, and the NUL
*character* (U+0000) is not allowed at all.
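As an illustration only, the NUL-counting rule described above might be sketched in Python as follows. The function name guess_encoding_by_nul_count is hypothetical, and rejecting inputs shorter than four bytes, or with any other NUL count, is an assumption of this sketch rather than anything stated in the thread.

```python
def guess_encoding_by_nul_count(data: bytes) -> str:
    """Guess the encoding family of a JSON text from its first four bytes.

    Follows the rule as stated: no NUL bytes means UTF-8, two mean UTF-16,
    three mean UTF-32. Byte order is not determined here.
    """
    head = data[:4]
    if len(head) < 4:
        # Assumption of this sketch: the rule needs four bytes to work with.
        raise ValueError("need at least four bytes to apply the rule")

    nul_count = head.count(0)
    if nul_count == 0:
        return "utf-8"
    if nul_count == 2:
        return "utf-16"
    if nul_count == 3:
        return "utf-32"
    # One or four NUL bytes does not fit the rule; refuse rather than guess.
    raise ValueError(f"unexpected number of NUL bytes: {nul_count}")
```

For example, passing '{"a":1}'.encode("utf-16-le") gives "utf-16", since the first four bytes are 7B 00 22 00.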
Good explanation. Maybe the spec should include it.
+1
This exact issue just came up in a media type review, where someone
specified a charset parameter because they weren't aware of this algorithm.
It would be very helpful to have this text in the RFC.
It does, however, need slightly more detail to take into account
endianness in the case of UTF-16 and UTF-32.
The XML spec may offer some example text:
https://www.w3.org/TR/2008/REC-xml-20081126/#sec-guessing
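To make the endianness point concrete, here is a rough Python sketch that looks at where the NUL bytes fall rather than only how many there are. The byte patterns are the ones tabulated in Section 3 of RFC 4627; the function name detect_json_encoding is hypothetical, and raising ValueError for short or unrecognised input is an assumption of this sketch.

```python
def detect_json_encoding(data: bytes) -> str:
    """Return a Python codec name for a JSON text, including byte order."""
    if len(data) < 4:
        # Assumption of this sketch: shorter inputs are not handled.
        raise ValueError("need at least four bytes to detect the encoding")

    # True marks a NUL byte, False marks any other byte.
    pattern = tuple(byte == 0 for byte in data[:4])

    patterns = {
        (True, True, True, False): "utf-32-be",    # 00 00 00 xx
        (False, True, True, True): "utf-32-le",    # xx 00 00 00
        (True, False, True, False): "utf-16-be",   # 00 xx 00 xx
        (False, True, False, True): "utf-16-le",   # xx 00 xx 00
        (False, False, False, False): "utf-8",     # xx xx xx xx
    }
    try:
        return patterns[pattern]
    except KeyError:
        raise ValueError("NUL byte pattern matches no expected encoding") from None
```

The returned name can be handed straight to bytes.decode, for example data.decode(detect_json_encoding(data)).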
Pete Cordell
Codalogic Ltd
Read & write XML in C++, http://www.xml2cpp.com