ietf

Re: [secdir] [Json] secdir review of draft-ietf-jsonbis-rfc7159bis-03

2017-03-13 13:16:01
On Mon, Mar 13, 2017 at 09:14:16AM +0100, Julian Reschke wrote:
> So the changes in RFC 7159 allow top-level strings, so we can't rely on the
> first *two* characters being US-ASCII. But we *can* rely on the first one
> being US-ASCII, no?

Correct.

If one OR two of the first four bytes are NULs, then the encoding is
UTF-16 (or something else, or the text is invalid):

> So the following should still be correct:
>
>   Since the first character of a JSON text will always be an ASCII
>   character [RFC0020], it is possible to determine whether an octet
>   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
>   at the pattern of nulls in the first four octets.
>
>           00 00 00 xx  UTF-32BE
>           00 xx xx xx  UTF-16BE
>           xx 00 00 00  UTF-32LE
>           xx 00 xx xx  UTF-16LE
>           xx xx xx xx  UTF-8

Count the number of NULs in the first four bytes:

 - if zero -> UTF-8
 - if one or two -> UTF-16
 - if three -> UTF-32
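
A minimal Python sketch of the counting rule, for illustration only. The
function name guess_json_encoding is made up here. Rejecting inputs shorter
than four octets and treating four NULs as an error are assumptions, since
the thread does not cover those cases. Byte order is taken from the first
octet, which matches the table above: the first character is ASCII, so a
leading NUL can only come from a big-endian encoding. The sketch does not
check that the result is actually valid in the guessed encoding.

```python
def guess_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON text from the NULs in its first four octets."""
    if len(data) < 4:
        # Assumption: shorter inputs are out of scope for this sketch.
        raise ValueError("need at least four octets")

    head = data[:4]
    nul_count = head.count(0)

    # The first character of a JSON text is a non-NUL ASCII character,
    # so a NUL in the first octet implies big-endian byte order.
    big_endian = head[0] == 0

    if nul_count == 0:
        return "utf-8"
    if nul_count in (1, 2):
        return "utf-16-be" if big_endian else "utf-16-le"
    if nul_count == 3:
        return "utf-32-be" if big_endian else "utf-32-le"

    # Assumption: four NULs cannot start a JSON text in any of these encodings.
    raise ValueError("four NUL octets: not a JSON text")
```

For example, the top-level string "\u0100" encoded as UTF-16LE starts with
22 00 00 01. That is two NULs with a non-NUL first octet, so the sketch
returns "utf-16-le".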

Nico
--