RE: [Json] Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-10

2014-12-11 16:51:39
Abstract:

  This document describes the JSON text sequence format and associated
  media type, "application/json-seq".  A JSON text sequence consists of
  any number of JSON texts, each prefix by an Record Separator
  (U+001E), and each ending with a newline character (U+000A).

"any number of JSON texts" -> "any number of UTF-8 encoded JSON texts"

This change concerns me, because it sounds like a JSON text sequence could 
consist of JSON texts encoded in UTF-8 and other encodings. I would instead 
prefer "any number of JSON texts, all encoded in UTF-8,".

It also looks like ASCII names for RS and LF are being mixed w/Unicode 
codepoints in the second sentence in the abstract.  I'm not sure 
that's a good thing to do, especially as the body of the draft refers 
to RS and LF as being ASCII.  Here are a couple of changes that would remedy 
this:

  "an Record Separator (U+001E)" -> "an ASCII Record Separator (0x1E)"
  "a newline character (U+000A)" -> "an ASCII newline character (0x0A)"

With John Cowan's change ("an ASCII Line Feed character (0x0A)" instead of "an 
ASCII newline character (0x0A)"), that would indeed be clearer.


Please no. That would give an even worse mix of UTF-8 and ASCII, bytes and 
characters, in a single sentence.

  ".. any number of JSON texts, all encoded in UTF-8, each prefixed by an ASCII 
Record Separator (0x1E) .."

How about:

  "A JSON text sequence consists of any number of JSON texts,
   each prefixed by a Record Separator (U+001E) character, and
   each suffixed by an End of Line (U+000A) character. It is
   UTF-8 encoded."

Say "Information Separator Two (U+001E)" if you really want to be pure.

Mention in the body that "Record Separator" and "Information Separator Two" are 
the ASCII and Unicode names for the same character (as are "Line Feed" and "End 
of Line"), which is why RS and LF are used as ABNF names.

P.S. The spec still defines the same ABNF names twice (RS, JSON-sequence): once 
as bytes; once as Unicode scalars. Yuck. Just give them different names.

--
James Manger

