Yes, -11 is fine wrt my review:
http://www.ietf.org/mail-archive/web/gen-art/current/msg11048.html
Thanks,
--David
-----Original Message-----
From: Jari Arkko [mailto:jari(_dot_)arkko(_at_)piuha(_dot_)net]
Sent: Thursday, December 18, 2014 8:45 AM
To: Patrik Fältström; Black, David
Cc: John Cowan; ops-dir(_at_)ietf(_dot_)org; ietf(_at_)ietf(_dot_)org; Paul Hoffman; Manger, James; General Area Review Team (gen-art(_at_)ietf(_dot_)org)
Subject: Re: [Json] Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-10
David, thank you for the review!
My understanding of this thread and the -11 is that we are done with respect
to the modifications coming out of your review. Let me know otherwise.
Thanks, all.
Jari
On 13 Dec 2014, at 02:02, Patrik Fältström <paf(_at_)frobbit(_dot_)se> wrote:
On 12 dec 2014, at 02:12, John Cowan <cowan(_at_)mercury(_dot_)ccil(_dot_)org> wrote:
Manger, James scripsit:
How about:
"A JSON text sequence consists of any number of JSON texts,
each prefixed by a Record Separator (U+001E) character, and
each suffixed by an End of Line (U+000A) character. It is
UTF-8 encoded."
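To make the proposed wording concrete, here is a minimal Python sketch of a writer that follows it. The function name `encode_json_text_sequence` and the constants are hypothetical and chosen for illustration; they do not come from the draft.

```python
import json

RS = b"\x1e"  # U+001E, written before each JSON text
LF = b"\x0a"  # U+000A, written after each JSON text


def encode_json_text_sequence(values):
    """Return the UTF-8 bytes of a JSON text sequence for the given values."""
    out = bytearray()
    for value in values:
        # json.dumps escapes control characters inside strings, so the
        # serialized text itself contains no raw 0x1E or 0x0A byte.
        text = json.dumps(value, ensure_ascii=False)
        out += RS + text.encode("utf-8") + LF
    return bytes(out)
```

An empty input yields an empty byte string, which matches "any number of JSON texts" including zero.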
Say "Information Separator Two (U+001E)" if you really want to be pure.
The trouble with that is that U+001E has no official Unicode name or
function; those come from ISO 6429, which is incorporated (in relevant
part) into US-ASCII, which is described in RFC 20.
Although it does not have a Unicode Name, the alias is as close as we can
get, which is "INFORMATION SEPARATOR TWO":
# grep ^001E UnicodeData.txt
001E;<control>;Cc;0;B;;;;;N;INFORMATION SEPARATOR TWO;;;;
#
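For readers who want to see which field that is, the following Python sketch splits the line printed by grep on semicolons. The variable names are illustrative. The reading of index 1 as the Name field (here only the `<control>` placeholder) and index 10 as the Unicode 1.0 name follows the UnicodeData.txt field layout as I understand it from UAX #44.

```python
# The line printed by the grep command above.
line = "001E;<control>;Cc;0;B;;;;;N;INFORMATION SEPARATOR TWO;;;;"

fields = line.split(";")

code_point = int(fields[0], 16)  # 0x1E
name = fields[1]                 # "<control>": a placeholder, not a real name
general_category = fields[2]     # "Cc": control character
unicode_1_name = fields[10]      # the old Unicode 1.0 name, used here as the alias

print(f"U+{code_point:04X} name={name!r} alias={unicode_1_name!r}")
```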
So I suggest using that.
It is, I think, wrong to say "Record Separator" and then still reference the
Unicode tables.
Alternatively, one could just write the following (which makes it clearer how
this works, and is my understanding):
A JSON text sequence consists of any number of JSON texts, each prefixed by a
U+001E character and each suffixed by a U+000A character. The JSON texts, as
well as the whole JSON text sequence, are encoded in UTF-8, although any JSON
text might be truncated and for that reason not be a valid UTF-8 sequence. Any
occurrence of the UTF-8 encoding of U+001E (the byte 0x1E) is to be viewed as
the first byte before a JSON text, and any occurrence of the byte 0x0A is to be
viewed as the first byte after a complete JSON text. If the JSON text is
truncated, the 0x0A byte will not be present.
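The reading rule described above can be sketched in Python as follows. This is only an illustration under one stated assumption: the writer never emits a raw 0x0A byte inside a JSON text, so a trailing 0x0A reliably marks a complete text. The function name `split_json_text_sequence` is hypothetical.

```python
RS = b"\x1e"  # byte that precedes every JSON text
LF = b"\x0a"  # byte that follows every complete JSON text


def split_json_text_sequence(data: bytes):
    """Yield (payload, complete) pairs for each 0x1E-prefixed chunk in data.

    Assumption: no raw 0x0A occurs inside a JSON text, so a chunk is
    treated as complete exactly when it ends with 0x0A.
    """
    chunks = data.split(RS)
    # Anything before the first 0x1E belongs to no JSON text; skip it.
    for chunk in chunks[1:]:
        complete = chunk.endswith(LF)
        payload = chunk[:-1] if complete else chunk
        yield payload, complete
```

A caller would hand only the payloads flagged as complete to a JSON parser and log or discard the rest.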
I.e. the grammar is sort of (before coffee in the morning):
sequence := 0x1E text
text := complete-text | truncated-text
complete-text := proper-UTF8 0x0A
truncated-text := proper-UTF8 broken-UTF8
proper-UTF8 := "" | "a sequence of bytes, possible to parse as a series of UTF8 encoded Unicode characters"
broken-UTF8 := "a sequence of bytes not possible to parse as a UTF8 encoded Unicode character"
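As a rough illustration of the proper-UTF8 / broken-UTF8 split in the grammar above, the Python sketch below finds the longest prefix of a payload that decodes as UTF-8 and treats the remainder as the broken part. The function name `split_proper_and_broken` is hypothetical; the sketch relies on the `start` attribute of `UnicodeDecodeError`, which gives the index of the first byte that could not be decoded.

```python
def split_proper_and_broken(payload: bytes):
    """Split payload into (proper_utf8, broken_utf8) parts.

    proper_utf8 is the longest prefix that decodes as UTF-8;
    broken_utf8 is whatever follows the first undecodable byte.
    """
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that failed to decode.
        return payload[:err.start], payload[err.start:]
    # The whole payload is valid UTF-8, so nothing is broken.
    return payload, b""
```

Note that a text truncated exactly on a character boundary yields an empty broken part, a case the grammar as written does not quite cover.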
Patrik