On Mon, Dec 08, 2014 at 11:20:25PM -0500, John Cowan wrote:
> Patrik Fältström scripsit:
> > This implies the whole thing is a UTF-8 encoded text that is to be
> > parsed like this:
> No, this is a misunderstanding. There is no requirement that the sequence
> *as a whole* is well-formed UTF-8 text. For example, if the first JSON
> text is written only in part for whatever reason (system crash, etc.) to
> a log file, the next process can write a 0x1E byte and carry on.
Correct. But we're splitting hairs: a parser may attempt to parse any
octet string that does not contain 0x1E, though it should accept only
those that are valid JSON texts encoded in UTF-8.
The encoder, of course, must produce UTF-8.
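
To make that concrete, here is a rough Python sketch of the behaviour
described above. It is only an illustration under my own assumptions, not
text from the draft: the names encode_seq and parse_seq are made up, and
terminating each text with a newline is a choice made for this example.

```python
import json

RS = b"\x1e"  # the 0x1E separator byte discussed above


def encode_seq(values):
    """Encode each value as UTF-8 JSON, each one preceded by 0x1E."""
    out = bytearray()
    for value in values:
        # The trailing newline is an assumption of this sketch.
        out += RS + json.dumps(value, ensure_ascii=False).encode("utf-8") + b"\n"
    return bytes(out)


def parse_seq(data):
    """Split on 0x1E and keep only chunks that are valid UTF-8 JSON texts."""
    results = []
    for chunk in data.split(RS):
        if not chunk.strip():
            continue  # nothing between two separators, or before the first
        try:
            results.append(json.loads(chunk.decode("utf-8")))
        except (UnicodeDecodeError, ValueError):
            # Not valid UTF-8 or not a valid JSON text (e.g. a partial
            # write): reject this chunk and carry on with the next one.
            continue
    return results


# A log where the second text was cut short before the next writer
# emitted 0x1E and carried on.
log = b'\x1e{"a": 1}\n\x1e{"b": \x1e{"c": 3}\n'
print(parse_seq(log))
```

The sequence as a whole is never decoded as UTF-8 here; only the
individual chunks are. Note that this sketch makes no attempt to detect a
truncated bare number (say, 12 left over from 123), which still parses as
valid JSON.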
Nico