Re: [Json] BOMs

2013-11-21 13:38:30
* John Cowan wrote:
>Bjoern Hoehrmann scripsit:
>
>> Is there any chance, by the way, to change `JSON.stringify` so it does
>> not output strings that cannot be encoded using UTF-8? Specifically,
>>
>>   JSON.stringify(JSON.parse("\"\uD800\""))
>>
>> would need to escape the surrogate instead of emitting it literally.
>
>No, there isn't.  We've been down this road repeatedly.  People can and
>do use JSON strings to encode arbitrary sequences of unsigned 16-bit integers.

The output of `JSON.stringify("\uD800")` contains no backslash character.
If you call `utf8_encode(JSON.stringify("\uD800"))`, you get an exception,
because UTF-8 cannot encode the lone surrogate and `utf8_encode` does
not know that it could be encoded as `\uD800` without loss of
information. If `JSON.stringify` produced an escape sequence instead,
there would be no problem passing the output to `utf8_encode`.
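
To make the point concrete, here is a minimal sketch of the escaping
described above, applied as a post-processing step on the serializer's
output. The names `escapeLoneSurrogates` and `strictUtf8Encode` are
hypothetical; `strictUtf8Encode` is only a stand-in for the `utf8_encode`
mentioned above and uses `encodeURIComponent` to detect lone surrogates.
Engines that implement the later "well-formed `JSON.stringify`" behaviour
of ECMAScript 2019 already emit the escape themselves, in which case the
helper changes nothing.

```javascript
// Hypothetical helper: rewrite every unpaired surrogate code unit in a
// JSON text as a \uXXXX escape. Valid surrogate pairs are left alone.
function escapeLoneSurrogates(jsonText) {
  return jsonText.replace(
    // First alternative consumes a valid pair; second catches lone units.
    /[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDFFF]/g,
    (match) =>
      match.length === 2
        ? match
        : "\\u" + match.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Hypothetical stand-in for a strict utf8_encode: rejects input that
// has no UTF-8 encoding instead of silently substituting U+FFFD.
function strictUtf8Encode(text) {
  encodeURIComponent(text); // throws URIError on a lone surrogate
  return new TextEncoder().encode(text);
}

// The example from this thread: parse a lone surrogate, serialize it,
// escape it, then encode the result.
const safeText = escapeLoneSurrogates(JSON.stringify(JSON.parse('"\\uD800"')));
const bytes = strictUtf8Encode(safeText); // does not throw
```

Because the escaped text parses back to the same sequence of 16-bit code
units, nothing is lost for people who use JSON strings to carry arbitrary
16-bit values.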
-- 
Björn Höhrmann · mailto:bjoern(_at_)hoehrmann(_dot_)de · 
http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
