Re: [Json] BOMs

2013-11-20 10:49:00
On Tue, Nov 19, 2013 at 4:31 AM, Bjoern Hoehrmann 
<derhoermi(_at_)gmx(_dot_)net> wrote:

* Tatu Saloranta wrote:
Dominant Java implementations support UTF-16 with a BOM, either directly or
through Java's Reader implementations that handle BOMs.
The string concatenation case seems irrelevant, since BOMs are not included in
the in-memory representation anyway, as opposed to the byte stream serialization.

HTTP implementations cannot correctly determine whether an entity body
is text in a single character encoding and, if so, what that encoding is;
accordingly, the dominant API deals in byte[] arrays, not text Strings.
Furthermore, many programming languages default to byte[] arrays for
string literals. That often combines into code of the form

  byte[] json = sprintf('{"x": %s, "y": %s}', GET(...), GET(...));

which works fine if all three byte[] arrays are UTF-8 encoded and use
no Unicode signature, which is the case 99% of the time.
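
As a rough illustration of the remaining cases, here is a minimal Java sketch
of what happens when one of the byte[] fragments does carry a UTF-8 signature.
The class and method names (BomConcatDemo, withBom, buildJson) are hypothetical
and stand in for the sprintf/GET pseudocode above; nothing here is taken from a
real HTTP or JSON library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class BomConcatDemo {
    // UTF-8 encoding of U+FEFF, the Unicode signature (BOM).
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Joins byte fragments verbatim, the way a byte-level sprintf would.
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] p : parts) {
            out.write(p, 0, p.length);
        }
        return out.toByteArray();
    }

    // Hypothetical stand-in for a fetched entity body that starts with a BOM.
    static byte[] withBom(String text) {
        return concat(UTF8_BOM, text.getBytes(StandardCharsets.UTF_8));
    }

    // Builds {"x": <a>, "y": <b>} from raw bytes without inspecting them.
    static byte[] buildJson(byte[] a, byte[] b) {
        return concat(
            "{\"x\": ".getBytes(StandardCharsets.UTF_8), a,
            ", \"y\": ".getBytes(StandardCharsets.UTF_8), b,
            "}".getBytes(StandardCharsets.UTF_8));
    }
}
```

Calling buildJson(withBom("1"), "2".getBytes(StandardCharsets.UTF_8)) leaves the
three signature bytes in the middle of the result, directly before the first
value, where a JSON parser has no reason to expect them.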


My point was just that although it appears that many scripting languages
may not deal with BOMs properly, the same is not true on all platforms. Proper
JSON APIs on the JVM accept both String and byte[] based input; byte[] is
preferred since it is more efficient, and its encoding can be auto-detected
reliably, assuming that (as per the JSON specification) the only encoding
with 8-bit code units in use is UTF-8.
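
To make the auto-detection concrete, here is a minimal sketch of how it could
work on a byte[] input. It checks for a BOM first and then falls back to the
pattern of zero bytes in the first four octets described in RFC 4627,
section 3. The class and method names (JsonEncodingSniffer, detect) are
hypothetical; this is not the code of any particular JSON library.

```java
final class JsonEncodingSniffer {
    enum Encoding { UTF_8, UTF_16BE, UTF_16LE, UTF_32BE, UTF_32LE }

    // Returns the unsigned byte at index i, or -1 if the input is too short.
    static int at(byte[] in, int i) {
        return i < in.length ? in[i] & 0xFF : -1;
    }

    static Encoding detect(byte[] in) {
        int b0 = at(in, 0), b1 = at(in, 1), b2 = at(in, 2), b3 = at(in, 3);

        // 1. Explicit BOM. UTF-32LE is tested before UTF-16LE because both
        //    start with FF FE; a JSON text cannot begin with U+0000, so
        //    FF FE 00 00 is read as UTF-32LE.
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return Encoding.UTF_8;
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0xFE && b3 == 0xFF) return Encoding.UTF_32BE;
        if (b0 == 0xFF && b1 == 0xFE && b2 == 0x00 && b3 == 0x00) return Encoding.UTF_32LE;
        if (b0 == 0xFE && b1 == 0xFF) return Encoding.UTF_16BE;
        if (b0 == 0xFF && b1 == 0xFE) return Encoding.UTF_16LE;

        // 2. No BOM: the first character of a JSON text is ASCII, so the
        //    position of the zero bytes reveals the code unit width and order.
        if (in.length >= 4) {
            if (b0 == 0 && b1 == 0 && b2 == 0 && b3 != 0) return Encoding.UTF_32BE;
            if (b0 != 0 && b1 == 0 && b2 == 0 && b3 == 0) return Encoding.UTF_32LE;
        }
        if (in.length >= 2) {
            if (b0 == 0 && b1 != 0) return Encoding.UTF_16BE;
            if (b0 != 0 && b1 == 0) return Encoding.UTF_16LE;
        }

        // 3. Anything else is assumed to be UTF-8, which is the assumption
        //    that the only encoding with 8-bit code units is UTF-8.
        return Encoding.UTF_8;
    }
}
```

A caller would pass the raw bytes to detect(...), skip the BOM if one was
found, and only then decode, so no String has to be built up front.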

-+ Tatu +-