ietf

Re: Gen-ART review of draft-ietf-eai-rfc5335bis-12

2011-10-23 11:47:27


--On Sunday, October 23, 2011 07:11 +0100 Dave CROCKER
<dhc(_at_)dcrocker(_dot_)net> wrote:


Remember, in UTF-8, characters can be multiple octets. So 998
UTF-8 encoded *characters* are likely to be many more than
998 octets long. So the change is to say that the limit is in
octets, not in characters.


The switch in vocabulary is clearly subtle for readers.  (I
missed it too.)

I suggest adding some language that highlights the point,
possibly the same language as you just used to explain it.

In addition to what might be useful or necessary for readers of
5335bis, in retrospect, we ought to have a prominent comment in
one of the more generic i18n documents that highlights the fact
that, once one moves beyond ASCII, length-in-characters and
length-in-octets can no longer be assumed to be the same.  When
one is actually talking about storage length,
length-in-characters should be prohibited from our vocabulary
going forward.  That would actually make an interesting
extension to a nits-checker if someone could figure out how to
do it or, at least, a flag to the RFC Editor about something
they should watch out for.
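
To make the character/octet distinction above concrete, here is a
minimal Python sketch.  It is illustrative only: the helper names
(lengths, within_octet_limit) and the constant MAX_LINE_OCTETS are
hypothetical, the 998 figure is simply the limit discussed in this
thread, and Python's len() on a string counts code points, which
is only an approximation of "characters".

```python
# Illustrative sketch: length in characters vs. length in UTF-8 octets.
# All names here are hypothetical; 998 is the limit discussed in the thread.
MAX_LINE_OCTETS = 998


def lengths(line):
    """Return (code points, UTF-8 octets) for a line, CRLF not included."""
    return len(line), len(line.encode("utf-8"))


def within_octet_limit(line, limit=MAX_LINE_OCTETS):
    """True if the UTF-8 encoding of the line fits within `limit` octets."""
    return len(line.encode("utf-8")) <= limit


ascii_line = "a" * 998          # ASCII: one octet per character
latin_line = "\u00e9" * 998     # U+00E9: two octets per character in UTF-8

print(lengths(ascii_line))             # (998, 998)
print(lengths(latin_line))             # (998, 1996)
print(within_octet_limit(ascii_line))  # True
print(within_octet_limit(latin_line))  # False
```

Both lines are 998 "characters" long by the code-point count, but
only the ASCII one fits in 998 octets, which is why a storage
limit needs to be stated in octets.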

    john


_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf