
Re: Troubles with UTF-8

2006-01-02 00:43:08
On Dec 28, 2005, at 5:05 AM, Masataka Ohta wrote:

> That problem is that Unicode is stateful with complex and
> indefinitely long term states

Has this ever caused a real problem to a real programmer in real life?

I have written a whole bunch of mission-critical code that reads and generates UTF-8. Any correct implementation has to deal with the fact that there is no necessary connection between the number of glyphs on the screen and the number of bytes in their encoding. It would be perfectly reasonable for an implementation to declare a limitation, for example that it will not process more than 32 trailing modifiers on any character. This would not cause problems in production, because sequences of such a length do not occur in the encoding of any known text.
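
As a rough illustration of such a declared limit, here is a minimal Python sketch. It makes some assumptions that are not in the message above: a "trailing modifier" is taken to mean any code point with a nonzero canonical combining class (as reported by the standard `unicodedata.combining`), and the names `MAX_TRAILING_MODIFIERS` and `split_clusters` are made up for the example. Real grapheme segmentation has more rules than this.

```python
import unicodedata

# Hypothetical limit, matching the figure of 32 used in the text.
MAX_TRAILING_MODIFIERS = 32


def split_clusters(text, limit=MAX_TRAILING_MODIFIERS):
    """Group each base character with the combining marks that follow it.

    Simplification: only code points with a nonzero canonical combining
    class count as modifiers. Raises ValueError once a single base
    character would carry more than `limit` of them.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            current = clusters[-1]
            # len(current) - 1 is the number of marks already attached.
            if len(current) - 1 >= limit:
                raise ValueError(
                    "more than %d trailing modifiers on one character" % limit
                )
            clusters[-1] = current + ch
        else:
            # A new base character (or a stray leading mark) starts a cluster.
            clusters.append(ch)
    return clusters


if __name__ == "__main__":
    sample = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT, then 'a'
    print(len(sample.encode("utf-8")), "bytes")
    print(len(sample), "code points")
    print(len(split_clusters(sample)), "clusters")
```

The state carried from one code point to the next is just the current cluster, and the limit bounds how long that state can grow, so input that exceeds it is rejected instead of being buffered without bound.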

Which is to say, Ohta's statement about statefulness is true, but the conclusion that this is a "problem" is erroneous. -Tim



_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf
