On Dec 28, 2005, at 5:05 AM, Masataka Ohta wrote:
> That problem is that Unicode is stateful with complex and
> indefinitely long term states
Has this ever caused a real problem to a real programmer in real life?
I have written a whole bunch of mission-critical code that reads and
generates UTF-8, and any correct implementation will have to deal
with the fact that there is no necessary connection between the
number of glyphs on the screen and bytes in its encoding. It would
be perfectly reasonable for an implementation to declare a
limitation, for example that it will not process more than 32 trailing
modifiers on any character, and this would not cause problems in
production because sequences of such a length do not occur in the
encoding of any known text.
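
A minimal sketch of such a declared limitation, in Python. The function name, the constant, and the choice to treat any code point in a Unicode "Mark" general category (Mn, Mc, Me) as a "trailing modifier" are all assumptions made for illustration; this is not a full grapheme-cluster segmenter and it is not taken from any particular implementation. It also shows that the UTF-8 byte count and the count of base characters are unrelated quantities.

```python
import unicodedata

# Hypothetical limit, borrowed from the example figure in the text.
MAX_TRAILING_MODIFIERS = 32


def count_base_characters(text, limit=MAX_TRAILING_MODIFIERS):
    """Count non-mark code points in `text`.

    Raises ValueError if any base character is followed by more than
    `limit` consecutive mark code points (categories Mn, Mc, Me).
    This approximates "trailing modifiers"; it is an assumption, not
    the Unicode grapheme cluster algorithm.
    """
    bases = 0
    run = 0  # length of the current run of mark code points
    for ch in text:
        if unicodedata.category(ch).startswith("M"):
            run += 1
            if run > limit:
                raise ValueError(
                    "more than %d trailing modifiers on one character" % limit
                )
        else:
            bases += 1
            run = 0
    return bases


# "e" followed by U+0301 COMBINING ACUTE ACCENT: one base character,
# two code points, three bytes in UTF-8.
sample = "e\u0301"
utf8_length = len(sample.encode("utf-8"))
base_count = count_base_characters(sample)
```

With a check like this the decoder's state is bounded by a single small counter, and input that exceeds the limit is rejected rather than processed.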
Which is to say, Ohta's statement about statefulness is true, but the
conclusion that this is a "problem" is erroneous.
-Tim
Ietf mailing list