Re: 10646, UTF-2, etc.

> > ... "the hard part is making the code understand that octets
> > and characters are not synonymous".  Once that is done, the details --
> > how the two are related, whether a character is 16 or 32 bits, etc. --
> > are very much secondary, particularly if libraries etc. are designed
> > to hide the implementation details properly.
>
> Untrue. 16 bit or 32 bit are not implementation details.
> With 16 bit wchar_t, you can write
>       array[(unsigned)char_code]


Not on my pdp11, you can't!

But in any case, this misses the point.  You can always write code that
depends on details, if you try.  The point is that if you make a modest
effort to write clean code, then 16 bits vs 32 bits *is* a detail, one
that can be changed by changing a header file and recompiling.  Avoiding
a few inappropriate practices -- such as assuming that a particular
datatype is small enough to be usable as an array index -- suffices to
make code independent of this particular detail.
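
A minimal sketch of the kind of code meant here, offered as an
illustration only.  The names xchar_t, char_table, table_add and
table_count are hypothetical and appear nowhere in the discussion; the
typedef stands in for whatever a project's own header would define.
The table is searched by character code rather than indexed by it, so
its size does not depend on how many values the character type can hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical project header: the only place that knows the width.
   Change uint16_t to uint32_t and recompile; nothing below changes. */
typedef uint16_t xchar_t;

#define MAX_DISTINCT 64   /* arbitrary capacity chosen for the sketch */

struct char_count {
    xchar_t  code;        /* the character being counted */
    unsigned count;       /* how many times it has been seen */
};

struct char_table {
    struct char_count slot[MAX_DISTINCT];
    size_t used;          /* number of slots currently in use */
};

/* Record one occurrence of c.  The slot is found by linear search,
   not by using the character code as an array index.
   Returns 0 on success, -1 if the table is full. */
static int table_add(struct char_table *t, xchar_t c)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->used; i++) {
        if (t->slot[i].code == c) {
            t->slot[i].count++;
            return 0;
        }
    }
    if (t->used == MAX_DISTINCT)
        return -1;
    t->slot[t->used].code = c;
    t->slot[t->used].count = 1;
    t->used++;
    return 0;
}

/* Return how many times c has been recorded, or 0 if never seen. */
static unsigned table_count(const struct char_table *t, xchar_t c)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->used; i++) {
        if (t->slot[i].code == c)
            return t->slot[i].count;
    }
    return 0;
}
```

A linear search is used only to keep the sketch short; a hash table or
sorted array would serve the same purpose for larger inputs.  Either
way, callers see only xchar_t and the two functions, so the width of a
character stays confined to the one typedef.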

                                         Henry Spencer at U of Toronto Zoology
                                          
henry(_at_)zoo(_dot_)toronto(_dot_)edu   utzoo!henry
