perl-unicode

Re: the dangers of Unicode

2000-10-23 23:52:43


Jarkko Hietaniemi wrote:

On Mon, Oct 23, 2000 at 10:02:44AM -0700, 
bstell(_at_)ix(_dot_)netcom(_dot_)com wrote:
The bottom line of this argument is that we should only
support ASCII (read: English) or the security code
will be harder to write.

I wouldn't read the article in such a desperate tone.  All it says is
that Unicode is much, much harder to get right than ASCII because
Unicode is much more complex, and even with decades of practice with
the latter, people still don't get it right.  So we should tread
carefully and be extra paranoid.

I agree with your statement that we will have new security issues 
with Unicode and will have to exercise care.

The article at http://www.counterpane.com/crypto-gram-0007.html#9,
however, is clearly a "Chicken Little, 'the sky is falling'" argument
and should be addressed as such. Let me quote the final statement in
that section, as it conveys the general tone of the section:

        "Unicode is just too complex to ever be secure." 

Unicode programming is actually simpler than trying to handle all the
various encodings that are currently used in the world.

We have clear choices: 

    1) Stick with ASCII (English), which is clearly the easiest, and just
       ignore the rest of the world.

    2) Make Perl work for other languages around the world
       2.1) Use pre-existing encodings. There are a lot of different
            encodings currently in use. Trying to support even a fair 
            subset of them is a mess.  If you've never done this I 
            hope you never have to. Having done this at Netscape on and
            off for a while I am familiar with the difficulties.

       2.2) Use Unicode, which provides the least difficult path
            to multi-language support. Java, JavaScript (ECMAScript),
            Internet Explorer, and Netscape all use Unicode for exactly
            that reason. If they just needed to support English they
            would (and when they started out they did) use just ASCII.
            ASCII-only code executes much faster and is easier to
            program if all you want is English.


If someone has other constructive suggestions about the best way to
help Perl grow with the web and support more languages, then please
speak up.

Brian Stell
