perl-unicode

please break bleadperl

2001-05-02 06:55:03
        For those not up to speed on their perl5-porters jargon,
        'bleadperl' is the latest "bleeding edge" developer perl.
        You can garner the most recent developer snapshots from
        ftp://ftp.funet.fi/pub/languages/perl/snap/
        They should Configure and compile cleanly, at least on
        Unix platforms.  Configure with -Dusedevel to get past
        a sanity check.  Do not install them for production use.

I would like to think that Perl 5.8 is going to deliver the Unicode
promise for Perl.

Now, I know that many things Unicode still do not exist or are not
working and are unlikely to appear or fix themselves unless someone
starts working on them.  Such Unicode things include normalization and
collation, and I'm pretty certain case mappings are not implemented,
either (use charnames ':full'; uc("\N{LATIN SMALL LETTER SHARP S}") eq "SS";
that kind of thing).  Please don't complain about those unless you
come bearing patches :-)  As for the Unicode Character Database, we
are at the Unicode 3.1 level.
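
A minimal probe for the case mapping mentioned above might look like
the sketch below.  The second check (Greek alpha) is my own addition
for comparison, not something from the original report, and the script
assumes a perl where charnames is available.  It only prints ok/not ok,
so it makes no claim about what a given snapshot will actually do.

```perl
use strict;
use warnings;
use charnames ':full';

# The case from the text: the uppercase mapping of sharp s is the
# two letters "SS", which needs a one-to-many case mapping.
my $sharp_s = "\N{LATIN SMALL LETTER SHARP S}";
my $upper   = uc($sharp_s);

print(($upper eq "SS" ? "ok" : "not ok"), " - uc(sharp s) eq \"SS\"\n");

# Assumption for comparison: a simple one-to-one mapping, probed the same way.
my $alpha       = "\N{GREEK SMALL LETTER ALPHA}";
my $upper_alpha = uc($alpha);
print(($upper_alpha eq "\N{GREEK CAPITAL LETTER ALPHA}" ? "ok" : "not ok"),
      " - uc(alpha) eq capital alpha\n");
```

A "not ok" line from either check is the kind of thing worth a perlbug
report, ideally with a patch attached.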

        Note also that in case you have issues with the current
        Unicode model (which does expose the internal encoding,
        that is, UTF-8, "use bytes", and all that), I'm sorry, but
        we are not going to rethink that before Perl 5.8.
        Firstly, the current model is what Larry wants, and what
        the Camel 3 says.  Deeper issues about the Unicode model
        belong to the realm of Perl 6.  Secondly, we don't have the
        time to rip everything apart before I want to release Perl 5.8,
        which is this summer.  Thirdly, we did have a rather painful
        discussion/review of the current model earlier this year, and
        I would rather not open that can of worms for a while.

But other than those, I have this illusion that we are in pretty good
Unicode shape in the developer snapshots.  Please prove me wrong.

Now is your chance to shake out Unicode problems so that Perl 5.8
will just work with Unicode.

Try to break the Unicode handling.  Mix 'byte' data and 'character'
(Unicode) data.  Use functions and operators, especially regex
operators, on Unicode data.  (On the I/O side, you may or may not be
familiar with the new Encode and PerlIO modules: with PerlIO you can do
things like encoding conversions on the fly while doing I/O, and with
Encode you can do such conversions without doing I/O.)  Try to find
spots where some obvious functionality is missing.  Try to find bad or
missing documentation.  Send perlbug reports.  As usual, patches
(sent to perl5-porters) are welcome.
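
As a starting point for that kind of poking, here is a rough sketch
that mixes byte and character data, runs a regex over character data,
and does the same conversion once with Encode and once through a PerlIO
layer.  It assumes the encode()/decode() functions and the
:encoding(...) layer as they appear in released perls; the interfaces
in any given snapshot may differ.  The sample string and the in-memory
filehandle are my own choices for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Character data: U+263A is above 0xFF, so this cannot be plain bytes.
my $chars = "caf\x{E9} \x{263A}";
# Byte data: the same text as UTF-8 octets, produced without any I/O.
my $bytes = encode("UTF-8", $chars);

printf "characters: %d, bytes: %d\n", length($chars), length($bytes);

# A round trip through Encode should give back the original string.
my $round_trip = decode("UTF-8", $bytes);
print "round trip: ", ($round_trip eq $chars ? "ok" : "not ok"), "\n";

# Regex operators on character data: \w should accept U+00E9.
my ($word) = $chars =~ /^(\w+)/;
print "leading word length: ", length($word), "\n";

# Mixing: concatenation treats each byte as one character (Latin-1),
# so the result is neither the original text nor its encoding.
my $mixed = $bytes . $chars;
print "mixed length: ", length($mixed), "\n";

# PerlIO: an :encoding layer converts on the fly while writing.
# An in-memory filehandle keeps the example free of temporary files.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open failed: $!";
print {$out} $chars;
close($out) or die "close failed: $!";
print "layer output matches encode(): ",
      ($buffer eq $bytes ? "ok" : "not ok"), "\n";
```

Swapping in other encodings, other regex constructs (character
classes, case-insensitive matches, tr///), or a read layer instead of a
write layer gives plenty of further combinations to try.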

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
