perl-unicode

Re: Change Request for U+1361 Handling

2003-04-11 16:30:05
Hrrmm.. I think this is a conflict between the multiple contexts that a
punctuation character can appear in and the single rigid category it
gets assigned in the UnicodeData.txt file.


variable names.  If you think the U+1361 should be part of the set
of "word characters" (characters eligible for variable names and the
like), I propose you contact the Unicode consortium.  (After all, that
way the issue would get fixed also for other software, not just Perl.)

Looking into this, the UnicodeData.txt file appears to be out of date
for Ethiopic punctuation with respect to the revisions in the
PropValuesAliases.txt.  So I'll be reporting a number of updates.

What if U+1361 were promoted to 'Pc' (Connector Punctuation)?
This would be the same class as underscore (U+005F).  Line 627 of
unicore/mktables could be updated from:

  $Cat{Word}->$op($code)  if $cat =~ /^[LMN]/ || $code == 0x005F;

to:

  $Cat{Word}->$op($code)  if $cat =~ /^[LMN]/ || $cat eq "Pc";


Then no special exception is made for underscore and there are no double
standards: all connector chars are treated alike.  Is this feasible?
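
To make the difference between the two conditions concrete, here is a
rough standalone sketch. It does not touch mktables itself: it assumes
the core Unicode::UCD module is available and uses its charinfo() to
look up the general category, and the helper names (general_category,
is_word_current, is_word_proposed) are made up for illustration only.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Look up the two-letter General Category (e.g. "Lu", "Pc", "Po")
# for a numeric code point; returns '' if nothing is known about it.
sub general_category {
    my ($code) = @_;
    my $info = charinfo($code) or return '';
    return $info->{category};
}

# Mirrors the existing condition: letters, marks, numbers,
# plus a hard-coded exception for underscore.
sub is_word_current {
    my ($code) = @_;
    my $cat = general_category($code);
    return ($cat =~ /^[LMN]/ || $code == 0x005F) ? 1 : 0;
}

# Mirrors the proposed condition: letters, marks, numbers,
# plus everything categorized as Connector Punctuation.
sub is_word_proposed {
    my ($code) = @_;
    my $cat = general_category($code);
    return ($cat =~ /^[LMN]/ || $cat eq 'Pc') ? 1 : 0;
}

# Compare both rules on a letter, underscore, another connector
# (U+203F UNDERTIE) and the Ethiopic wordspace under discussion.
for my $code (0x0041, 0x005F, 0x203F, 0x1361) {
    printf "U+%04X cat=%-2s current=%d proposed=%d\n",
        $code, general_category($code),
        is_word_current($code), is_word_proposed($code);
}
```

Note that under the proposed rule U+1361 only becomes a word character
once its category in the Unicode data is actually 'Pc'; the rule itself
just stops singling out U+005F and picks up the other connectors.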

/Daniel
