perl-unicode

Re: should \d match *all* the digits?

1999-08-10 13:41:32
jhi(_at_)iki(_dot_)fi writes:
: Hi, all you Unicoders.
: 
: Daniel Yacob, who has submitted a couple of patches for the Perl Unicode
: support (especially patches related to syllabaries like Ethiopic and
: several Amer-Indian languages), expressed wonderment over the fact
: that currently \d a.k.a. \p{IsDigit} a.k.a. [[:digit:]] does not
: match *all* the digits, that is, not only the 0-9 beckoning
: to us from the ASCII world, but also things like
: 
: 00B2;SUPERSCRIPT TWO;No;0;EN;<super> 0032;2;2;2;N;SUPERSCRIPT DIGIT TWO;;;;
: 00B3;SUPERSCRIPT THREE;No;0;EN;<super> 0033;3;3;3;N;SUPERSCRIPT DIGIT THREE;;;;
: 00B9;SUPERSCRIPT ONE;No;0;EN;<super> 0031;1;1;1;N;SUPERSCRIPT DIGIT ONE;;;;
: ...
: 0966;DEVANAGARI DIGIT ZERO;Nd;0;L;;0;0;0;N;;;;;
: 0967;DEVANAGARI DIGIT ONE;Nd;0;L;;1;1;1;N;;;;;
: 0968;DEVANAGARI DIGIT TWO;Nd;0;L;;2;2;2;N;;;;;
: 0969;DEVANAGARI DIGIT THREE;Nd;0;L;;3;3;3;N;;;;;
: ...
: 1369;ETHIOPIC DIGIT ONE;Nd;0;L;;1;1;1;N;;;;;
: 136A;ETHIOPIC DIGIT TWO;Nd;0;L;;2;2;2;N;;;;;
: 136B;ETHIOPIC DIGIT THREE;Nd;0;L;;3;3;3;N;;;;;
: 136C;ETHIOPIC DIGIT FOUR;Nd;0;L;;4;4;4;N;;;;;
: ...
: 
: What say you?

The intent was that \d match all decimal digits (Nd), but not other
numbers (No), such as superscripts.  The test is basically: can you
do a tr/// on \d+ (mapping each digit to its ASCII counterpart) and
feed the result to atoi() meaningfully?
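
As a rough illustration of that criterion, here is a minimal sketch in
Python rather than Perl, using the standard unicodedata module.  The
helper names is_decimal_run and to_ascii_digits are hypothetical, made
up for this example, and Python's int() stands in for atoi().  It is
not how Perl implements \d; it only shows the Nd-versus-No distinction.

```python
import unicodedata


def is_decimal_run(text):
    """Return True if text is non-empty and every character has category Nd."""
    return bool(text) and all(unicodedata.category(ch) == "Nd" for ch in text)


def to_ascii_digits(text):
    """Map each Nd character to its ASCII digit; leave everything else alone.

    This plays the role of the tr/// step: only decimal digits (Nd) are
    translated, so characters in category No pass through unchanged.
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            out.append(str(unicodedata.decimal(ch)))
        else:
            out.append(ch)
    return "".join(out)


devanagari = "\u0967\u0968\u0969"    # DEVANAGARI DIGIT ONE, TWO, THREE (Nd)
superscripts = "\u00b9\u00b2\u00b3"  # SUPERSCRIPT ONE, TWO, THREE (No)

for label, sample in (("devanagari", devanagari), ("superscripts", superscripts)):
    if is_decimal_run(sample):
        # Safe to translate and hand to the integer parser.
        print(label, "->", int(to_ascii_digits(sample)))
    else:
        print(label, "-> not a run of decimal digits")
```

The Devanagari run translates to "123" and parses as an integer, while
the superscripts are rejected because their general category is No.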

See section 4.6 in the Unicode Standard 2.0 for more details.  I don't
know if this has been modified for 3.0.

Larry
