Aaron Stone wrote:
It's not strict yet (I'll cross that bridge when we agree on where it is ;-),
it just translates the hex values to UTF-8. And now, counting from 0-9 in
Western Arabic and Eastern Arabic, and from 1-10 in Amharic (Ethiopic has no
zero digit; thanks unicode.org/charts!)...
Converting [${unicode:30 31 32 33 34 35 36 37 38 39}]
to [0123456789] length 11
Converting [${unicode:06f0 06f1 06f2 06f3 06f4 06f5 06f6 06f7 06f8 06f9}]
to [۰۱۲۳۴۵۶۷۸۹] length 21
Converting [${unicode:1369 136a 136b 136c 136d 136e 136f 1370 1371 1372}]
to [፩፪፫፬፭፮፯፰፱፲] length 31
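For anyone following along, here is a minimal sketch of that kind of non-strict expansion. It is in Python purely for illustration (the post does not say what language the original code is in), and the names utf8_encode and expand_unicode are made up here, not taken from that code. It only handles an argument that consists of a single ${unicode:...} sequence.

```python
import re

def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand (1 to 4 bytes)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x80:        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

# Hypothetical pattern for a whole-argument ${unicode:...} sequence.
UNICODE_SEQ = re.compile(r"\$\{unicode:([0-9A-Fa-f ]+)\}", re.IGNORECASE)

def expand_unicode(arg: str) -> bytes:
    """Translate the hex values to UTF-8; anything else passes through."""
    m = UNICODE_SEQ.fullmatch(arg)
    if m is None:
        return arg.encode("utf-8")
    return b"".join(utf8_encode(int(h, 16)) for h in m.group(1).split())

for arg in ("${unicode:30 31 32 33 34 35 36 37 38 39}",
            "${unicode:06f0 06f1 06f2 06f3 06f4 06f5 06f6 06f7 06f8 06f9}",
            "${unicode:1369 136a 136b 136c 136d 136e 136f 1370 1371 1372}"):
    out = expand_unicode(arg)
    print(f"Converting [{arg}]\nto [{out.decode('utf-8')}] length {len(out)}")
```

This reports lengths of 10, 20 and 30 bytes. The 11, 21 and 31 in the output above would fit if that count includes a terminating NUL, but that is a guess on my part.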
(Are there any number systems up in the four-bytes-per-symbol ranges?
Yes:
10A40..10A43 ; Digit # No [4] KHAROSHTHI DIGIT ONE..KHAROSHTHI DIGIT FOUR
104A0..104A9 ; Decimal # Nd [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
)
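A quick way to confirm that those two ranges really do land in four-byte territory, sketched in Python with the standard unicodedata module. The describe helper is a made-up name for this example, and it leans on Python's built-in encoder rather than anything hand-rolled.

```python
import unicodedata

# The two ranges quoted above: (first, last) code points, inclusive.
RANGES = [(0x10A40, 0x10A43), (0x104A0, 0x104A9)]

def describe(cp: int) -> tuple:
    """Return (character name, general category, UTF-8 byte count)."""
    ch = chr(cp)
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            len(ch.encode("utf-8")))

for first, last in RANGES:
    for cp in range(first, last + 1):
        name, category, nbytes = describe(cp)
        print(f"{cp:05X} {category} {nbytes} bytes  {name}")
```

Anything at or above U+10000 takes four bytes in UTF-8, so every line printed should say 4 bytes.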
If anybody would like to use my code, I'd be happy to make it available
without restriction. It's all of 100 lines, and most of the fun was
generating UTF-8 by hand.
BTW, I've just implemented encoded-character myself.