Re: Implementing encoded-character


Aaron Stone wrote:

It's not strict yet (I'll cross that bridge when we agree on where it is ;-);
it just translates the hex values to UTF-8. And now, counting from 0 to 9 in
Western Arabic and Eastern Arabic, and from 1 to 10 in Amharic, which has no
zero digit (thanks, unicode.org/charts!)...

Converting [${unicode:30 31 32 33 34 35 36 37 38 39}]
       to [0123456789] length 11

Converting [${unicode:06f0 06f1 06f2 06f3 06f4 06f5 06f6 06f7 06f8 06f9}]
       to [۰۱۲۳۴۵۶۷۸۹] length 21

Converting [${unicode:1369 136a 136b 136c 136d 136e 136f 1370 1371 1372}]
       to [፩፪፫፬፭፮፯፰፱፲] length 31
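
For anyone following along, here is a minimal Python sketch of the same
translation. It is not the code described in this message; the name
expand_unicode and the regular expression are assumptions made for
illustration, and like the original it does no strict validation.

```python
import re

# Hypothetical pattern: ${unicode:...} holding whitespace-separated hex values.
_ENCODED = re.compile(r"\$\{unicode:([0-9A-Fa-f\s]+)\}", re.IGNORECASE)


def expand_unicode(text: str) -> str:
    """Replace each ${unicode:...} sequence with the characters it names."""
    def _substitute(match: re.Match) -> str:
        # Each hex value is one code point; no range or surrogate checks here.
        return "".join(chr(int(value, 16)) for value in match.group(1).split())

    return _ENCODED.sub(_substitute, text)


samples = [
    "${unicode:30 31 32 33 34 35 36 37 38 39}",
    "${unicode:06f0 06f1 06f2 06f3 06f4 06f5 06f6 06f7 06f8 06f9}",
    "${unicode:1369 136a 136b 136c 136d 136e 136f 1370 1371 1372}",
]

for sample in samples:
    expanded = expand_unicode(sample)
    # Print the expansion and its size in UTF-8 bytes.
    print(expanded, len(expanded.encode("utf-8")))
```

The byte counts come out as 10, 20 and 30. The lengths reported above are one
higher in each case, which may reflect a counted terminator, though that is
only a guess.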

(Are there any number systems up in the four-bytes-per-symbol ranges?)

Yes:

10A40..10A43  ; Digit   # No   [4] KHAROSHTHI DIGIT ONE..KHAROSHTHI DIGIT FOUR
104A0..104A9  ; Decimal # Nd  [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE

If anybody would like to use my code, I'd be happy to make it available
without restriction. It's all of 100 lines, and most of the fun was
generating UTF-8 by hand.
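
As an illustration of what generating UTF-8 by hand involves, here is a short
Python sketch of the standard bit layout for one to four bytes. It is not the
100-line implementation offered above; the function name utf8_encode is an
assumption, and surrogate code points are deliberately not rejected, in keeping
with "not strict yet".

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point as UTF-8 using explicit shifts and masks."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point out of range")
    if code_point < 0x80:
        # One byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One digit from each range: ASCII, U+06F0, U+1369, and OSMANYA DIGIT ZERO.
for cp in (0x30, 0x06F0, 0x1369, 0x104A0):
    encoded = utf8_encode(cp)
    print(f"U+{cp:04X} -> {encoded.hex(' ')} ({len(encoded)} bytes)")
```

The last value, U+104A0, lies above U+FFFF and so takes the four-byte form.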

BTW, I've just implemented encoded-character myself.
