perl-unicode

Some Persian encodings unsupported in Encode module

2003-07-25 06:30:06
This may already be well known, but I happened to come across "CSets: Supplemental Unicode Mapping Tables":
        <http://crl.nmsu.edu/~mleisher/csets.html>
There, among others, you'll find mapping tables for Iran System, IRNA, ISIRI 2900 and ISIRI 3342, none of which are supported in Encode 1.97, seemingly due to lack of documentation.

While ISIRI 3342 is a logical encoding, Iran System, IRNA and ISIRI 2900 are visual ones, so converting into them would need some algorithm (to pick the contextual form of each letter and to put the text into display order).

In converting from them, I think the isolated, initial, medial and final forms should each be replaced with the basic character, omitting the U+200C and U+200D that appear in the mappings. For example, in Iran System, both

        0x90    0x200C 0x0627   # ARABIC LETTER ALEF, isolated form
        0x91    0x200D 0x0627   # ARABIC LETTER ALEF, final form

should be converted into just U+0627. Moreover, I guess it would be necessary to reverse the order of the characters within each logical line.
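
To make the idea concrete, here is a rough sketch of the decoding direction only, written in Python for brevity rather than as an Encode patch. It assumes the mapping files use the line format quoted above (byte, one or more code points, optional "#" comment), which I have not verified for every table. The names parse_table, decode_visual_line and decode_visual are made up for illustration, and unmapped bytes simply become U+FFFD. Reversing the whole line is the naive approach; runs of digits or Latin text would probably need to be kept in their original order.

```python
ZWNJ, ZWJ = "\u200c", "\u200d"

# Only the two Iran System entries quoted in this post; a full table
# would have to be loaded from the CSets mapping file.
SAMPLE = """
0x90    0x200C 0x0627   # ARABIC LETTER ALEF, isolated form
0x91    0x200D 0x0627   # ARABIC LETTER ALEF, final form
"""

def parse_table(text):
    """Parse 'byte codepoint [codepoint ...] # comment' lines into a dict."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the comment
        if not line:
            continue
        fields = line.split()
        byte = int(fields[0], 16)
        table[byte] = "".join(chr(int(f, 16)) for f in fields[1:])
    return table

def decode_visual_line(raw, table):
    """Decode one line: map bytes, drop the joiners, reverse the order."""
    chars = []
    for byte in raw:
        mapped = table.get(byte, "\ufffd")  # assumption: unmapped -> U+FFFD
        chars.append(mapped.replace(ZWNJ, "").replace(ZWJ, ""))
    chars.reverse()  # naive visual-to-logical reordering
    return "".join(chars)

def decode_visual(raw, table):
    """Decode a byte string line by line, keeping the line breaks."""
    return "\n".join(decode_visual_line(line, table)
                     for line in raw.split(b"\n"))

if __name__ == "__main__":
    table = parse_table(SAMPLE)
    # Both presentation forms collapse to plain U+0627.
    print([hex(ord(c)) for c in decode_visual(b"\x90\x91", table)])
```

Going the other way (logical to visual) is the harder part, since the right form of each letter has to be chosen from its neighbours, and this sketch does not attempt it.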


Kino
