Bruce> I'm still not sure that FreeType is fully handling some of the TTF
Bruce> Unicode. I know that ttf2bdf for platform 3, encoding 1
Bruce> (Windows/Unicode) is generating some grungy fonts at times, which
Bruce> has made me cautious about installing a FreeType enabled
Bruce> fontserver. I'm still not sure of Mac/Unicode in-font tabling, but
Bruce> most Fontographer fonts floating around don't have that.
There was a recent version that had some fairly serious rendering bugs. Those
have been fixed for at least three months now, and a new version is going to be
released in the near future. The most recent distribution is always available
at http://www.freetype.org.
Bruce> Also, with some (e.g. Indic) languages, a common entry method is
Bruce> romanized syllables. This type of encoding should also be parsable
Bruce> for autoconversion to Unicode, as it is to the many encodings
Bruce> supported by "itrans", and its successor (in development)
Bruce> "iscript".
This would be *very* nice to have, but it will take a sophisticated system to
handle the wide variety of encodings out there. A good example is my
converter from Naidunia Devanagari to UCS-2, written in Perl:
http://crl.nmsu.edu/~mleisher/nai.html
This sort of thing pretty much takes a programming language and a bit of a
priori knowledge about the documents being converted. Other encodings such as
VIQR (Vietnamese), the old N-byte Hangul, and the various Arabic and Persian
font encodings need their own kind of special handling as well.
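To illustrate why a plain lookup table is usually not enough, here is a
minimal Python sketch of a font-encoding-to-Unicode converter. The byte values
in the table are made up for illustration and do not correspond to Naidunia or
any other real font; the function names are likewise hypothetical. The a
priori knowledge in this case is that Devanagari fonts place the short "i"
vowel sign to the left of its consonant, while Unicode stores it after the
consonant, so the converter has to reorder as well as remap.

```python
# Hypothetical glyph table: these byte values are invented for the example
# and do not describe any real Devanagari font encoding.
GLYPH_TO_UNICODE = {
    0x6B: "\u0915",  # DEVANAGARI LETTER KA
    0x6D: "\u092E",  # DEVANAGARI LETTER MA
    0x61: "\u093E",  # DEVANAGARI VOWEL SIGN AA
    0x66: "\u093F",  # DEVANAGARI VOWEL SIGN I (drawn before the consonant)
}

# Vowel signs that appear before the consonant in visual (font) order
# but follow it in Unicode logical order.
PREBASE_SIGNS = {"\u093F"}


def is_consonant(ch: str) -> bool:
    """True for the Devanagari consonants KA..HA (U+0915..U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def convert(data: bytes) -> str:
    """Remap font bytes to Unicode and move pre-base signs after their consonant."""
    out = []
    pending = None  # a pre-base sign waiting for the consonant it belongs to
    for b in data:
        ch = GLYPH_TO_UNICODE.get(b)
        if ch is None:
            # Unmapped ASCII passes through; anything else becomes U+FFFD.
            ch = chr(b) if b < 0x80 else "\uFFFD"
        if ch in PREBASE_SIGNS:
            if pending is not None:
                out.append(pending)  # stray sign with no consonant: keep as is
            pending = ch
        elif pending is not None and is_consonant(ch):
            out.append(ch)
            out.append(pending)
            pending = None
        else:
            if pending is not None:
                out.append(pending)
                pending = None
            out.append(ch)
    if pending is not None:
        out.append(pending)
    return "".join(out)


def to_ucs2(text: str) -> bytes:
    """Big-endian 16-bit output; identical to UCS-2 for BMP-only text."""
    return text.encode("utf-16-be")
```

A real converter needs many more rules of this kind (conjuncts, repha, half
forms), which is where the programming language comes in.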
Bruce> At the same time, TSCII is growing pretty fast for Malaysia, but
Bruce> "TAB" encoding has been endorsed by the Malaysian government as the
Bruce> standard Tamil encoding - RATHER than the assigned Unicode block.
Bruce> I haven't looked (yet) to see if this is merely a truncation to the
Bruce> low-order byte of the Unicode or what.
I haven't heard about these in a while. Do you have pointers you can send?
Also, I lost all my pointers to the RIT encoding for Telugu. If you happen to
have any of those, it would be greatly appreciated!
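On the truncation question: the Unicode Tamil block is U+0B80..U+0BFF, so a
straight low-order-byte truncation would put every Tamil character at byte
0x80..0xFF with the same low byte as its code point. A quick way to test that
guess against a sample file might look like the following Python sketch. The
function names are made up for illustration, and I have not checked this
against TAB or TSCII, so treat it only as a way to test the hypothesis.

```python
import unicodedata

TAMIL_HIGH_BYTE = 0x0B00  # the Tamil block is U+0B80..U+0BFF


def low_byte_to_tamil(b: int) -> str:
    """Rebuild a code point on the assumption that only the low byte was kept."""
    if not 0x80 <= b <= 0xFF:
        raise ValueError("byte outside 0x80..0xFF cannot be a truncated Tamil code point")
    return chr(TAMIL_HIGH_BYTE | b)


def decode_if_truncated(data: bytes) -> str:
    """Decode data under the truncation hypothesis; ASCII bytes pass through."""
    return "".join(low_byte_to_tamil(b) if b >= 0x80 else chr(b) for b in data)


def describe(data: bytes) -> list:
    """Character names for eyeballing whether the decoded result is plausible."""
    return [unicodedata.name(ch, "<unassigned>") for ch in decode_if_truncated(data)]
```

If the names that come back for a known sample are mostly unassigned or make
no sense as text, the encoding is not a simple truncation.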
Bruce> which is distributed with dvedit, there is an enormous overlap in
Bruce> the glyphs, but no relationship I can see in the encodings. Yet as
Bruce> I recall, the Jagran encoding is being used in some Indic on-line
Bruce> newspapers at the moment.
I am not familiar with dvedit. Any pointers?
You have hit the nub of the problem. With Indic fonts, just about every font
has a different glyph set. The problem is less prevalent for other scripts,
but it still exists. I have been working on a small system for writing simple
rendering rules for these cases.
http://crl.nmsu.edu/~mleisher/contextnew.pdf
A much more sophisticated mapping/rendering system is available in the OTP
module of the freely available Omega/Lambda (TeX/LaTeX) typesetting system.
http://www.ens.fr/omega
-----------------------------------------------------------------------------
Mark Leisher
Computing Research Lab
New Mexico State University
Box 30001, Dept. 3CRL
Las Cruces, NM 88003

I have never made but one prayer to God, a very short one:
"Oh Lord, make my enemies ridiculous."
And God granted it. -- Voltaire, letter