Re: perl(_at

On 4 Dec 2000, at 12:23, Jarkko Hietaniemi wrote:

Also, making the simplification Russian eq Cyrillic is not right.
There are I think dozens of nations/languages, ethnic groups, etc,
using Cyrillic.


Quite so. And "Cyrillic" ne "those Cyrillic letters used in Russian"; it's 
even more than those in such collections as WGL4 (which includes 
such things as Byelorussian U-breve, Macedonian K-acute, Ukrainian 
G-with-upturn, and Serbian lj and nj) -- the standard work (AFAIK), 
Musaev's "Alfavity yazykov narodov SSSR" lists alphabets for quite a 
few languages spoken in the former USSR, with such letters as T-TS 
ligature, ZH-diaeresis, schwa, h, and "Abkhazian h" which looks like a 
cursive Danish ae-ligature. (I'm not an expert, but some of the letters 
Musaev mentions appear not to be in Unicode, unless you consider 
them glyph variants of other letters; for example, he appears to 
distinguish between GHE-with-complete-stroke and GHE-with-right-
half-stroke, while the Unicode book only has a glyph GHE-with-
complete-stroke.)

I'm slowly copying Musaev's book for my private reference; at the 
moment, it's in Word97 format using Unicode, with the Code2000 
font for the more exotic characters.

The point (for the p5p list), I suppose, is that even if you have 
a "perfect", language-independent way of transliterating Cyrillic, if 
your scheme just does the "major" Cyrillic-using languages (say, 
Russian, Byelorussian, Ukrainian, Macedonian, and Serbian), then it's 
not complete. It also needs to do Bashkir, Azerbaidjani, Khanti, &c.
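
To make the coverage point concrete, here is a minimal sketch in Python 
(not anything from the perl source). It assumes a scheme that covers only 
the 33 letters of the modern Russian alphabet and reports which Cyrillic 
characters fall outside it. The names RUSSIAN_LETTERS, is_cyrillic and 
uncovered are made up for illustration, and the code points in SAMPLE are 
my own guess at the Unicode equivalents of the letters named above.

```python
import unicodedata

# Hypothetical coverage set: the modern Russian alphabet, upper and lower
# case (U+0410..U+044F plus IO at U+0401 and U+0451).
RUSSIAN_LETTERS = {chr(cp) for cp in range(0x0410, 0x0450)} | {"\u0401", "\u0451"}


def is_cyrillic(ch):
    """True if the character's Unicode name starts with CYRILLIC."""
    return unicodedata.name(ch, "").startswith("CYRILLIC")


def uncovered(text, covered=RUSSIAN_LETTERS):
    """Return the Cyrillic characters in text that are not in covered,
    in order of first appearance and without duplicates."""
    missing = []
    for ch in text:
        if is_cyrillic(ch) and ch not in covered and ch not in missing:
            missing.append(ch)
    return missing


# Assumed code points (capital forms) for the letters mentioned above:
# U-breve, K-acute, G-with-upturn, lj, nj, T-TS ligature, ZH-diaeresis,
# schwa, h, and "Abkhazian h".
SAMPLE = "\u040E\u040C\u0490\u0409\u040A\u04B4\u04DC\u04D8\u04BA\u04A8"

if __name__ == "__main__":
    # A plain Russian word followed by the non-Russian letters.
    for ch in uncovered("\u041C\u0438\u0440 " + SAMPLE):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

A check like this only shows what a given table misses; it says nothing 
about how the missing letters ought to be transliterated.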

Cheers,
Philip
-- 
Philip Newton <pnewton(_at_)gmx(_dot_)de>
