Philip Guenther writes:
It would arguably be better to use a regexp of "\p{Cyrillic}", as
there are apparently a couple of Cyrillic characters outside the
Cyrillic block now. To quote
http://www.unicode.org/Public/UNIDATA/Scripts.txt:
1D2B ; Cyrillic # L& CYRILLIC LETTER SMALL CAPITAL EL
1D78 ; Cyrillic # Lm MODIFIER LETTER CYRILLIC EN
...but those aren't in the KOI8-R tables I've seen, so they may be
irrelevant to the goal.
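
To make the block-versus-script point concrete, here is a minimal Python sketch. It assumes the Cyrillic block is the range U+0400..U+04FF and checks the two code points quoted from Scripts.txt against it and against Python's "koi8_r" codec. As far as I know the standard "re" module does not accept "\p{Cyrillic}" (the third-party "regex" package is said to), so an explicit range stands in for the block here. The helper names in_block and in_koi8_r are made up for illustration.

```python
import re

# Block-based character class: only the U+0400..U+04FF "Cyrillic" block.
# (Assumption: this range stands in for a block test; it is not \p{Cyrillic}.)
CYRILLIC_BLOCK = re.compile(r"[\u0400-\u04FF]")

# The two code points quoted from Scripts.txt, plus an ordinary Russian letter.
SAMPLES = {
    "\u0416": "CYRILLIC CAPITAL LETTER ZHE",
    "\u1D2B": "CYRILLIC LETTER SMALL CAPITAL EL",
    "\u1D78": "MODIFIER LETTER CYRILLIC EN",
}


def in_block(ch: str) -> bool:
    """True if ch lies inside the U+0400..U+04FF block."""
    return CYRILLIC_BLOCK.fullmatch(ch) is not None


def in_koi8_r(ch: str) -> bool:
    """True if Python's koi8_r codec can encode ch."""
    try:
        ch.encode("koi8_r")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for ch, name in SAMPLES.items():
        print(f"U+{ord(ch):04X} {name}: block={in_block(ch)} koi8-r={in_koi8_r(ch)}")
```

A block-based class misses both quoted characters, and neither is encodable in KOI8-R, which fits the suspicion that they do not matter for this particular goal.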
KOI8-R is (almost) just the Russian subset of Cyrillic. The other
languages using Cyrillic have other subsets.
I think en is for Ukrainian, but don't hold me to that.
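
The subset point can be illustrated with a small Python sketch, assuming the standard "koi8_r" and "koi8_u" codecs are a fair stand-in for the tables under discussion. The helper name encodable and the sample words are chosen for illustration only.

```python
def encodable(text: str, codec: str) -> bool:
    """True if every character of text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


RUSSIAN = "Привет"   # uses only letters of the Russian alphabet
UKRAINIAN = "їжак"   # contains U+0457, a letter Russian does not use

if __name__ == "__main__":
    for word in (RUSSIAN, UKRAINIAN):
        for codec in ("koi8_r", "koi8_u"):
            print(word, codec, encodable(word, codec))
```

The Ukrainian sample fails under koi8_r and succeeds under koi8_u, while the Russian one works in both, so a test built on KOI8-R alone covers Russian but not every language written in Cyrillic.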
Arnt