[ An earlier attempt at forwarding failed because of my typo; apologies
if you get this several times. Please reply to *THIS* message, not
the one with the broken perl-unicode address. ]
(Background for the people joining us at perl-unicode(_at_)perl(_dot_)org:
I proposed a locale called Locale::Date, which enables one to speak
e.g. fluent Albanian, at least as far as names of months and weekdays go.
Andreas Koenig got worried when I mentioned that I encode the strings
in Latin-1, Latin-2, etc. and wondered why I do not use UTF-8.)

Andreas J. Koenig writes:
> On Mon, 11 Jan 1999 17:07:36 +0200 (EET), Jarkko Hietaniemi
> <jhi(_at_)iki(_dot_)fi> said:
>
> jhi> Oh yes, I forgot to mention: the data structure supports *multiple*
> jhi> encodings for, say, Greek month names (e.g. either
> jhi> Latin-1-transliterated or Latin-7). Currently I have only Latin-1
> jhi> encodings, because it is sort of easier with my keyboard :-)
>
> The terminal is a separate problem. And I haven't tried the solutions
> suggested by Larry in the early days of the perl-unicode list.
>
> jhi> So supporting multiple encodings is no problem.
>
> Are you sure, this is wise? The whole hocus pocus of multiple
> encodings for *dates*? Isn't that going to reproduce the recoding

Methinks you are not thinking clearly. Let me take e.g. Russian.
It can be *encoded* in several ways.
KOI8R
CP855
CP866
CP1251
These are all valid and true Cyrillic. But you can also
*transliterate* Cyrillic script into Latin script and encode it in
some Latin-X. Ditto for the even more eastern languages like the CJK,
Thai, Arabic, Hebrew, ...
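
To make the difference between text and its encodings concrete, here is a
minimal sketch in Python (chosen only for illustration). It encodes one
Russian month name under the four code pages listed above. The codec names
are the ones shipped with Python's standard library; the sample word and
the variable names are arbitrary choices, not anything from Locale::Date.

```python
# One Russian word, four different byte sequences.
word = "январь"  # "January"; any Cyrillic word would do

# Standard-library codec names for KOI8-R, CP855, CP866 and CP1251.
encodings = ["koi8_r", "cp855", "cp866", "cp1251"]

# Encode the same text once per code page.
encoded = {name: word.encode(name) for name in encodings}

for name, raw in encoded.items():
    print(name, raw.hex())

# Each byte string turns back into the same text only when it is
# decoded with the codec that produced it.
roundtrip = {name: raw.decode(name) for name, raw in encoded.items()}
```

All four results are "valid and true Cyrillic"; only the bytes differ.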

> problems on a limited problem domain? Isn't Unicode the answer to
> exactly this problem? I still believe putting the burden on the
> Unicode modules is the way to go. If they want an encoding that isn't

Unicode doesn't solve a *language* problem...Unicode is defined for
*scripts*: German and French use the same script. Yiddish can be
encoded in Latin-1 (or almost Western European Latin), or in Latin-7
(or any Hebrew encoding).
The main point of the package is having mappings like
"7th month of the year in Icelandic"
"3rd day of the week in Finnish"
The main point is NOT the encoding: if I could, I would avoid the
issue completely. But I cannot. I am just keeping the data structure
open enough to allow for multiple different encodings. For the
moment, because it's easier and more useful both for me and for users,
I encode in Latin-X. Later, when Unicode has conquered the world,
it's no problem to add UTF-8 encoding.
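
A data structure "open enough" for several encodings could look roughly
like the following sketch, again in Python for illustration only. The
layout (language, then encoding, then twelve byte strings) and the names
MONTHS and month_name are hypothetical and are not the Locale::Date
interface; weekday names would follow the same shape.

```python
# Month names as text, January first.
ICELANDIC = ["janúar", "febrúar", "mars", "apríl", "maí", "júní",
             "júlí", "ágúst", "september", "október", "nóvember", "desember"]
FINNISH = ["tammikuu", "helmikuu", "maaliskuu", "huhtikuu", "toukokuu",
           "kesäkuu", "heinäkuu", "elokuu", "syyskuu", "lokakuu",
           "marraskuu", "joulukuu"]

# Hypothetical table: language -> encoding -> twelve byte strings.
MONTHS = {
    "is": {"latin-1": [name.encode("latin-1") for name in ICELANDIC]},
    "fi": {"latin-1": [name.encode("latin-1") for name in FINNISH]},
}

def month_name(language, month, encoding="latin-1"):
    """Return the name of the 1-based month as bytes in the given encoding."""
    return MONTHS[language][encoding][month - 1]

# Adding another encoding later leaves the existing entries untouched.
MONTHS["is"]["utf-8"] = [name.encode("utf-8") for name in ICELANDIC]

# "7th month of the year in Icelandic", in two encodings.
july_latin1 = month_name("is", 7)
july_utf8 = month_name("is", 7, "utf-8")
```

The lookup key is the language and the month number; the encoding is just
one more level in the table, so a caller who never asks for UTF-8 never
sees it.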
Andreas, I do not see you writing your name as König :-)

> supported by the Unicode modules, they should help Gisle instead of
> putting burden on *dates*.

I repeat: Unicode does not have (almost) anything to do with languages
or national standards concerning locale-ish issues. "Almost" because
it does define character types (ctype) and a (draft of) collation
ordering. But they are Unicode character classes and Unicode collation
ordering. They have nothing to do with, say, the differences
between French, German, and Swedish collation orderings.

> --
> andreas

--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen