On Mon, 11 Jan 1999 18:35:25 +0200 (EET), Jarkko Hietaniemi
<jhi(_at_)iki(_dot_)fi> said:
>> Are you sure this is wise? The whole hocus pocus of multiple
>> encodings for *dates*? Isn't that going to reproduce the recoding
jhi> Methinks you are not thinking clearly. Let me take e.g. Russian.
jhi> It can be *encoded* in several ways.
jhi> KOI8R
jhi> CP855
jhi> CP866
jhi> CP1251
or UTF8.
jhi> These are all valid and true Cyrillic.
Even if encoded in UTF8.
jhi> But you can also
jhi> *transliterate* Cyrillic script into Latin script and encode it in
jhi> some Latin-X. Ditto for the even more eastern languages like the CJK,
jhi> Thai, Arabic, Hebrew, ...
And all this is still true for UTF8.
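
To make the encoding point concrete, here is a minimal sketch in Python (chosen only for illustration; nothing in this thread uses it). The sample word is an assumption picked for the example. It encodes one Russian string under each of the encodings listed above and checks that every byte sequence decodes back to the same text:

```python
# One Russian word, encoded under each of the character sets named above.
text = "январь"  # "January"; sample data chosen for this sketch

encodings = ["koi8_r", "cp855", "cp866", "cp1251", "utf_8"]

# Map each codec name to the bytes it produces for the same string.
encoded = {name: text.encode(name) for name in encodings}

for name, raw in encoded.items():
    # The bytes differ per codec, but decoding with the matching codec
    # always recovers the identical Unicode string.
    assert raw.decode(name) == text
    print(f"{name:8} {raw.hex()}")
```

The four legacy code pages need one byte per letter and UTF-8 needs two, but all five are representations of the same six Cyrillic characters.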
>> problems on a limited problem domain? Isn't Unicode the answer to
>> exactly this problem? I still believe putting the burden on the
>> Unicode modules is the way to go. If they want an encoding that isn't
jhi> Unicode doesn't solve a *language* problem...
Agreed.
jhi> ...Unicode is defined for
jhi> *scripts*: German and French use the same script. Yiddish can be
jhi> encoded in Latin-1 (or almost Western European Latin), or in Latin-7
jhi> (or any Hebrew encoding).
Or in UTF8. Unicode is about forgetting the myriad encodings.
jhi> The main point of the package is having mappings like
jhi> "7th month of the year in Icelandic"
jhi> "3rd day of the week in Finnish"
jhi> The main point is NOT the encoding: if I could, I would avoid the
jhi> issue completely.
That's what I believe: you could avoid the issue completely by
restricting yourself to UTF8, because Unicode is about supporting them
all at once.
jhi> But I cannot.
You don't explain why.
jhi> I am just keeping the data structure
jhi> open enough to allow for multiple different encodings. For the
jhi> moment, because it's easier and more useful both for me and for users,
jhi> I encode in Latin-X. Later, when Unicode has conquered the world,
jhi> it's no problem to add UTF-8 encoding.
It's not a matter of whether Unicode will conquer the world. The point
is that Unicode solves a problem that you plan to treat with a
complicated data structure that appears superfluous to me. If you can
explain why this is necessary or just more efficient, I'm all ears.
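
As a rough illustration of the two positions, here is a hedged Python sketch. The table layout, the function names, and the Monday-first weekday numbering are all assumptions made for this example, not the package's actual data structure. Names are stored once as Unicode strings, and a specific encoding is chosen only when bytes are needed:

```python
# Hypothetical lookup tables: each name is stored once, as a Unicode string.
MONTHS = {
    "is": ["janúar", "febrúar", "mars", "apríl", "maí", "júní",
           "júlí", "ágúst", "september", "október", "nóvember", "desember"],
}
WEEKDAYS = {
    # Monday-first numbering is an assumption of this sketch.
    "fi": ["maanantai", "tiistai", "keskiviikko", "torstai",
           "perjantai", "lauantai", "sunnuntai"],
}


def month_name(lang: str, month: int) -> str:
    """Return the name of the 1-based month in the given language."""
    return MONTHS[lang][month - 1]


def weekday_name(lang: str, day: int) -> str:
    """Return the name of the 1-based weekday in the given language."""
    return WEEKDAYS[lang][day - 1]


def encode_for_output(name: str, encoding: str = "utf-8") -> bytes:
    """Pick an encoding only at the output boundary."""
    return name.encode(encoding)


print(month_name("is", 7))      # 7th month of the year in Icelandic
print(weekday_name("fi", 3))    # 3rd day of the week in Finnish
print(encode_for_output(month_name("is", 7), "latin-1"))
print(encode_for_output(month_name("is", 7)))
```

In this arrangement the language mapping and the encoding are independent: supporting another encoding means passing a different codec name, not adding another copy of the table. Whether that holds for every target encoding (a name may contain characters a given code page lacks) is exactly the open question in this exchange.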
jhi> Andreas, I do not see you writing your name as König :-)
Just because you haven't studied
http://www.stadtplandienst.de/query?HELP=software;LANGUAGE=ja
carefully :-)
Granted, in everyday life I quite often use latin1, but on
professional projects that might turn multilingual some day, I
tend to restrict all components to UTF8.
>> supported by the Unicode modules, they should help Gisle instead of
>> putting the burden on *dates*.
jhi> I repeat: Unicode does not have (almost) anything to do with languages
jhi> or national standards concerning locale-ish issues. "Almost" because
jhi> it does define character types (ctype) and a (draft of) collation
jhi> ordering. But they are Unicode character classes and Unicode collation
jhi> ordering. They have nothing to do with, say, the differences
jhi> between French, German, and Swedish collation orderings.
That's a completely different problem domain. The thread started
out talking about encoding, and that's where Unicode rules, IMHO.
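
The collation point quoted above can be seen in a small Python sketch. The two key functions are deliberately simplified, hand-written approximations of Swedish alphabetical order and German dictionary order, assumed here for illustration only; real collation rules have many more cases, and the word list is made up for the example:

```python
# Simplified, hand-written collation rules (assumptions for this sketch).
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"  # å, ä, ö follow z
GERMAN_FOLD = str.maketrans(
    {"ä": "a", "ö": "o", "ü": "u", "ß": "ss", "å": "a"}
)  # umlauts sort like their base letters


def swedish_key(word: str) -> list:
    """Sort key: position of each letter in the Swedish alphabet."""
    return [SWEDISH_ALPHABET.index(ch) for ch in word]


def german_key(word: str) -> str:
    """Sort key: the word with umlauts folded to base letters."""
    return word.translate(GERMAN_FOLD)


words = ["zon", "äpple", "apa", "öl", "ost", "åka"]

by_code_point = sorted(words)              # plain Unicode code point order
by_swedish = sorted(words, key=swedish_key)
by_german = sorted(words, key=german_key)

print(by_code_point)  # ['apa', 'ost', 'zon', 'äpple', 'åka', 'öl']
print(by_swedish)     # ['apa', 'ost', 'zon', 'åka', 'äpple', 'öl']
print(by_german)      # ['åka', 'apa', 'äpple', 'öl', 'ost', 'zon']
```

The same Unicode strings yield three different orders, so a shared encoding settles how the characters are stored but not how a given language sorts them.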
--
andreas