ietf-822

Re: Non-ASCII hdrs

1991-10-19 05:37:32
EvdP wrote:

Japanese can be converted to mnemonics in such a way that they are
quite readable (if you can read the 26 letters of the English
alphabet).

An example of such a conversion would be:

      K1 K2 K3 -> Naka Hara Yasushi

K1, K2, and K3 are three Kanji (Japanese ideographic) characters.

The problem, however, is that you cannot convert back from "Naka Hara
Yasushi" to the Kanjis, because several Kanjis (say, K4, K57, K3001)
also map to "Naka".

One solution to this is to put a number after the mnemonic:

      Naka1
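
A minimal sketch of the numbering idea, in Python. The names K1, K4,
K57 are the placeholders used in this message, not real characters,
and the table, function names, and the rule "number the readings in
table order" are assumptions made only for illustration:

```python
# Hypothetical reading table: each placeholder stands for one Kanji.
# Several Kanji share the reading "Naka", so the plain reading alone
# cannot be converted back.
READINGS = {
    "K1": "Naka",
    "K2": "Hara",
    "K3": "Yasushi",
    "K4": "Naka",
    "K57": "Naka",
}


def build_numbered_mnemonics(readings):
    """Append a running number to each reading so every mnemonic is unique."""
    counts = {}
    forward = {}
    for kanji, reading in readings.items():
        counts[reading] = counts.get(reading, 0) + 1
        forward[kanji] = f"{reading}{counts[reading]}"
    # Unique mnemonics make the table invertible.
    reverse = {mnemonic: kanji for kanji, mnemonic in forward.items()}
    return forward, reverse


FORWARD, REVERSE = build_numbered_mnemonics(READINGS)


def encode(kanji_seq):
    """Kanji placeholders -> space-separated numbered mnemonics."""
    return " ".join(FORWARD[k] for k in kanji_seq)


def decode(text):
    """Numbered mnemonics -> Kanji placeholders."""
    return [REVERSE[m] for m in text.split()]


if __name__ == "__main__":
    line = encode(["K1", "K2", "K3"])
    print(line)          # Naka1 Hara1 Yasushi1
    print(decode(line))  # ['K1', 'K2', 'K3']
```

The number makes the mapping reversible, but it only fixes the
many-Kanji-to-one-reading direction.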

Now, (as if that wasn't bad enough), we have the additional problem
that many Kanjis map to multiple pronunciations (mnemonics). For
example, the "Naka" above (K1), can also map to "Chuu", which varies
case by case, i.e. depending on the context.

Is it right that the context is just one or two characters?
Or does this differ between Chinese and Japanese?
I may have some 200,000 combinations of CJK characters that should
uniquely define words and meanings for CJK.

Another question: would Naka1 not be associated with the right
CJK character, and then be understood as Chuu?
Mnemonic does not mean "pronunciation" but "aid to memory".

Keld, being a pretty optimistic sort of guy, keeps saying that he will
be able to come up with usable mnemonics for the ideographic
characters. I doubt it. Strongly. (And I should know. I speak Japanese
fluently, and read and write Japanese email every day. I even live in
Japan.)

I have already defined names - or mnemonics - for CJK.
They are not that good in some respects, but they are unique
and very close to the naming that JISC and ISO have made.
Mnemonics do not have to be related to pronunciation,
just some kind of thing that is easy to remember.
And CJK is not simple to remember, as shown.

I am of course aware of the problems that Erik has described
and I am looking into providing something better than just gibberish
to unextended equipment. This is indeed hard for CJK.
I am humbly making an attempt at it.
But if this does not succeed for CJK, would that lead to no solution
at all for languages where adequate solutions exist?
I would think that that would be a real pity.

Now, having said that Japanese text must be in a particular form of
ISO 2022 for it to be readable, I should also say that ISO 2022 cannot
be used in email headers, because it can contain unquoted special
characters like '<', '(', '"'. (And quoting would distort the
ideographic glyphs.)

Well, these things may be quoted.
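
To make the concern concrete, here is a small Python sketch that
encodes one Kanji with Python's standard iso2022_jp codec and lists
which RFC 822 "specials" appear in the resulting bytes. The choice of
codec and character (U+4E2D) and the function name are assumptions
for illustration; the particular form of ISO 2022 discussed above may
differ:

```python
# The "specials" of RFC 822, which must be quoted in structured headers.
RFC822_SPECIALS = b'()<>@,;:\\".[]'


def specials_in(encoded: bytes) -> set:
    """Return the RFC 822 special characters that occur in the byte string."""
    return {chr(b) for b in encoded if b in RFC822_SPECIALS}


# U+4E2D is the Kanji that can be read "Naka" or "Chuu".
text = "\u4e2d"
encoded = text.encode("iso2022_jp")

if __name__ == "__main__":
    print(encoded)
    print(sorted(specials_in(encoded)))
```

In this encoding the escape sequence that switches back to ASCII is
ESC ( B, so a '(' shows up even when the Kanji bytes themselves happen
not to collide with any special character.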

Keld
