ietf-822

Re: Non-ASCII hdrs

1991-10-19 09:13:54
On Sat, 19 Oct 91 13:36:49 +0100, Keld Jørn Simonsen wrote:
It is right that the context is just one or two characters?
Or is this different from Chinese to Japanese?
I may have some 200.000 combinations of CJK characters that should
uniquely define words and meanings for CJK.

Chinese and Japanese are totally unrelated.  Chinese is closer to English than
it is to Japanese.  Korean is yet a third case, but at least there is some
vague similarity -- vehemently denied by Korean and Japanese linguists but
apparent to more neutral third parties -- between them.

In Chinese, each character represents a monosyllabic utterance with a tone.
Mandarin has four tones plus a neutral tone, five in all.  The combination of
utterance and tone identifies a set of homonyms, and you can determine which
one is meant only by context.  Fortunately, most Chinese words are compounds,
which generally provide enough context.
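
As a rough illustration of that point, here is a minimal Python sketch.  The
tables are a tiny hand-picked sample (not a real dictionary), the numbered
pinyin keys and the names HOMOPHONES, COMPOUNDS, candidates and resolve are
assumptions made up for the example, and real compounds can themselves have
homonyms, which this sketch ignores.

```python
# Illustrative data only: a few Mandarin characters that are all
# pronounced "shi4" (syllable "shi", fourth tone).
HOMOPHONES = {
    "shi4": ["是", "事", "市", "室", "世"],
}

# A few two-character compounds, keyed by their syllable/tone sequence.
# Simplification: one compound per key, which is not true in general.
COMPOUNDS = {
    ("cheng2", "shi4"): "城市",  # city
    ("jiao4", "shi4"): "教室",   # classroom
    ("shi4", "jie4"): "世界",    # world
}


def candidates(syllable):
    """Return every character in the sample table with this reading."""
    return HOMOPHONES.get(syllable, [])


def resolve(first, second):
    """Return the sample compound for a two-syllable reading, or None."""
    return COMPOUNDS.get((first, second))


if __name__ == "__main__":
    # One syllable alone is ambiguous among several characters ...
    print(candidates("shi4"))
    # ... but the neighbouring syllable narrows it down.
    print(resolve("cheng2", "shi4"))
```

A single syllable returns several candidates, while the two-syllable key picks
out one entry in this sample.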

In Japanese, each character represents a polysyllabic utterance with accents.
These characters were adopted from Chinese for a completely different system.
Chinese characters were a terrible writing system to use for Japanese, but the
modern language has become dependent upon them.  The Japanese assign a set of
utterance values to a character (at times as many as 12) depending upon how
the character is used.  Similarly, many characters (at times over 100) have
the same utterance value.
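
The many-to-many relationship can be sketched as two lookup tables.  This is a
minimal Python example under stated assumptions: the readings are a small
romanized sample chosen for illustration, far from complete, and the names
READINGS, invert and BY_READING are hypothetical.

```python
from collections import defaultdict

# Illustrative sample only: character -> some of its readings (romanized).
READINGS = {
    "生": ["sei", "shou", "nama", "ki"],
    "高": ["kou", "taka"],
    "校": ["kou"],
    "工": ["kou", "ku"],
    "口": ["kou", "ku", "kuchi"],
}


def invert(readings):
    """Build the reverse table: reading -> characters that can carry it."""
    by_reading = defaultdict(list)
    for char, values in readings.items():
        for value in values:
            by_reading[value].append(char)
    return dict(by_reading)


BY_READING = invert(READINGS)

if __name__ == "__main__":
    print(READINGS["生"])     # one character, several readings
    print(BY_READING["kou"])  # one reading, several characters
```

Neither table alone is enough to recover the other side: going from character
to reading, or from reading to character, needs the word being written.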

Japanese has a horrible number of homonyms, including among compound words.
Many Japanese comedies have been based on the misinterpretation of homonyms.
Adult Japanese often trace a Chinese character in the air to make clear which
word they mean.  The results without this aid can be comic, as anyone who has
listened to a telephone conversation can attest.

To make things worse, the same character(s) can be read differently depending
upon what word they represent.  This is especially true in names.  Vastly
different names can have the same `spelling'.  In this case, an encoding system
can offer too much information.

