ietf-822

Re: 10646, and all that

1993-03-12 11:25:26
In <9303120845(_dot_)AA17283(_at_)necom830(_dot_)cc(_dot_)titech(_dot_)ac(_dot_)jp>, Masataka Ohta wrote:
Finally, there is a significant difference between the way that the ISO 646
family use the same codepoints to mean completely different characters, and
the way that DIS 10646 maps similar glyphs onto a single character...

The DIS does not say that the corresponding CJK characters are
the same single character. Instead, it says that the same code point
is assigned to the different "graphic symbols".

Can you cite some text from the DIS on which you base this claim?
After you said something similar, earlier this week, I asked on
the Unicode and ISO10646 mailing lists, and was assured that
ISO-10646 retains Unicode's notion that one code point is exactly
equivalent to one (possibly unified) character, and that
nationalized ideographs are treated as glyph variants.
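
As a rough illustration of the "one code point, one unified character"
model, here is a minimal sketch using Python's standard unicodedata
module. The helper name describe and the sample ideograph U+76F4 are
chosen only for illustration; the output reflects the current Unicode
character database, not the text of the DIS itself.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and character name for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The same code point is used whether the surrounding text is Chinese
# or Japanese; nothing in the character itself records the language.
print(describe("\u76f4"))
```

Which national glyph form is drawn for that code point is left to the
font or to information carried outside the character data.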

Can you explain why you are insisting that the language name be contained in
the character set?  What is the advantage of having the language name part
of the character set name, instead of as a separate parameter?

Because the separation problem is specific to ISO 10646 and not a general
charset issue, it is absurd to introduce a new concept.

The issue is not ISO-10646-specific.  Examples have been
presented for which language information would be useful
regardless of the character set.

Moreover,

      Content-Type: charset=iso-10646
      Content-language: Chinese, Japanese

is completely meaningless for display purposes.

Indeed.  (Personally, I would never have thought of allowing a
list of languages in conjunction with an untagged body, for
exactly this reason.)  However, ...

So, do you want to introduce another confusion?

...we are never going to be able to eliminate all possibility
for confusion.  ("If a truly idiot-proof system is ever devised,
Nature will spontaneously evolve a higher grade of idiot which is
able to subvert it.")  A body-scope language tag may introduce
some potential for confusion, but it replaces the more confusing
and less workable notion of trying to encode language matrices in
the character set name.
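
To make the contrast concrete, here is a minimal sketch of a body-scope
language tag kept separate from the charset parameter, using Python's
standard email package. The function name build_tagged_message is
hypothetical, utf-8 merely stands in for a 10646-based charset, and the
single tag "ja" is an assumed value, not a proposed syntax.

```python
from email.message import EmailMessage


def build_tagged_message(text: str, language: str) -> EmailMessage:
    """Build a text/plain body whose language is a separate header."""
    msg = EmailMessage()
    # The charset names only the character encoding (utf-8 is a stand-in).
    msg.set_content(text, charset="utf-8")
    # The language is carried separately and applies to the whole body.
    msg["Content-Language"] = language
    return msg


msg = build_tagged_message("こんにちは", "ja")
print(msg.get_content_charset())
print(msg["Content-Language"])
```

With one language per body, a display program can choose glyph variants
without parsing anything out of the charset name.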

                                        Steve Summit
                                        scs(_at_)adam(_dot_)mit(_dot_)edu
