Re: 10646, and all that

Erik, please have the courtesy to read my posting as carefully as I
framed it


I'm sorry that that's the way it came across to you.  (Really.)

the european viewer is *accustomed* to seeing all of the 
variations [of Roman charsets], and *habitually* perceives them as
similar.


  The unified Han glyphs *are* similar.  I challenge you
  to come up with a Unicode Han character that has
  dissimilar glyphs when rendered in the traditional CJK
  printing styles.


ignores my main point


Actually, I *did* notice your main point, and I was impressed with the
way you see it.  The fact that I did not address your main point
explicitly is an error on my part, and I apologize.

I addressed your main point *implicitly* by talking about something
else.  I.e. I was sort of refuting your main point.  (Lame excuse.)

I don't deny the similarities, I am stressing
the possible effects of the differences, and basically I am
postulating that it is *possible* that the message gets garbled.


Of course it's possible that the message gets garbled!  You can say
the same about ASCII.  If I say "There are many ways to write 'a'; for
example, 'a', 'a', and so on.", then the reader will be confused.
This is because I have not included font switching commands (not that
this is possible in today's email anyway).

Of course, we can argue forever about whether or not a particular Han
character should have been unified, and we can also argue forever
about whether the Latin/Greek/Cyrillic "A" should also be unified,
even if the Latin "P" and Cyrillic "P" (which is pronounced like an
"R") are not unified.  We have seen long debates about character vs
glyph, language vs script, etc etc etc.
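To make the "A" versus "P" example concrete, here is a minimal sketch in Python that looks up how the standard encodes these look-alike capitals. It assumes nothing beyond the standard `unicodedata` module; the dictionary labels and the helper name `describe` are my own, chosen for illustration.

```python
import unicodedata

# Visually similar capitals from three scripts, written as explicit escapes
# so that the example does not depend on the reader's font.
LOOKALIKES = {
    "Latin A": "\u0041",
    "Greek Alpha": "\u0391",
    "Cyrillic A": "\u0410",
    "Latin P": "\u0050",
    "Cyrillic Er": "\u0420",
}

def describe(ch):
    """Return the code point and the official character name of `ch`."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for label, ch in LOOKALIKES.items():
    print(f"{label:12} {describe(ch)}")
```

Each of the five gets its own code point, so the three "A" shapes are kept apart by script even though their glyphs are often identical, while the Han characters were merged across languages.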

My point is that it is too late to argue about these things now, since
Unicode and 10646 are frozen.  So we should instead talk about how to
salvage the wreck, i.e. how we can still make the best of the
situation.  Others may argue that we should ignore 10646 completely;
my reply is that this is their choice, and the future will either
decide a winner or show that coexistence is possible (as with TCP/IP
vs OSI, though many would argue that there is a clear winner there).

Outside of etymologists, few orientals possess an encyclopedic view of
all possible HAN, and, I suspect, are not *accustomed* to reading
foreign variants.


If they set their Han font, then they *won't* see the foreign variant.
This becomes a problem if the sender *wanted* the receiver to see the
variant, but the responsibility for that lies solely with the sender.
I.e. he shouldn't have used Unicode, or he should have included font
switching commands, or he should have drawn an "ASCII picture" the way
I did a while ago with a couple of different glyphs for "a".

Then there's the argument that East Asians attach a great deal of
importance to the precise way they write their names.  If the Japanese
are concerned about this, then they have the same problem with their
very own standard, JIS X 0208.  So Unicode isn't introducing any "new"
problems here.  I.e. if you don't want to use Unicode, don't.  (Use
fax. :-)

Given the vehemence shown by Ohta-san, and my personal difficulty with
less-familiar variants of the VERY MUCH SMALLER Roman character set
(e.g., English court hand, Fraktur)


If you prefer to see your ASCII in Helvetica or Courier, then you set
your font, right?  Why would you see it in an unfamiliar font like
Fraktur?

Please also consider how the user is expected to interact with a UA
which employs a folded set of HAN characters.  If the program
correctly intuits the font for each, then no problem.


No program can correctly intuit fonts from "plain" text.  You cannot
create information from nil.  If the sender does not include font
info, then you cannot deduce the font(s).  You can use heuristics, but
then you also assume the responsibility for any mistakes.
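A rough sketch of what this means for a receiving program, in Python. Everything here is hypothetical and invented for illustration: the font names, the language tags used as keys, and the function `choose_font`. The only point is that the text itself contributes nothing to the decision; the font comes either from information the sender supplied or from the receiver's own setting.

```python
from typing import Optional

# Hypothetical font names, invented for this illustration only.
FONTS = {"ja": "ExampleMincho", "zh-Hant": "ExampleMing", "ko": "ExampleBatang"}

def choose_font(text: str, sender_lang: Optional[str], receiver_default: str) -> str:
    """Pick a font for `text`.

    `text` is deliberately unused: a unified Han code point is the same
    number whatever language the sender had in mind, so the characters
    alone cannot tell us which national style was intended.
    """
    if sender_lang in FONTS:
        # The sender supplied out-of-band information, so honor it.
        return FONTS[sender_lang]
    # No information from the sender: fall back to the receiver's setting.
    return receiver_default

han = "\u76f4"  # one unified Han code point, identical in every message
print(choose_font(han, None, "ExampleMincho"))
print(choose_font(han, "zh-Hant", "ExampleMincho"))
```

A heuristic could be slotted in before the fallback, but then the program, not the sender, owns any wrong guess.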

I am not normally a violent person (in spite of my 2 meter height and
21 stone weight), but it is precisely when my tools misbehave that I am
most tempted.


This is a matter of education.  Senders, today, are told that their
ASCII message may look slightly different on the receiver's display.
Likewise, users of the Unicode character set will need to be educated
about the *possible* differences in the rendering of Han characters.
If they are not satisfied with their education, they might use ISO
2022, fax, Ohta-code, or whatever.  So be it.


(Liked your choice of "meter" and "stone". :-)


Regards,

Erik