ietf-822

Re: 10646, and all that

1993-03-04 09:27:43
Agreed, but the wreck does need a bit of analysis if we are to provide
a viable fix, so *some* discussion seems in order.

Yes, but I doubt that that discussion belongs here.  I'm waiting for
the Chair to make a decision on this.


  Of course it's possible that the message gets garbled!

Thank you for admitting this. I had drawn the conclusion from my
(admittedly cursory) examination of the past 2 months' traffic that
this point hadn't been yielded, or was being taken lightly by many of
the posters---and that Ohta-san was feeling rather frustrated over
just that lack of acknowledgement.

If I understand him correctly, Masataka is basically saying that
Unicode/10646 needs to be used together with language information to
get proper rendering.  He suggested putting the language info in the
charset parameter:

    charset=iso-10646-sanskrit-japanese

Presumably, what this means is that if the message contains any
characters from the Devanagari *script*, then those characters should
be rendered the way it is typically done in the Sanskrit *language*.
And similarly for the Han *script* and the Japanese *language*.

He was getting frustrated because other people were saying that you
can read the text even if it is rendered the way *other* languages
that use those scripts typically do the rendering.

Both positions are valid.  In some instances, a sender may want the
receiver to see the message a certain way.  In other instances, the
sender may want to let the receiver decide how the message looks.
(Personally, I cannot see how either position could possibly be
invalid.)


  If they set their Han font, then they *won't* see the
  foreign variant.

Here you are assuming a dumb UA; I am postulating a need for a smarter
one, for which we need to consider support mechanisms.

If the receiver *tells* the UA to try to guess what the sender meant,
then you need the heuristics.  If the receiver tells the UA to render
Han in a Japanese-style font, then the UA has no choice.
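
Those two cases can be sketched as follows. All names here are
hypothetical, and the heuristic itself is left as a caller-supplied
function, since nothing above says what it would be. Consulting a
sender-supplied hint before the heuristic is my assumption.

```python
def choose_han_style(text, receiver_setting, sender_hint=None, heuristic=None):
    """Pick the language style used to render Han characters.

    receiver_setting is either "guess" or a language name such as
    "japanese". Returns a language name, or None if nothing is known.
    """
    if receiver_setting != "guess":
        # The receiver fixed the font style: the UA has no choice.
        return receiver_setting
    if sender_hint is not None:
        # Assumption: an explicit hint from the sender beats guessing.
        return sender_hint
    if heuristic is not None:
        # Only in this case does the UA need heuristics at all.
        return heuristic(text)
    return None
```

For example, a receiver_setting of "japanese" returns "japanese" even
when sender_hint is "chinese".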

What kinds of "support mechanisms" do you have in mind for the
heuristics?


The point I was trying to make here is that I might very well feel
frustrated over an inability to mix Fraktur (i.e., into a discussion
about German lute tablature), and thus would benefit from the same
mechanism.

As I said before, I don't deny the need for such mechanisms.  What I'm
denying is that such mechanisms are *required* for *plain* text.


  No program can correctly intuit fonts from "plain" text.
  You cannot create information from nil.

True, but we have more than nil to work with here. Admittedly our
information is imperfect, but I think an assessment of the degree of
that imperfection is useful, if only to convince the 10646/Unicode
people of the need for embellishment

They are already convinced about the need for font info in "fancy"
text (i.e. *not* plain text).  They also believe in the need for
*language* info for e.g. spell-checkers.


or [to convince] ourselves of the need for an inline/parallel
information structure.

Yes.  There are at least two issues here.  One is the type of tag,
i.e. language tagging or font tagging.  Font tagging can be further
subdivided into specific, e.g. Ryumin Light, and less specific, e.g. 
Japanese (as Masataka has been suggesting).

The other issue is whether to do the tagging at the body-part level,
e.g. in the charset parameter, or within the body-part, e.g. the way
"richtext" does <bold>, <italic>, etc.


Unfortunately, the research
does go a bit beyond the norm for IETF-822; still, we should solicit
it from some sister body. I wish I knew which body that would be.

As far as "Unicode/10646 in *email*" is concerned, I imagine that a
group that falls under the umbrella of the IETF is appropriate.  But
for discussion of language/font tagging in Unicode/10646 in *general*,
i.e. not necessarily email, the proper forum could be the Unicode or
10646 mailing list.


  If the sender does not include font info, then you
  cannot deduce the font(s).

obviously, a sender should be encouraged to do so, a standard means
for doing it would provide that encouragement.

No, the sender should not be encouraged to include font info.  The
sender should be reminded of the fact that the receiver may have a
preferred way of looking at the text.  The sender should be told that
font info is allowed, but not always desirable.  (My opinion.)


Regards,

Erik

