ietf-822

Re: Language tags and 10646

1993-03-08 08:26:37
    1. What percentage of e-mail messages will require full
      multilingual capabilities, and

I think the answers are "high" ...

I don't believe it.  I think the modal email message today is intra-
organization and that most of those messages are monolingual.  I think
that the modal message when intra-organization ones are eliminated is
still probably intra-country, and that most of those are monolingual,
too.

We are trying to solve an important marginal case here, but a marginal
one, not something that is key for the vast majority of everyday uses.

I do.  Permitting a list seems to me to relegate the new tag to a
purely advisory capacity, upon which no automated decisions could
be based.
    I think it is going to be "purely advisory" no matter what you do.
The alternative is to require it of senders and to specify what
receivers do when they get it.  I don't think either works.

To return to an example I brought up a few days ago, <o-diaeresis> [ö]
can be transliterated as "oe" if it appears in German text, but
not in English, where it should be replaced with a single "o"
(assuming the diacritical form is not available).
   Or with "-o".  You pick four US-English dictionaries and style
manuals and you may learn that coöperate is an obsolete form and that
cooperate is the normative spelling.  Or you may discover that, while
the form with the diaeresis is normative, the preferred alternate
spelling is "co-operate".
   Providing some hints is a good thing.  Thinking that you are going to 
solve all of these problems by anything as simple as a language tag (or
even language/country) is naive.   No amount of language tagging is
going to be a substitute for either highly-specific markup of
polylingual constructions and desired interpretation/presentation or
for a variation on 10646 that meets the various goals people are trying
to project onto it.
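
   As a rough illustration of what a receiver could do with such a hint,
here is a minimal Python sketch of the o-diaeresis example.  The table
and the function name ascii_fallback are hypothetical, invented for this
illustration; they are not part of any proposal or existing library.

```python
# Hypothetical fallback table: (character, language tag) -> ASCII text.
# The two entries mirror the example in the quoted message; real rules
# would be far larger and, as argued above, still incomplete.
FALLBACKS = {
    ("\u00f6", "de"): "oe",  # German: o-diaeresis becomes "oe"
    ("\u00f6", "en"): "o",   # English: o-diaeresis becomes plain "o"
}


def ascii_fallback(text, language=None):
    """Replace characters using the fallback for the given language tag.

    Characters with no rule for that language (or with no tag at all)
    are passed through unchanged, since the tag is only a hint.
    """
    return "".join(FALLBACKS.get((ch, language), ch) for ch in text)


print(ascii_fallback("sch\u00f6n", "de"))
print(ascii_fallback("co\u00f6perate", "en"))
```

   Note that a single tag gives this sketch no way to choose between
"cooperate" and "co-operate" for the English case, which is the limit
described above.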

If the body-level language tag is limited to a single language, it
can be defined as "the only language, or the predominant language,
with which the message is composed", and a mail display program
can render the entire message using conventions appropriate to
that language without thwarting any expectations.
   Certainly, if there is only one language present, or the sender
believes that one language predominates, then the sender will be
providing much more satisfactory information to the receiver by
specifying only one.  We need not require that; we need only rely on good
sense--senders who provide receivers information that the latter can't
interpret will discover that the receivers don't interpret it.

   But suppose I am sending a message that is mixed English and Russian,
mostly the latter.  Cyrillic isn't unified with Roman in either 10646 or
in 8859-5.  But, if I am forced to "one language", I either have to lie
and say "English" to get the disambiguations of your example, or I have
to say "Russian" and lose all of those disambiguations (although not 
the ability to differentiate between Russian and other Cyrillic-using
Slavic languages, perhaps).

we would
be suggesting that software somewhere might be going to treat the
various languages within the message appropriately when (without
explicit, finer-grained tags) it obviously can't.
   It is a hint from sender to receiver.  If the receiver can't or won't
interpret the hint, too bad.  If the sender needs to *control* the
receiver's action, then this type of hint mechanism is inadequate and
something stronger is needed.

   If it doesn't solve any useful problems, then there is the
alternative of deciding 10646 is sufficiently broken that we simply
should not permit it in text/plain--that all multilingual texts require
an applications type and explicit language markup or a character set
that doesn't do any language unification at all.

    john
