Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 12:20:04
From: Henk Langeveld <Henk(_dot_)Langeveld(_at_)holland(_dot_)sun(_dot_)com>

You know, it isn't that long ago that I realised that for many Americans,
"International" is synonymous with "Non-American".

That is as true as the observation that many who learn English as a
second language think that "international" is synonymous with using
the language of their few dozen million countrymen.  

It is a fact that the single international language of the late 20th and
early 21st centuries is far more closely related to a subset of American
English than to any other local language.  It is also a fact that only during my
lifetime has that odd situation developed.  If the world had asked you or
me to design an international language, I think either of us would have
done better.  But the first fact is all that matters.

If it makes you feel better, note that just as Latin was not exactly what
Italians spoke, the current international language is not exactly what is
spoken by citizens of the largest nation that calls itself The United
States of America (there are >1) and whose mother tongue is English.
Thanks to satellite TV and other forms of what the P.C. call cultural
imperialism, the modern differences are small, but they exist.


From: Dave Crocker <dhc2(_at_)dcrocker(_dot_)net>

Diacritical marks are no different from Cyrillic, Arabic, Greek, Hebrew,
Sanskrit, and other non-Latin character sets in not being part of
the international language.  The goal of communicating is to communicate,
not wave flags in support of national languages.

In a sense, Harald's observation points out a case in which all those other 
sets very much ARE part of the "international" language.

If those are part of your "international language," then what characters
are not part of it?  It is Politically Correct to pretend we all speak,
read, and write a single language, but also hopelessly silly.


It does not matter whether readers understood the semantics of the strings; 
they needed to be able to see them.
         That is not national flag waving.
         That is global utility.

"Global unity" is a matter of everyone being able to communicate with
everyone else.  It not only has nothing to do with each of us using
our favorite set of glyphs, but goes against it.  Each of us using our
favorite language *internationally* is a real Tower of Babel.

Being able to use strings is not only a matter of being able to type their
characters.  Those of us who have studied languages with alphabets other
than what we learned while young have discovered that just as the human ear
has difficulty hearing sounds outside our mother tongues, the human eye
has trouble seeing foreign glyphs.  If they're not yours, all of those
diacritical marks look the same or are invisible.

There are good reasons why the international lingua francas of previous
millennia have forced people to transliterate their native writings
instead of importing them wholesale.  MIME and 8-bit domain names are
mechanisms for importing wholesale instead of transliterating.  They're
good *locally*, but not *internationally*.

...
Technical standards work often gets distracted by trying to deal with 
issues that are outside the scope of reasonable technical standards 
work.  It should not be the task of such work to dictate or constrain users 
to only socially acceptable behavior.  That is a social task, not a 
technical one.

Yes.  So why do otherwise rational IETF participants claim that
social and political notions such as "global unity" are somehow
related to MIME and IDN?

MIME and localized domain names are good and necessary, but only
locally or provincially, even when "locally" involves vast land
areas (e.g. Russian or Spanish) or billions of people.

Choosing to send various types of data requires making decisions about the 
context.  No technical standard can be designed to "automatically" 
determine when it is, or is not, appropriate to send that data, whether it 
is diacritical marks, kanji, or an Excel spreadsheet.  Even when the 
sender has information about recipient capabilities, social factors affect 
the choices.

Yes, so why do some MIME and *localized* domain name advocates claim
otherwise?  What is the pathology insisting that sending MIME to
international mailing lists makes sense?  Why do apparently rational people
claim that 8-bit binary domain names are "international"?  Because they've been
infected with Political Correctness or because they don't want to dilute
political support among the unthinking for whatever they're advocating?


...
At least the recipient has the unintelligible data well isolated and 
labeled.  MIME did its job.

Yes, but the justification of the sender for using MIME to send
unintelligible data is crazy, since communication is averted while
resources, including the human recipient's time, are wasted.


...
The question is whether a coherent extension to DNS will be done in a 
fashion which will keep the DNS integrated, or whether this requirement 
produces an independent DNS.
         That's not flag-waving.
         That's multiple DNS namespaces.

yes.


I like much of the following:

We need to be careful to distinguish two different requirements.  One is 
for a mechanism to encode domain names in non-ascii character sets.  The 
second is for an equivalence mapping from non-ascii domain names into ascii 
domain names.  The former is so that the technical and operational aspects 
of the DNS remain coherent.  The latter is so that everyone has a way to 
reach a particular domain, even if they cannot generate the non-ascii form 
of the name.

The extreme form of the latter task involves ascii encodings that are 
"comfortable" for human users; that requirement is not solved in human 
non-technical situations.  I believe that the example of alternate choices 
of "jin" and "gin" as representations for some Chinese character(s) was 
used.  Hence this extreme form of the task is not going to be solved by 
lowly IETF protocol designers.

At best, use of ACE-like encodings permits an ascii representation, albeit 
one that is "uncomfortable".  It is as far as the IETF should go in trying 
to permit a "universally accessible" form for all domain names.

Interestingly, we do not need to have all domain names exchanged and stored 
in an ACE form, forever.  Just as MIME is able to support pure binary 
encodings, so can the DNS.  The ACE form can be mapped to when needed.
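
For illustration only, the "map to ACE when needed" idea can be sketched
with Python's built-in "idna" codec.  That codec implements Punycode
(RFC 3492) and IDNA 2003, which were standardized after this thread, so
it stands in for whatever ACE is eventually chosen.  The helper names
to_ace and from_ace, and the sample name, are assumptions made up for
this sketch.

```python
# Sketch: round-trip a domain name between its native-script form and an
# ASCII-compatible encoding (ACE). Uses Python's standard "idna" codec
# (Punycode / IDNA 2003) as a stand-in for an ACE; helper names are
# hypothetical.

def to_ace(name: str) -> str:
    """Map a domain name to its ASCII-only (ACE) form, label by label."""
    return name.encode("idna").decode("ascii")

def from_ace(name: str) -> str:
    """Map an ACE domain name back to its native-script form."""
    return name.encode("ascii").decode("idna")

native = "b\u00fccher.example"      # "bücher.example", u with diaeresis
ace = to_ace(native)

print(ace)                          # xn--bcher-kva.example
print(from_ace(ace) == native)      # True: the mapping is reversible
print(to_ace("ietf.org"))           # ietf.org: plain ASCII is unchanged
```

The ACE form is typeable on any keyboard but "uncomfortable" to read,
which is the trade-off described above.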

Where we differ most is in the Politically Correct cant in support of it.


Vernon Schryver    vjs(_at_)rhyolite(_dot_)com