ietf

Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 05:50:06
At 01:58 AM 12/7/00 -0700, Vernon Schryver wrote:
>> From: Harald Alvestrand <Harald(_at_)Alvestrand(_dot_)no>
>> it may have escaped the notice of some that a fair bit of the discussion on
>> diacritics was carried out using live examples,
>
> Diacritical marks are no different from Cyrillic, Arabic, Greek, Hebrew,
> Sanskrit, and other non-Latin character sets in not being part of
> the international language.  The goal of communicating is to communicate,
> not wave flags in support of national languages.


In a sense, Harald's observation points out a case in which all those other sets very much ARE part of the "international" language.

The live examples were a) intended, b) appropriate, and c) successful.

It does not matter whether readers understood the semantics of the strings; they needed to be able to see them.

        That is not national flag waving.

        That is global utility.


>> MIME character sets is an example of a battle fought and won.
>
> When MIME is used to pass special forms among people whose common
> understandings include more or other than ASCII, MIME is a battle
> fought and won.
> When MIME is used to send unintelligible garbage, it is a battle fought
> and lost.

Technical standards work often gets distracted by trying to deal with issues that are outside the scope of reasonable technical standards work. It should not be the task of such work to dictate socially acceptable behavior or to constrain users to it. That is a social task, not a technical one.

Choosing to send various types of data requires making decisions about the context. No technical standard can be designed to "automatically" determine when it is, or is not, appropriate to send that data, whether it is diacritical marks, kanji, or an Excel spreadsheet. Even when the sender has information about recipient capabilities, social factors affect the choices.

        So sending such data in MIME inappropriately
        is STILL an example of a battle fought and won.

At least the recipient has the unintelligible data well isolated and labeled. MIME did its job.
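A minimal sketch of that labeling, assuming Python's standard `email` package (a tool this message does not mention); the helper name `build_labeled_message` and the sample text are hypothetical. It shows only that the sender declares the character set and the recipient can read that label back, whether or not the recipient can make sense of the text itself.

```python
from email.message import EmailMessage


def build_labeled_message(text: str, charset: str = "utf-8") -> EmailMessage:
    """Wrap text in a MIME text/plain part that declares its charset.

    Hypothetical helper for illustration only.
    """
    msg = EmailMessage()
    # set_content() adds the Content-Type header, including the charset
    # parameter, and picks a transfer encoding suitable for the payload.
    msg.set_content(text, charset=charset)
    return msg


if __name__ == "__main__":
    msg = build_labeled_message("diacritics: é ø ü")
    # The recipient sees the label even if the glyphs cannot be displayed.
    print(msg["Content-Type"])
    print(msg.get_content_charset())
```

Whether it was appropriate to send that text to a given recipient stays a decision for the sender; the label only isolates and identifies the data.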



At 08:19 AM 12/6/00 -0500, vint cerf wrote:
> Even if we introduce extended character sets, it seems vital
> that there be some form of domain name that can be rendered
> (and entered) as simple IA4 characters to assure continued
> interworking at the most basic levels. This suggests that
> there is need for some correspondence between an IA4 Domain
> Name and any extended character set counterpart.


The same task is at issue for the DNS as it was for MIME. We need a mechanism for labeling and encoding DNS strings and, I believe, we need it to be added to the existing DNS.

        Users of those strings will be all over the world,
        not just in a particular locale.

The need for this capability is massive and immediate.

There WILL be a solution deployed.  In fact there already is.

The question is whether a coherent extension to DNS will be done in a fashion which will keep the DNS integrated, or whether this requirement produces an independent DNS.

        That's not flag-waving.

        That's multiple DNS namespaces.

We need to be careful to distinguish two different requirements. One is for a mechanism to encode domain names in non-ASCII character sets. The second is for an equivalence mapping from non-ASCII domain names into ASCII domain names. The former is so that the technical and operational aspects of the DNS remain coherent. The latter is so that everyone has a way to reach a particular domain, even if they cannot generate the non-ASCII form of the name.

The extreme form of the latter task involves ASCII encodings that are "comfortable" for human users; that requirement has not been solved even in non-technical human settings. I believe that the example of alternate choices of "jin" and "gin" as representations for some Chinese character(s) was used. Hence this extreme form of the task is not going to be solved by lowly IETF protocol designers.

At best, use of ACE-like encodings permits an ASCII representation, albeit one that is "uncomfortable". It is as far as the IETF should go in trying to permit a "universally accessible" form for all domain names.

Interestingly, we do not need to have all domain names exchanged and stored in an ACE form, forever. Just as MIME is able to support pure binary encodings, so can the DNS. The ACE form can be mapped to when needed.
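A minimal sketch of mapping to and from an ACE form on demand. It assumes Python's built-in "idna" codec, which implements the Punycode-based IDNA encoding standardized after this message was written; that encoding stands in here for the unspecified "ACE-like encodings" above. The helper names `to_ace` and `from_ace` and the sample name are hypothetical.

```python
def to_ace(name: str) -> str:
    """Map a domain name to an all-ASCII (ACE) form, label by label."""
    # The "idna" codec leaves pure-ASCII labels alone and rewrites
    # non-ASCII labels as "xn--" plus a Punycode string.
    return name.encode("idna").decode("ascii")


def from_ace(name: str) -> str:
    """Map an ACE-form domain name back to its native-script form."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    native = "bücher.example"
    ace = to_ace(native)
    print(ace)            # xn--bcher-kva.example
    print(from_ace(ace))  # bücher.example
```

The ACE string is reachable by anyone who can type ASCII, but it is "uncomfortable" in exactly the sense described above: it is an equivalence mapping, not a human-friendly transliteration.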

d/

=-=-=-=-=
Dave Crocker  <dcrocker(_at_)brandenburg(_dot_)com>
Brandenburg Consulting  <www.brandenburg.com>
Tel: +1.408.246.8253,  Fax: +1.408.273.6464