Re: Will Language Wars Balkanize the Web?

2000-12-05 09:10:02
Martin,

I'll send you a copy of the "@sign vs !path" debate from my USENIX papers
archive. See "Pathalias: or The Care and Feeding of Relative Addresses" by
Honeyman and Bellovin, undated, at http://www.uucp.org/papers/pathalias.pdf.

Speculations on the general utility and availability of "single" encoding
schemes or some approximation of limited ambiguity code-set mapping(s)
should not displace actual data. The claim that iso10646 is "good" is not
improved by non-reference to the costs and benefits of ASCII-colliding
encodings (EBCDIC, SJIS, etc.), just as the "interoperability" claim is
not improved by non-reference to the operational deployment of serviceable
encodings.

Ignoring the daft peculiarities of particular encodings (and ANSI C) such
as NULs in strings (or file names), what I learned from owning the i18n
problem at Sun was that a program of code-set independence had time-to-market,
sustaining engineering, and ease of implementation arguments in its favor
over a program of opportunistic code-set dependence (the industry standard
practice), and as a matter of convenience, that the XPG/3 locale model made
a utf8 locale a minor cost item, and an internal convenience mechanism. It
was a compelling case whose hardest technical issue was dynamic character
width determination in the bottom half of the tty subsystem.

I mention this to contrast it with substitution of UTF8 (or any fixed-width
multi-octet encoding scheme) dependence for ASCII dependence, or the common
practice of adding an "alternate code path" which affords run-time
selection of one of two code-set dependent processing mechanisms.
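
To make the contrast concrete, here is a minimal sketch in Python. It is an
illustration only, not code from Sun or from any IETF document, and the
function names are hypothetical. The first function takes the code set as a
parameter, as a locale would supply it; the second is the "alternate code
path" form, which can only ever choose between two hard-wired code sets.

```python
import codecs


def char_count_independent(data: bytes, codeset: str) -> int:
    """Code-set independent: the encoding is a run-time parameter
    (hypothetically taken from the locale), not baked into the logic."""
    return len(codecs.decode(data, codeset))


def char_count_two_path(data: bytes, is_utf8: bool) -> int:
    """'Alternate code path': run-time selection between exactly two
    code-set dependent mechanisms (UTF-8 or an ASCII assumption)."""
    if is_utf8:
        # Count every octet that is not a UTF-8 continuation octet (10xxxxxx).
        return sum(1 for b in data if (b & 0xC0) != 0x80)
    # ASCII assumption: one octet per character.
    return len(data)


text = "日本語"                      # three characters
sjis = text.encode("shift_jis")     # two octets per character here
utf8 = text.encode("utf-8")         # three octets per character here

independent_sjis = char_count_independent(sjis, "shift_jis")
independent_utf8 = char_count_independent(utf8, "utf-8")
two_path_utf8 = char_count_two_path(utf8, True)
two_path_sjis = char_count_two_path(sjis, False)
```

The parameterized form counts three characters for both inputs. The two-path
form is right for UTF-8 and miscounts the SJIS input as six, since neither
of its branches knows that code set.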

From my perspective, the IETF has preferred the second form of solution to
the problem since the appearance of rfc2130. See also the following rfcs:
        0373, 1345, 1468, 1489, 1502, 1555, 1557, 1815, 1842, 1922,
        1947, 2237, and 2319.

As I pointed out to you over lunch Thursday at the W3C AC meeting, the i18n
problem is not simplified by the constraint which requires reference to
iso639, or iso3166. While few ARPAnauts have an evident interest in the
problem of Euro-American Americanist hobbyists getting the fundamentals of
Cherokee wrong in an ISO normative reference (iso10646), or care that there
are three Cherokee polities, on other (ICANN-cluttered) lists Americans
of sundry "liberties" persuasions are quite worked up that Euro-American
Sinology hobbyists do not, or may not, have precedence over Chinese
governmental and cultural institutions on the operational deployment of
Chinese language elements in the DNS (CNNIC vs Verisign).

A related question is whether the i18n problem is simplified by a constraint
which requires reference to the IAB Technical Comment on the Unique DNS Root,
a constraint which adds, without reflection, the constraints of iso3166 to
the dns-i18n problem set. Again, from my perspective, several sets of critics
of the IANA transition(s), and its reluctant proponents, have overloaded the
dns-i18n problem set as either an escape mechanism from uniqueness of the
DNS root, or as a problem which cannot be solved except by preservation of
the same property (uniqueness).

Neither party appears to be motivated by the interests of users of
ASCII-colliding or pre-iso10646 (et alia) encodings, or users without
practicable means to use their preferred writing (or signing) systems.

Assuming a heterogeneity of end-systems, each possibly with a heterogeneous
set of character encoded applications with some cut-buffer mediation
mechanism, e.g., a (encoding-neutral or encoding-preferential) windowing
system for transparent or converted read and write operations between
end-system resident applications, and a DNS resolver library with access
to DNS service, and no additional constraints (these are enough, thanks!),
is UTF-8 _the_ compelling answer?

The attractions of Universalism still appear to be compelling only if some
non-technical, or ancillary, service model is controlling. Unfortunately,
the utility of Particularism is temporarily hijacked anywhere near the DNS
by partisans of one convention or its converse.

If next-hop has a case for forwarding, then it is surprising that the case
can't be applied to forwarding, except for opaque datagrams.

Cheers,
Eric

P.S. I forgot to work in NATs and VPNs. Sigh.