
RE: Ietf-languages Digest, Vol 24, Issue 5

2004-12-14 06:59:25
-----Original Message-----
From: ietf-languages-bounces(_at_)alvestrand(_dot_)no [mailto:ietf-languages-bounces(_at_)alvestrand(_dot_)no] On Behalf Of Bruce Lilly


I agree with Bruce, that accessibility of ISO 639 and ISO 3166 has not
been the issue. Unfortunately, his comments do not indicate what the
real issues were.

My comments are in response to the "New Last Call" made on
the ietf-announce list.  They are in response to the text which
accompanied that new last call and the text of
draft-phillips-langtags-08.txt dated November 2004.  The
specific claim that accessibility has been a problem was made in
the text accompanying the new last call.

I don't know where the statement accompanying the announcement came from, but 
given the impression you took away from it, I don't think it reflects the 
rationale for the proposed spec as well as it could. If you read section 6 of the 
draft, it clearly indicates that the goals of the revision are to address 
issues of compatibility, stability, validity and extensibility. Nowhere does it 
even mention accessibility.

You singled out that one point to comment on as though it were the main factor. 
Accessibility was not the only reason listed in the announcement, and was not 
the first reason listed. And, as I've pointed out, it was not a reason given in 
the draft itself.


RFC 3066 made reference to ISO 639-1, ISO 639-2 and ISO 3166-1; the
proposed replacement adds ISO 15924. I would count that as four ISO
standards. Up-to-date code tables for all four are readily available.

For the purpose of implementation of validation of language-tags,
the ISO 639 list includes both the 2- and 3-character codes in a
single document.  The claim (again from text accompanying the
new last call) states that there is some difference in the draft
proposal from 3066 in that 3066 (the text alleges) requires
"lists of codes from five separate external standards" -- in fact
two lists suffice for 3066 implementations.

Again, I don't know who wrote the text of the announcement, but again it is 
bringing up an accessibility issue, and it gets both the general intent and a 
specific detail wrong: RFC 3066 did not reference five source standards; it only 
referenced three (which you perceive as two).


I think this is a serious misrepresentation of the intent of the
proposal: the draft nowhere suggests, let alone declares, that the
source ISO standards are irrelevant.

A poor choice of words on my part. The text and draft suggest
that only the proposed new registry should be consulted, and
the draft clearly specifies that the description of all subtags is
to be provided in English (only).

It is certainly the case that only the registry should be consulted to determine 
which sub-tags are valid and what they denote; that was the intent.

 
Rather, the intent of the
comprehensive registry is to ensure stability...

It's not clear to me that the proposal will provide protection
against the whims of politicians.  If the definition of "CS" as
a country code changes again under the proposed scheme,
how is one to determine specifically what some archived
language-tag referred to at some point in time?

By looking in the sub-tag registry. If ISO changed the meaning of "US" to 
something other than what it is now, its meaning for purposes of use in an IETF 
language tag would not change, because it would remain stable in the sub-tag 
registry. You would be fairly well protected against the whim of politicians.



and as Bruce quite clearly pointed out, those
source standards are readily accessible. So the suggestion that
implementers will no longer have access to French-language names from
the source ISO standards simply is vacuous.

But if the proposed new registry's description of "CS" says
"foo" and the ISO standard code list says "bar", what's
an implementor supposed to present to a user as *the*
description associated with "CS"?

The *meaning* of the sub-tag is determined by the sub-tag registry. If you want 
human-readable descriptors, you already have to look beyond the ISO standards 
for anything more than English and French; it would not be new that you have to 
look beyond the registry itself to decide what human-readable descriptors you 
should provide in a product.



As for concerns of Anglo-centricity, I'm sure that the authors had no
anti-French motive, and would be open to suggestions as to how that
could be addressed.

One possibility would be two description fields.  But the
registry would need a charset closer to ISO-8859-1 than
to ANSI X3.4 as currently specified.  Or an encoding
scheme.

Personally, I don't see the value in something like that. Given the intent to 
have a registry that can be machine-readable, changing its charset from ANSI 
X3.4 in order to gain descriptors in just one more language is not worth it 
IMO. 

Speaking at least for Microsoft, we're interested in having descriptors in far 
more than two languages, and we certainly would not blindly base the 
descriptors we present to our customers solely on what a registry provides, no 
matter what its charset.



The ABNF in the draft permits all of the following tags which
are not legal per the RFC 3066 ABNF:
   supercalifragilisticexpialidoceus
   y-----
   x1234567890abc
   a123-xyz

In fact, none of these is permitted by the ABNF of the draft.

ABNF from the draft...

That means that the "grandfathered"
production (which is an alternative in the Language-Tag
production) will match any of the following text tags (comments
to the right separated by a semicolon):
   x  ; ALPHA followed by zero repetitions
   xa ; ALPHA followed by one ALPHA (see alphanum)
   x- ; ALPHA followed by one HYPHEN
   supercalifragilisticexpialidoceus ; ALPHA followed by many ALPHAs
       (see alphanum) (example previously given)
   x1234567890abc ; ALPHA followed by 13 alphanums
       (as previously given)
   a123-xyz ; ALPHA followed by three DIGITs (see alphanum)
       followed by one HYPHEN followed by three ALPHAs
       (example previously given)
   y----- ; ALPHA followed by five HYPHENs (example previously
       given)

I say the ABNF from draft -08 (quoted above) allows those;
you say no.
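
The disagreement above can be checked mechanically. The following is a minimal sketch, not text from the draft: the RFC 3066 pattern follows that RFC's grammar (a primary subtag of 1 to 8 letters, then subtags of 1 to 8 letters or digits), while the loose pattern is only an assumption inferred from the comments in the list above (one ALPHA followed by any number of alphanums or hyphens). The names RFC3066_TAG, LOOSE_GRANDFATHERED and the two helper functions are hypothetical.

```python
import re

# RFC 3066 grammar: a primary subtag of 1-8 letters, followed by zero or
# more "-"-separated subtags of 1-8 letters or digits.
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")

# ASSUMPTION: shape of the draft's 'grandfathered' production as described
# in the comments above (ALPHA, then any mix of alphanums and hyphens).
# This is inferred from the discussion, not copied from the draft.
LOOSE_GRANDFATHERED = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def matches_rfc3066(tag: str) -> bool:
    """True if the whole tag fits the RFC 3066 grammar."""
    return RFC3066_TAG.fullmatch(tag) is not None


def matches_loose_grandfathered(tag: str) -> bool:
    """True if the whole tag fits the assumed loose production."""
    return LOOSE_GRANDFATHERED.fullmatch(tag) is not None


if __name__ == "__main__":
    examples = [
        "supercalifragilisticexpialidoceus",
        "y-----",
        "x1234567890abc",
        "a123-xyz",
    ]
    for tag in examples:
        print(tag, matches_rfc3066(tag), matches_loose_grandfathered(tag))
```

Under these assumptions each of the four disputed tags fails the RFC 3066 pattern and passes the loose one, which is the purely syntactic point being made; the sketch says nothing about constraints stated elsewhere in the draft.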

My mistake; I was thinking beyond the ABNF alone to other constraints imposed 
by the proposed spec.

As you know, the 'grandfathered' production is loose in the ABNF given in the 
draft, but is very tightly constrained elsewhere in the draft: it is limited to 
only items registered under RFC 1766 or RFC 3066 up to the date of acceptance 
of this proposed spec. (In fact, only a subset of those, all explicitly 
identified in the sub-tag registry.) On the date of acceptance, you will be 
able to know precisely what the valid tags that fit under the 'grandfathered' 
production are and will forever be, and it is 100% guaranteed that none of them 
will have any of the forms that seem to concern you.
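
To illustrate the distinction between matching the ABNF and being valid, here is a minimal sketch in which a tag counts as a valid grandfathered tag only if it both fits the loose syntax and appears in a fixed list. The loose pattern is an assumption based on the discussion above, and the set below is a small illustrative subset of tags registered under RFC 1766 or RFC 3066, not the actual contents of the proposed registry; all names are hypothetical.

```python
import re

# ASSUMPTION: loose syntactic shape of the 'grandfathered' production as
# described in this thread (ALPHA, then any mix of alphanums and hyphens).
LOOSE_GRANDFATHERED = re.compile(r"[A-Za-z][A-Za-z0-9-]*")

# Illustrative subset only: a few tags registered under RFC 1766 / RFC 3066.
# The real list would be whatever the sub-tag registry explicitly
# identifies on the date of acceptance.
REGISTERED_GRANDFATHERED = frozenset(
    {"art-lojban", "en-gb-oed", "i-default", "i-klingon", "zh-min-nan"}
)


def is_valid_grandfathered(tag: str) -> bool:
    """Syntax check plus membership in the closed, frozen list."""
    if LOOSE_GRANDFATHERED.fullmatch(tag) is None:
        return False
    # Language tags are case-insensitive, so compare in lower case.
    return tag.lower() in REGISTERED_GRANDFATHERED


if __name__ == "__main__":
    for tag in ["i-klingon", "en-GB-oed", "y-----", "x1234567890abc"]:
        print(tag, is_valid_grandfathered(tag))
```

Because the list is closed, the oddly shaped tags pass the syntax check but fail the membership check, which is the sense in which the loose production is constrained elsewhere.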



Peter Constable
Microsoft Corporation

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf

