
RE: draft-phillips-langtags-08, process, specifications, "stability", and extensions

2005-01-06 18:26:47
In a nutshell, Ned was elaborating on a comment from Dave Singer that,
once we have parsed a pair of tags and identified all the pieces, it's
not a trivial matter to decide in every case how the two tags compare,
and that there are factors that would exist if the draft were approved
that didn't exist under RFC 3066.

Finally! Thank you! This is exactly what I have been trying to say.

Again, I think this is a question that deserves discussion. In relation
to the proposed draft, I don't see it as a particular problem with the
draft. It is a problem that doesn't exist in RFC 3066, but that is only
because RFC 3066 left us with bigger problems: it doesn't give us any
way to identify pieces that we would be encountering in registered tags
(apart from hard-coded tables compiled from versions of the registry
that pre-exist a given implementation).

With, as you point out below, one important exception: It did have a way to
reliably identify a country code in most cases (but not all). And this ability
to say "2 character subtag in the second position, must be a country code" was
quite useful even though it might miss other occurrences of country codes in
some cases.
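
As a rough illustration of the positional rule described above, here is a minimal Python sketch. The function name rfc3066_country is made up for this example, and the check is deliberately naive: it looks only at the second subtag, which is exactly why it misses country codes that appear later in a tag.

```python
def rfc3066_country(tag):
    """Return the country code found by the RFC 3066 positional rule, or None.

    Assumption for illustration: any two-letter second subtag is treated
    as a country code, and nothing else in the tag is inspected.
    """
    subtags = tag.split("-")
    if len(subtags) >= 2 and len(subtags[1]) == 2 and subtags[1].isalpha():
        return subtags[1].upper()
    return None
```

With this rule "en-US" yields a country code, but a tag such as "sr-Latn-CS" does not, because the two-letter subtag is no longer in the second position.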

3066bis provides a reliable way to locate country codes in all cases, but the
algorithm is different. And this is a non-backwards-compatible change.
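
To make the difference concrete, here is a hedged sketch of how a 3066bis-style processor might locate the country (region) subtag by subtag shape rather than by fixed position. The shapes used (3-letter extended language, 4-letter script, 2-letter or 3-digit region, single-character singleton) are my reading of the draft's grammar and should be checked against the draft itself; the function name is hypothetical, and grandfathered tags are not handled.

```python
def langtag_region(tag):
    """Locate the region subtag by shape, or return None.

    Assumed subtag shapes (to be verified against the draft):
    extended language = 3 letters, script = 4 letters,
    region = 2 letters or 3 digits, singleton = 1 character.
    """
    subtags = tag.split("-")
    if subtags[0].lower() in ("x", "i"):
        return None  # private-use or grandfathered-style tag: no structure assumed
    for sub in subtags[1:]:
        if len(sub) == 1:
            break  # an extension or private-use sequence starts here
        if len(sub) == 3 and sub.isalpha():
            continue  # extended language subtag
        if len(sub) == 4 and sub.isalpha():
            continue  # script subtag
        if len(sub) == 2 and sub.isalpha():
            return sub.upper()  # alphabetic region code
        if len(sub) == 3 and sub.isdigit():
            return sub  # numeric region code
        break  # variant or unknown shape: the region slot has passed
    return None
```

A processor written only to the second-position rule would not find the country code in "sr-Latn-CS", while this one would; that gap is the incompatibility under discussion.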

Of course there's the option Dave Singer has raised: Reverse the positions of
script and country codes in 3066bis. I see two problems with this:

(1) Script codes are in general more important than country codes, and
    therefore really should come first so that simple truncation matches
    work "better". (There are probably exceptions to this assertion lurking
    out there somewhere, but I believe it is mostly true.)

(2) I believe it increases the number of grandfathered codes that won't conform
    to the new format.
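
The truncation matching referred to in point (1) can be sketched as a simple prefix test, in the spirit of the language-range matching in RFC 3066. The function name prefix_match and the example tags are illustrative only; "sr-CS-Latn" is a hypothetical tag in the reversed order, not something either document defines.

```python
def prefix_match(language_range, tag):
    """True if the range equals the tag or is a prefix ending at a subtag boundary."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


# Script before country: a script-level range matches by truncation.
script_first = prefix_match("sr-Latn", "sr-Latn-CS")

# Country before script (hypothetical reversed order): the same range fails.
country_first = prefix_match("sr-Latn", "sr-CS-Latn")
```

Whichever subtag comes first is the one that survives truncation, so the ordering decides whether script-level or country-level ranges match "for free".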

Now, it may be that, after full consideration of all the issues, especially
given that the 3066 algorithm could not locate country codes in all cases, the
right way forward is to make this non-backwards-compatible change, fully
document the change and its consequences (although I will again point out that
assessing the true impact on the installed base is a practical impossibility),
and move on. But as you say, it does deserve discussion.

RFC 3066 permits tags that have all kinds of internal structures. That
is a problem as it will never allow us to derive much useful information
from a tag with any confidence -- only the ISO 639 language category and
in some cases a country category. I predict that in the future we will
be seeing a significant number of tags (whether sanctioned without
registration by a successor to RFC 3066 or as tags registered under RFC
3066) that go beyond the patterns "ll(-CC)" and "lll(-CC)". If we stick
with RFC 3066, we will have no way of writing forward-compatible
processors that will be able to do very useful matching.

A very good point.

What this draft does is impose some order on all the other patterns
within tags that are permitted, and tell us what the different pieces
must be. As a result, we have more named pieces to deal with, and we are
presented with the question that Ned raised: "Now we have more named
pieces than we did before; what do we do with them?" That is a problem
that will need to be addressed. But I don't think it's a reason to
oppose the draft, since opposing the draft (or at least opposing any
revision that introduces a richer internal structure) leaves us in a
situation that must be characterized either as a worse problem or as
turning our backs on increased functionality to meet real user needs.

What would be really nice is to specify a parameterized matching algorithm (or
more precisely, an algorithm family) along the lines of the stringprep family
of string normalization algorithms. I'm unsure, though, whether there's
sufficient time and interest available to do this. Still, it is nice to dream...
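
Purely as a thought experiment, a parameterized matcher might look something like the sketch below, where a "profile" is nothing more than the list of named pieces that must agree. Nothing here comes from the draft or from stringprep: the names parse, make_matcher and FIELDS are invented for the example, and the parser is a simplification that recognizes only language, script and region.

```python
FIELDS = ("language", "script", "region")


def parse(tag):
    """Split a tag into language, script and region (simplified, assumed shapes)."""
    parts = tag.lower().split("-")
    result = {"language": parts[0], "script": None, "region": None}
    for sub in parts[1:]:
        is_script = len(sub) == 4 and sub.isalpha()
        is_region = (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit())
        if is_script and result["script"] is None and result["region"] is None:
            result["script"] = sub
        elif is_region and result["region"] is None:
            result["region"] = sub
        else:
            break  # anything else is ignored in this sketch
    return result


def make_matcher(significant):
    """Build a matcher that compares only the fields named in `significant`.

    A field left unspecified in the range acts as a wildcard.
    """
    def matches(range_tag, tag):
        want, have = parse(range_tag), parse(tag)
        return all(want[f] is None or want[f] == have[f] for f in significant)
    return matches


script_aware = make_matcher(("language", "script"))  # region is ignored
strict = make_matcher(FIELDS)                        # every specified field must agree
```

Here script_aware("sr-Latn-BA", "sr-Latn-CS") succeeds because the profile ignores region, while strict rejects the same pair. Each application would pick the profile that suits it, much as protocols pick a stringprep profile.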

                                Ned


