
Re: draft-phillips-langtags-08, process, specifications, "stability", and extensions

2005-01-06 08:11:22
Rather, the rule is simply that a country code, if present,
always appears as a two-letter second subtag. The new draft changes this
rule, so applications that pay attention to country codes in language tags
have to change, and the new algorithm for finding the country code is trickier.

Your text above says (a) "if there is a country code in the tag, it is the
second subtag". That is not what the text of RFC 3066 actually says, which is:

The following rules apply to the second subtag:
All 2-letter subtags are interpreted as ISO 3166 alpha-2 country...

That is, it says (b) "if a second subtag has 2 letters, then it is an ISO
3166 code", which is not the same as (a). (It is almost, but not quite, the
converse.)

Fine, whatever.

The current RFC certainly does not forbid the use of country
codes in other positions in language tags. One could absolutely register
en-Latin-US, for example, meaning English as spoken in the US written in
Latin script.

Sure, but my point was, is, and always has been that any 3066-compliant
implementation won't see this as a country code (unless it is table driven,
which brings up its own set of issues).
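
To make the point concrete, here is a minimal Python sketch of how an
implementation that is not table driven might find the country code under the
RFC 3066 second-subtag rule quoted above. The function name rfc3066_country is
hypothetical, and the code only illustrates the rule; it is not taken from any
real implementation.

```python
def rfc3066_country(tag):
    """Return the country code found by looking only at the second subtag.

    Hypothetical helper: it applies the quoted RFC 3066 rule that a
    2-letter second subtag is an ISO 3166 alpha-2 country code.
    """
    subtags = tag.split("-")
    if len(subtags) >= 2 and len(subtags[1]) == 2 and subtags[1].isalpha():
        return subtags[1].upper()
    # No 2-letter second subtag: this logic sees no country code at all.
    return None


print(rfc3066_country("en-US"))
print(rfc3066_country("en-Latin-US"))
```

With this logic "en-US" yields "US", while "en-Latin-US" yields nothing,
because the second subtag has five letters and later subtags are never
inspected.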

There has been a lot of noise on this issue, and too few concrete examples.

No, what there has been is a lot of discussion of a real problem with no
apparent recognition of it as such by the draft authors. Your pejorative
characterization of this as "noise" does not make it so.

In the so-called 3066bis draft, we have striven very hard to ensure that:

(c) Every single tag that could be generated under RFC 3066bis is a tag that
could have been registered under RFC 3066.

True but irrelevant.

Thus if someone wrote a parser that is future-compatible -- that could parse
all RFC 3066 language tags including those registered after the parser was
deployed -- then that parser can handle all 3066bis language tags. This is a
significant advance over RFC 3066, whose registered (not generated) language
tags are atomic, and cannot be effectively parsed at all. 3066bis adds more
structure so as to allow effective parsing of tags.

If you *can* come up with tags that would show that (c) is invalid, that
would be a concrete case that we would have to make adjustments in the draft
for.

(c) is frankly not an issue I care one whit about. (Perhaps I should, but I
don't.) I don't register tags. I write code that processes, and more to the
point matches, tags. That's why I have issues with this draft.
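
As an illustration of the kind of matching code in question, here is a minimal
Python sketch of prefix matching as I read the language-range rule in RFC 3066:
a range matches a tag if it equals the tag, or equals a prefix of the tag that
is followed by "-". The name range_matches is hypothetical, and "en-Latn-US" is
assumed here as an example of a tag with a script subtag inserted before the
country code.

```python
def range_matches(language_range, tag):
    """Prefix match of a language range against a tag (case-insensitive).

    Hypothetical helper sketching the RFC 3066 language-range rule;
    "*" is the special range that matches any tag.
    """
    r = language_range.lower()
    t = tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


print(range_matches("en", "en-US"))
print(range_matches("en-US", "en-Latn-US"))
```

Under this rule the range "en-US" does not match "en-Latn-US", even though
both name English as used in the US.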

Moreover, all the talk about this being *too* complex is far overblown.

Again, your pejorative dismissal of other people's concerns does not
mean your position is valid.

All
3066bis language tags can be parsed, including all the grandfathered codes,
with a very short piece of code, or even with a regular expression (such as
in Perl).

Of course you can write a short piece of code to parse this stuff. It's what you
do with it after you parse it that's a problem.
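
For reference, a minimal Python sketch of such a short parser follows. It
assumes a deliberately simplified tag shape of language, optional 4-letter
script, and optional 2-letter country code; it is not the draft's actual
grammar, and the names SIMPLE_TAG and split_tag are hypothetical.

```python
import re

# Simplified shape: language[-script][-country]. This is an assumption
# made for illustration, not the full syntax defined in the draft.
SIMPLE_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<country>[A-Za-z]{2}))?$"
)


def split_tag(tag):
    """Split a tag into named parts, or return None if it does not fit."""
    m = SIMPLE_TAG.match(tag)
    return m.groupdict() if m else None


print(split_tag("en-US"))
print(split_tag("en-Latn-US"))
```

Splitting the tag into parts is the easy step; nothing in this code says how
the parts should then be compared or matched.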

This is not rocket science.

Parsing almost never is. But simply parsing these tags is not, and never has
been, the issue.

                                Ned

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf