Harald T. Alvestrand writes:
The syntax of this header is:
Content-language: <2xAlpha>[_2xAlpha] (comment) [ , ... ]
I would prefer '-' rather than '_', mainly because I see future
possibilities for using these language tags in text/enriched
to get around some of the unification headaches of ISO-10646
(no, I'm not suggesting we do it now), and '-' fits in nicer
with the command syntax of text/enriched.
Masataka Ohta writes:
I have heard that there are about 4,000 to 8,000 languages (neglecting
dialects) in the world.
If so, I think your scheme is no good.
I wouldn't go that far.  The 2-letter codes he is using are in the various
ISO standards for such tags, so it is a good start, building on previous
efforts.  It also lets ISO worry about wrangling with future names, and takes
the burden off the IETF's shoulders.
I suggest, though, that the tags not be limited to 2 letters.  Put in some
weasel words to the effect that initially the first tag may have any of
the 2-letter codes set out in ISO 639, and that it is expected that a future
update to ISO 639 will define additional tags which will also be valid
tags for Content-Language purposes.
The country code isn't a problem IMHO since those codes change very
infrequently, but removing the 2-letter limitation "just in case" can't hurt.
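
As a rough illustration only (not part of the proposal), the relaxed
syntax could be parsed along these lines in Python.  The function name
parse_content_language is hypothetical, and the sketch assumes three
things the thread has not settled: the primary tag may be any run of
letters rather than exactly two, either '_' or '-' may separate the
country code, and comments do not contain commas.

```python
import re

# One entry: primary tag, optional country code, optional (comment).
# Assumptions (not from the proposal): the primary tag is any run of
# letters, and either '_' or '-' may introduce the country code.
_ENTRY = re.compile(
    r"""
    \s*
    (?P<lang>[A-Za-z]+)                  # primary language tag
    (?:[_-](?P<country>[A-Za-z]{2}))?    # optional 2-letter country code
    \s*
    (?:\((?P<comment>[^()]*)\))?         # optional comment in parentheses
    \s*
    """,
    re.VERBOSE,
)


def parse_content_language(value):
    """Return a list of (language, country, comment) tuples.

    Missing country codes and comments are returned as None.
    Raises ValueError if an entry does not fit the assumed syntax.
    """
    entries = []
    # Naive split: a comma inside a comment would break this.
    for part in value.split(","):
        m = _ENTRY.fullmatch(part)
        if m is None:
            raise ValueError(f"bad language entry: {part!r}")
        country = m.group("country")
        entries.append((
            m.group("lang").lower(),
            country.upper() if country else None,
            m.group("comment"),
        ))
    return entries
```

Under those assumptions, a header value such as
"en_US (American English), fr-CA, no" yields three entries, and a
longer primary tag is accepted without any change to the parser.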
Overall, I like this proposal: there are just a few sharp edges to file off. :-)
Cheers,
Rhys.