Bruce Lilly wrote:
3. it illustrates the horribly baroque Unicode 3.x encoding of
characters making up ISO 639 language tags. One of the objections
voiced against RFC 2047 has been that it uses some additional octets
compared to raw 8-bit data -- Unicode 3.x language tags are much
worse as (via utf-8) they transform the 7-bit characters which
comprise ISO 639 language tags into long sequences of octets.
4. If there is a proposal to use untagged utf-8 instead of properly
tagged (RFC 2047 / 2231) charsets (including but not limited to
utf-8), then the entirety of the repercussions of such a proposal
ought to be considered, and language tagging has long been part
of MIME, but is a recent incompatible (among Unicode versions)
addition to Unicode, and one which is also incompatible (per
Unicode standards) with MIME.
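
To put a number on the octet expansion described in item 3, here is a
rough sketch in Python. It assumes the Plane 14 scheme as I understand
it: U+E0001 LANGUAGE TAG, followed by tag characters that mirror
printable ASCII at an offset of 0xE0000. The helper name
plane14_language_tag is made up for this illustration and is not taken
from any library.

```python
def plane14_language_tag(tag: str) -> str:
    """Spell an ASCII language tag with Plane 14 tag characters.

    Hypothetical helper for illustration only. Assumes the tag is
    printable ASCII (0x20-0x7E), which maps onto U+E0020-U+E007E.
    """
    if not all(0x20 <= ord(c) <= 0x7E for c in tag):
        raise ValueError("language tag must be printable ASCII")
    # U+E0001 LANGUAGE TAG introduces the tag; each ASCII character is
    # then shifted up by 0xE0000 into the tag-character block.
    return "\U000E0001" + "".join(chr(0xE0000 + ord(c)) for c in tag)


plain = "en".encode("ascii")
tagged = plane14_language_tag("en").encode("utf-8")

# Every Plane 14 code point needs four octets in utf-8, so the
# two-octet tag "en" becomes twelve octets.
print(len(plain), len(tagged))
```

Under those assumptions the two-character tag "en" grows from 2 octets
to 12, since the introducer and each tag character take four octets
apiece in utf-8.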
At <http://www.unicode.org/review/>, the Unicode Consortium has posted
for public review a proposal to "Deprecate the Plane 14 Language Tags".
- dan
--
Dan Kohn <mailto:dan(_at_)dankohn(_dot_)com>
<http://www.dankohn.com/> <tel:+1-650-327-2600>