ietf-822

Re: Suggest promoting Content-Language to Proposed Standard

1994-05-18 08:05:59

1. Suggestions
--------------

Two suggestions for clarification of the text in
draft-alvestrand-language-tag-00.txt:

    In the language tag:


    -    All 2-letter codes are interpreted according to ISO 639.

To aid developers, it should be pointed out that not all
2-letter codes that can be "interpreted according to ISO 639"
are found in the ISO 639 standard itself. ISO 639 provides a
registration mechanism for 2-letter codes, which has actually
been used. More on this in section 3 below.

    -    All 3-letter codes are reserved for the (hopefully)
         forthcoming revision to ISO 639

Here the words "revision to ISO 639" should be made more
precise, since two different projects to revise ISO 639 exist.
Write for example: "ISO standard for 3-letter codes".
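
To illustrate why the distinction matters to a developer, here is
a minimal Python sketch of how an implementation might classify
the first part of a language tag. It is only an illustration, not
text proposed for the draft. The function name, the category
labels, the hyphen as separator and the caller-supplied set of
codes printed in ISO 639 are all assumptions of mine; the set of
registered codes is taken from the 1989 newsletter transcribed in
section 3.

```python
import re

# Codes announced by the ISO 639 Registration Authority in 1989
# (new and changed symbols; see the newsletter in section 3).
REGISTERED_1989 = {"ug", "iu", "za", "he", "yi", "id"}


def classify_primary_tag(tag, iso639_codes):
    """Classify the first part of a language tag (hypothetical helper).

    iso639_codes is the set of 2-letter codes printed in the standard,
    supplied by the caller; it is not reproduced here.
    """
    # Assumption: parts of a tag are separated by a hyphen.
    primary = tag.split("-", 1)[0].lower()
    if not re.fullmatch(r"[a-z]+", primary):
        return "invalid"
    if len(primary) == 2:
        if primary in iso639_codes:
            return "iso639"
        if primary in REGISTERED_1989:
            return "iso639-registered"
        return "unknown-2-letter"
    if len(primary) == 3:
        # Reserved for a future ISO standard for 3-letter codes.
        return "reserved"
    return "other"
```

A lookup that consults only the printed standard would put "ug"
in the "unknown-2-letter" category, which is the pitfall the
suggested clarification is meant to prevent.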


2. Two ISO projects for revision of ISO 639
-------------------------------------------

One of the projects is the attempt to design a system of
3-letter language codes to be published as a second part of ISO
639. This project produced CD 639-2 which was rejected in the
national body vote last year. Originally it was a joint project
involving ISO TC37 (terminologists) and ISO TC46 (librarians),
but TC46 withdrew its support last spring. Since then nothing
seems to have happened. (See also my message to this list dated
"Thu, 11 Nov 93 22:11:32 +0100", with Message-Id:
<9311112111(_dot_)AA27992(_at_)mercutio(_dot_)admin(_dot_)kth(_dot_)se>.)

The other project is about revising the present (part 1 of)
ISO 639 by adding the 2-letter codes that have been registered
after the standard was published, and correcting some language
names. This project hasn't produced a Committee Draft yet, so it
will probably take one or two years before the new edition of
the 2-letter code standard is published.

JTC1, the joint technical committee of ISO and IEC which is
responsible for information technology standardization
(information coding, programming languages, OSI etc.) is _not_
involved in these activities by ISO TC37 and TC46. However, the
question of language coding of text parts has been raised
within JTC1/SC2 (character sets and information coding) by the
Swedish member body SIS-ITS, and also by J B Paterson from UK.
Two proposals about how to encode language information at any
point in a stream of plain text have been put forward, both based
on ISO 639 but using different facilities of the standard
ISO 6429 (Control functions for coded character sets).


3. New 2-letter language codes not listed in ISO 639
----------------------------------------------------

According to clause 4.2 of ISO 639 a Registration Authority can
allocate additional language codes. This authority is

   International Information Centre for Terminology (Infoterm)
   P.O. Box 130
   A-1021 Wien
   Austria
   Phone: +43 1  26 75 35 Ext. 312
   Fax:   +43 1 216 32 72

I spoke to Ms. Eva Machlek at Infoterm yesterday and she sent me
by fax an announcement of changes to the registry of language
codes made in 1989. Since then no further registrations or other
changes have been made. Here is my transcript of this
announcement. The <xx>-notation used is explained after it.


_______________________________________________________________________________

I S O   Registration Authority        I n f o t e r m
        for ISO 639 "Code for the     Affiliated to ON (Austrian Standards
        representation of names of    Institute)
        languages" (ISO 639/RA)       Heinerstra<ss>e 38 <.> Wien 2 <.> Austria
                                      -----------------------------------------
                                      International Information Centre for
                                      Terminology

Postal address: <O">sterreichisches Normungsinstitut (<O">N)
Infoterm <.> Postfach 130  <.> A-1021 Wien <.> (Austria)
-------------------------------------------------------------------------------

             ISO 639/RA   N E W S L E T T E R            No 1/1989

Upon the recommendation of the Advisory Committee (ISO 639/RA-AC) and after
consulting the national standards organizations or other appropriate
institutions according to the "Guidelines for the Registration of Languages
and their Symbols", changes (additions, deletions, revisions) have been made
as below. Please note that a vacated symbol will not be reinstated for a
five-year period.

(English                (original name     (instead of) (symbol)  (instead of)
 language name)          of language)

_New symbols included in ISO 639_

Uigur                   Uygurqe                            ug
Eskimo
   Inuktitut CA         Inuktitut                          iu
Zhuang
   (superseded: Chuang) Saw Cueng                          za

_Symbols changed in ISO 639_

Hebrew                  <`>I<b_>rit        (Iwrith)        he        (iw)
Yiddish                 Yidi<sv>           (Jiddisch)      yi        (ji)
Indonesian              (Bahasa) Indonesia                 id        (in)

_Changes of spelling of original language names in ISO 639_

(English                (original name     (instead of) (symbol)
 language name)          of language)

Abkhazian               Apswa              (Abkhazian)     ab (unchanged)
Arabic                  <`>Arab<i->        (<`><A`>rabi)
Kashmiri                Ka<c,>miri         (Kashmiri)
Azerbaijani          Az<e.>rbaj<gv>an<gv>a (Az<e.>rbajganga)
Corsican                Corsu              (Cors<u->)
Sanskrit                Sa<m.>sk<r,>ta     (Sanskrit)
Greek                   Ell<e->nik<a'>     (Ellinika)
Croatian                Hrvatski (jezik)   (Hrvatski)
Breton                  Brezoneg           (Brez)
Serbian                 Srpski (jezik)     (Srpski)
Serbo-Croatian      Srpskohrvatski (jezik) (Srpskohrvatski)
Slovak                Slovensk<y'> (jazyk) (Slovensk<y'>)
Slovenian               Slovenski (jezik)  (Slovensci)
_______________________________________________________________________________


Note: Under the heading "Symbols changed in ISO 639", two
changes of original language name are also indicated.
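
Software that already stores the old symbols may want to map them
to the new ones. The following Python sketch shows one possible
way; the three pairs come from the "Symbols changed in ISO 639"
part of the newsletter, while the names SUPERSEDED and
current_symbol are my own and purely illustrative.

```python
# Old symbol -> new symbol, per ISO 639/RA Newsletter No 1/1989.
SUPERSEDED = {
    "iw": "he",  # Hebrew
    "ji": "yi",  # Yiddish
    "in": "id",  # Indonesian
}


def current_symbol(code):
    """Return the current symbol for a 2-letter code (hypothetical helper).

    Codes that were not changed are returned unchanged, in lower case.
    """
    code = code.lower()
    return SUPERSEDED.get(code, code)
```

For example, current_symbol("iw") gives "he", and
current_symbol("sv") gives "sv". Since the newsletter says that a
vacated symbol will not be reinstated for five years, such a
mapping should be unambiguous during that period.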

Explanation of the special notation used:

Symbol  UCS repr  ISO character name
------  --------  ------------------

<ss>    00DF      Latin small letter sharp s
<.>     00B7      Middle dot
<O">    00D6      Latin capital letter O with diaeresis
<`>     02BF      Modifier letter left half ring
<b_>    1E07      Latin small letter b with line below
<sv>    0161      Latin small letter s with caron
<i->    012B      Latin small letter i with macron
<A`>    00C0      Latin capital letter A with grave
<c,>    00E7      Latin small letter c with cedilla
<e.>    0117      Latin small letter e with dot above
<gv>    01E7      Latin small letter g with caron
<u->    016B      Latin small letter u with macron
<m.>    1E41      Latin small letter m with dot above
<r,>    0157      Latin small letter r with cedilla
<e->    0113      Latin small letter e with macron
<a'>    00E1      Latin small letter a with acute
<y'>    00FD      Latin small letter y with acute
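
For readers who want to recover the real characters from the
transcript, the table above can be applied mechanically. Here is
a small Python sketch, assuming an environment that can represent
UCS characters; the names NOTATION and decode are mine, and
symbols not listed in the table are left untouched.

```python
import re

# <xx> symbol (without the angle brackets) -> UCS code position,
# copied from the table above.
NOTATION = {
    "ss": 0x00DF, ".": 0x00B7, 'O"': 0x00D6, "`": 0x02BF,
    "b_": 0x1E07, "sv": 0x0161, "i-": 0x012B, "A`": 0x00C0,
    "c,": 0x00E7, "e.": 0x0117, "gv": 0x01E7, "u-": 0x016B,
    "m.": 0x1E41, "r,": 0x0157, "e-": 0x0113, "a'": 0x00E1,
    "y'": 0x00FD,
}

# A symbol is one or two characters enclosed in angle brackets.
_SYMBOL = re.compile(r"<([^<>]{1,2})>")


def decode(text):
    """Replace each known <xx> symbol in text by its UCS character."""
    def replace(match):
        code = NOTATION.get(match.group(1))
        # Leave unknown symbols exactly as they were written.
        return chr(code) if code is not None else match.group(0)
    return _SYMBOL.sub(replace, text)
```

For example, decode("Heinerstra<ss>e") yields the street name
with a real sharp s, and decode("Yidi<sv>") yields the name ending
in s with caron.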


--
Olle Jarnefors, Royal Institute of Technology, Stockholm 
<ojarnef(_at_)admin(_dot_)kth(_dot_)se>