ietf-822

Re: Language header already defined

1993-03-05 05:58:58
On ISO 639 inadequacy:

It fails to list two of the three languages spoken in Norway: Sami and
"nynorsk". While "nynorsk" may be considered a dialect, it is impossible
to identify this by the 639 scheme, which was worried only about the same
language being spoken differently in different countries. (Can you say
"us-centric"? ;-)
Sami is, however, a separate language that is not a majority language of
any country.

Cherokee was mentioned as another unlisted language. Gaelic, Rhaetian
and Sorbian are also unlisted, but used in various parts of Europe.

I think that the current ISO 639 covers perhaps 95-99 % of the world's
*speakers*, in the sense that it can give a rough indication of one of
the languages understood by each speaker. Saying that it covers 10 % of
the world's *languages* (where a language is defined, more or less, by
the existence of a person who would feel insulted if his language were
called a dialect of another language) may be optimistic.

My copy of ISO 639 has 136 codes.
Other references (see end of text) have 5000 to 6000 distinct languages.
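
As a rough check on the figures in this message, here is a minimal sketch
of the arithmetic. The function name is made up for illustration; the
counts are the ones quoted here (136 codes, roughly 5,000 and 6,170
languages), not independent data.

```python
# Rough share of the world's languages that ISO 639 can name,
# using only the figures quoted in this message.
ISO_639_CODES = 136  # number of codes in the cited copy of ISO 639

# Language counts as cited from the two reference works.
estimates = {
    "Ruhlen 1991 (approx.)": 5000,
    "Ethnologue 1988": 6170,
}


def coverage_percent(codes, languages):
    """Return the percentage of `languages` that `codes` could label."""
    return 100.0 * codes / languages


for source, total in estimates.items():
    print(f"{source}: {coverage_percent(ISO_639_CODES, total):.1f} %")
# Ruhlen 1991 (approx.): 2.7 %
# Ethnologue 1988: 2.2 %
```

On either estimate the share comes out below 3 %, so 10 % does look
optimistic.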

A quote from mail by Glenn Adams <glenn(_at_)metis(_dot_)com> to the
ISO10646 list illustrates some other problems:

---- start quote

(1) Macedonian was questioned as a language in Greece.  My sources list it as
    a distinct South Slavic language with nearly 200,000 speakers in Greece
    in a 1986 census.

(2) Moldavian is not a distinct language, but a regional dialect of Rumanian.
    It should not be in ISO 639.

(3) The language listed as "Rhaeto" could more properly be called
    Rhaeto-Romance (or perhaps Rheto-Romance, Romansh, Romansch, Romanche,
    Rumantsch, Rhaetian).  ISO 639 should be corrected.

(4) Azerbaijani has recently reverted to use the Latin script, whereas it
    was previously largely written with the Cyrillic script.

(5) Sudanese should instead be "Sundanese."  I assumed the former to be
    Sudanese Arabic, and assigned it to the Sudan, with Arabic script.
    Sundanese, on the other hand, should be listed in Indonesia, under the
    Latin script.  ISO 639 should be corrected in its language name.

(6) Yiddish is still spoken in the former East Germany; however, by only a
    small number of speakers.  Consequently the countries cited would better
    read USA (1,593,993), SNC (220,000), Israel (215,000), Canada (49,890),
    to order by numbers of speakers (taken from recent census data).

The information I used in gathering this data came from a number of sources,
the principal ones being the "Ethnologue: Languages of the World," Summer
Institute of Linguistics, 1988; "A Guide to the World's Languages: Volume 1,
Classification," by Merritt Ruhlen, 1991; and "The Cambridge Encyclopedia
of Language," by David Crystal, 1987.  The first of these cites 6,170
languages, the second cites approximately 5,000 languages.
---- end quote

But by all means, a Language: header (with room for expansion!!!!!) that
uses ISO 639 as the base standard is MUCH, MUCH better than nothing at all!

(multilingual mail - another application for multipart/alternative....)
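
To make the multipart/alternative idea concrete, here is a minimal sketch
using Python's standard email package. It is an illustration under
assumptions, not a specification: the header name "Language" is simply
the one under discussion in this thread (not a registered header), the
helper name is hypothetical, and "en" and "no" are the ISO 639 codes for
English and Norwegian.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_multilingual(parts):
    """Build a multipart/alternative message from (code, text) pairs.

    Each alternative carries the same content in a different language,
    labelled with a two-letter ISO 639 code.
    """
    msg = MIMEMultipart("alternative")
    for code, text in parts:
        part = MIMEText(text, "plain", "utf-8")
        # Assumption: "Language" is the header name proposed in this
        # thread; it is not defined by any standard cited here.
        part["Language"] = code
        msg.attach(part)
    return msg


msg = build_multilingual([("en", "Regards"), ("no", "Hilsen")])
print(msg.get_content_type())
print([part["Language"] for part in msg.get_payload()])
```

A reader's mail program could then pick the alternative whose label
matches the reader's preferred language, which is why room for codes
beyond the current ISO 639 list matters.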

                   Regards,

             Harald Tveit Alvestrand