Oh yes, I remember now, 639 country codes for use as language tags.
Well, as promised, I spent several days contemplating the 3000+
volumes in the P section of our McKeldin library. I freely admit
being overwhelmed, but after considerable time in the stacks, and
consultation with several reference librarians, the only plausible
conclusion is that linguists have a great affinity for decorating
the trees of the forest :-). They seem to be so busy deciding which
nodes of a tree to hang the language names on that they don't seem
interested in simply publishing a list of the language names
themselves. No, that would be too simple, and is obviously of no
use to anyone :-0
I wonder how long it will take them to conclude that a tree structure
isn't a good model to describe the way languages have developed?
But enough digression.
Lacking an authoritative list of language names which could be
referenced, we are faced with the choice of rolling our own, or
resorting to some sort of dodge, such as this proposal.
Rolling our own is absurd, but I still don't like the poor fit of
country code ~= language. Still, I have to admit that nothing better
seems available.
A quibble:
The draft makes use of the term "sublanguage" in several places.
That word is unusual; a naive reading of it left me wondering what
you intended the tags to be subordinate to, so I asked a reference
librarian here to check it out. "Sublanguage" isn't present in the OED,
but it is listed in the 1987 Random House Dictionary of the English
Language (the definition was read to me over the phone), where it is
defined as being a sort of mini-dialect with a very narrow and somewhat
jargonistic content. I don't think you intended to use a term with
that narrow a meaning. Perhaps "dialect" would be more appropriate?
A more general complaint:
You seem to be making no distinction between written form and
spoken form (ignoring sign for the moment). These are not synonymous,
and given the existence of technology for translating written->spoken,
we really ought to give some thought to defining methods for tagging
both the written form and the spoken form of text. The present
proposal seems intended to describe written forms but is illustrated with
terms appropriate for spoken ones (i.e., cockney). While there are thousands
of distinct spoken languages in our world, there are only a few hundred
scripts used in writing them, and enumerating those would seem to
be a much simpler task than attempting to deal with all the dialects of
the world, especially since the dialects are a much faster moving target.
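
To make the written/spoken distinction concrete, here is a rough sketch
of a tag that carries the two forms as separate, optional fields. Nothing
in it comes from the draft: the TextTag name, the "written=" and "spoken="
keys, the semicolon separator, and the labels "latin" and "cockney" are
all my own assumptions, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextTag:
    """Hypothetical tag keeping written and spoken forms apart."""
    written: Optional[str] = None  # assumed script label, e.g. "latin"
    spoken: Optional[str] = None   # assumed spoken-form label, e.g. "cockney"

    def render(self) -> str:
        # Emit only the fields that are set, in a fixed order.
        parts = []
        if self.written is not None:
            parts.append(f"written={self.written}")
        if self.spoken is not None:
            parts.append(f"spoken={self.spoken}")
        return "; ".join(parts)


def parse_tag(text: str) -> TextTag:
    # Split on ";" and read key=value pairs; unknown keys are ignored.
    fields = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return TextTag(written=fields.get("written"), spoken=fields.get("spoken"))
```

With something like this, a text could be tagged by script alone
(parse_tag("written=latin")) and the spoken form added only when it is
actually known, which keeps the short, stable list of scripts independent
of the long and shifting list of dialects.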
--
dana s emery <de19(_at_)Umail(_dot_)umd(_dot_)edu>