
Re: [Ltru] RE: STD (was: Last Call: 'Tags for Identifying Languages' to BCP)

2005-08-29 04:46:27
I am sorry to impose on the community again with what is starting to amount to ad hominems.
Brian, please advise if this is inappropriate.

At 04:26 29/08/2005, Peter Constable wrote:
> From: JFC (Jefsey) Morfin [mailto:jefsey(_at_)jefsey(_dot_)com]
> The
> proposed langtag is an arbitrary limited compound of three
> information: language name, script and country. A language
> identification MAY call for far more elements, and deliver much more
> information.

Mr. Morfin has often suggested to the LTRU WG that language tags should
be able to provide greater information than is allowed by the draft. He
has never provided any specific proposal except a request to permit
certain private-use tags, which I will return to below.

Dear Peter,
This kind of repetition no longer fools anyone. I have bored everyone enough explaining that two additional subtags are necessary IMHO: the referent and the context. There is also, one way or another, the need for the date of the reference (this can be a separate date or be included in a subtag).

This is documented at length in a mail of mine today. I will not repeat it. I will only suggest you study Word.

The consensus of
the remainder of the LTRU WG is that the draft supports all relevant
distinctions needed to describe the linguistic and written-form
attributes of content as may be needed for all purposes, commercial and
otherwise.

This is a historic statement that I hope no one will forget.
Every researcher and engineer knows the value of such a final "all".

Just in case: the langtag is not supposed to support only the written-form attributes, but to be multimodal (cf. Peter Constable).
Please cite the subtags for voice, signs, icons, mood, etc.

> This means that:
> - "fr-Latn-fr" is the default tag based upon ISO 639-1/2/3
> - "x-fran" is a private use tag based upon ISO 639-6
> - "0-jefsey.com:franver" is my vision of the French at the Palace of
> Versailles. Documented by an ISO 11179 conformant system (see below)

Two comments: First, Mr. Morfin suggested within the LTRU WG that the
syntax for language tags should be loosened to permit additional
characters, such as "." and ":".

This is a false affirmation. I did two things:

- benefiting from the marvelous capacity to direct the WG-ltru decisions by proposing the opposite of what is needed, I made sure the ABNF would be foolproof (this is not yet exactly the case, as they did not always find the proper [cf. Peter] "constraints").

- I supported the proposal of an African researcher (whom they treated as a troll) to reconcile the desire for a strict ABNF expressed by the WG affinity group with the support of users, R&D and innovation (following ISO evolution) for using the URI-tags RFC, by proposing first to use the "private use" area. As indicated, a remark showed me that this was the wrong choice, since the private use area also addresses other needs.

I then came to the conclusion that using the present Draft as a default, non-exclusive solution, with some reserved numeric "singleton" as the hook for URI-tags, would preserve the work done by the WG while addressing the needs of the rest of the world and avoiding an unnecessary conflict.

The remainder of the WG was in
consensus that this was unacceptable due to backward incompatibility
with processes designed to conform to RFC 3066.
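
As an illustration of the incompatibility at issue, here is a minimal sketch that checks the three example tags quoted above against the RFC 3066 syntax (a primary subtag of 1 to 8 letters, followed by subtags of 1 to 8 letters or digits). The function name is hypothetical, and the check covers well-formedness only, not registration or validity:

```python
import re

# RFC 3066 syntax:
#   Language-Tag   = Primary-subtag *( "-" Subtag )
#   Primary-subtag = 1*8ALPHA
#   Subtag         = 1*8(ALPHA / DIGIT)
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_rfc3066_wellformed(tag: str) -> bool:
    """Return True if the whole tag matches the RFC 3066 syntax."""
    return RFC3066_TAG.fullmatch(tag) is not None


for tag in ("fr-Latn-fr", "x-fran", "0-jefsey.com:franver"):
    print(tag, is_rfc3066_wellformed(tag))
```

Under these rules the first two tags are well-formed, while the third is rejected: its primary subtag is a digit, and it contains "." and ":".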

Secondly, Mr. Morfin has repeatedly made mention of ISO 11179, a series
of ISO standards on metadata and metadata registries, indicating his
view that language tags used on the Internet should be maintained in a
registry conformant with ISO 11179, and therefore that the draft should
make reference to those standards. He has also, on several occasions
such as his comments above, cited ISO 11179 in relation to his views in
a manner that appears to be intended to suggest that his views are
superior to the draft because he has cited that series of standards
while the draft does not.

The Draft addresses targets you defined long ago. It was presented privately (twice) and is now presented as a WG document. Since the document has not changed, one can expect that it keeps the same targets. You consider that it addresses them "all".

There can therefore be no "superior" views. There are different targets. My target is to protect R&D, users, and Internet innovation.

In a nutshell, I do _not_ believe that a draft crafted by a few individuals can support all the relevant distinctions needed to describe the linguistic and written-form attributes of content as may be needed for all purposes, commercial and otherwise. And I want to protect the right of other researchers and cultures to have their own solutions, _without_conflict_ with and without detriment to _your_solution_.

The real solution is IRI-tags, which we will document as soon as the URI-tags RFC is published. But that will create a deployment conflict with your application, due to your sponsors. No one needs that.

A reality check is in need here:

- While Mr. Morfin cites ISO 11179, he has never made statements
  that clearly indicate that he actually understands those standards.

I suggest that everyone who has the time read ISO 11179 and judge.

In a recent mail, Peter acknowledged the need to consider ISO 11179 and explained that ISO 12620 was its equivalent. Maybe this is the difference between an engineering and a literary approach ...

One may note that I have just proposed that the IETF initiate a WG in the ISO 11179 area. The reason is that ISO 11179 has not yet engaged with the networking aspects. The work we carry out on CRCs (common reference centers) gives a vision of interest. I often compare ISO 11179 to X.500, and the work to be done to LDAP. The importance to the Internet architectural development should not be overlooked.

I am currently trying to gather the necessary funding for a French AFNOR budget on the matter.

- While Mr. Morfin refers to "an ISO 11179 conformant system",
  none of the ISO 11179 series of standards contains any statement
  of conformance requirements. Thus, no such notion of "ISO 11179
  conformant" is defined anywhere.

:-) :-)

This is the second Historic statement!
Too bad there is Google ....

http://www.google.fr/url?sa=t&ct=res&cd=6&url=http%3A//www.epa.gov/sor/xml_tag_reg_gundry.pps&ei=Q38SQ-H7EpiERf-NydsK
http://www.schemas-forum.org/registry/desire/activityreports.php3?field=filename&value=JTCI_SC32_D29D35(RDF).rtf

"WG 2 intends to recommend using XML for accessing and interchanging information in 11179 conformant data registries. They expect that specific XML tags and data structure will be algorithmically derived from the normative UML data model specified in 11179 part 3. The Object Management Group (OMG) has already adopted a standard for XMI (XML Model Interchange), which they expect to recommend as one mechanism for such algorithmic derivation of XML representation from UML models. Work is also underway to foster interoperation between ISO/IEC 11179 metadata registries, XML registries, Universal Description Discovery and Integration (UDDI) registries, database catalogs, ontology registries and CASE tool repositories. The SC 32 work is positioned to meet deeper semantic management aspects of data management and interchange. WG 2 has already initiated electronic Working group meetings to progress its program as quickly and efficiently as possible."

http://www.jtc1sc34.org/repository/0346.htm
WG 2 intends to recommend using XML for accessing and interchanging information in 11179 conformant data registries.

http://www.google.fr/url?sa=t&ct=res&cd=24&url=http%3A//www.loria.fr/%7Elandragi/publis/catalog.ps&ei=xoESQ_rcDLWgRa30gNQK
www.ncess.ac.uk/events/conference/programme/presentations/ncess2005_gillam.pdf

etc.

  All that can be said is that a
  system of metadata elements is maintained and administered using
  a certain amount of the conceptual model, practice and
  administrative infrastructure specified in the ISO 11179 standards.
  The draft uses some measure of these, though it does not make
  normative reference to ISO 11179.

This certainly explains the confusion with ISO 12620.

  In terms of ISO 11179 notions, each entry in the proposed registry
  includes the two essential components of a metadata element: a
  representation, and a data element concept. Each item in the
  registry indicates (i) the representation used in language tags,
  (ii) a designator that indicates the value meaning and that can
  also serve as the data identifier, (iii) the object class (its
  "type"), (iv) the administrative status (limited to deprecated or
  not deprecated), as well as other properties.
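
The four properties listed above can be pictured with a minimal sketch. The class, field, and function names are hypothetical and are not taken from the draft or from ISO 11179; the three sample entries are illustrative only:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryEntry:
    representation: str       # (i) the string used in language tags
    designator: str           # (ii) indicates the value meaning; may serve as identifier
    object_class: str         # (iii) the "type" of the entry
    deprecated: bool = False  # (iv) administrative status


# Illustrative entries only, not an excerpt of any actual registry.
ENTRIES = [
    RegistryEntry("fr", "French", "language"),
    RegistryEntry("Latn", "Latin", "script"),
    RegistryEntry("FR", "France", "region"),
]


def lookup(representation: str, object_class: str) -> Optional[RegistryEntry]:
    """Find an entry by type and case-insensitive representation."""
    for entry in ENTRIES:
        if (entry.object_class == object_class
                and entry.representation.lower() == representation.lower()):
            return entry
    return None


print(lookup("fr", "language"))
print(lookup("fr", "region"))
```

The same representation ("fr") resolves to different entries depending on the object class, which is why the type is carried alongside the representation.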

A simplified view, as noted in a previous mail, is C structures, where a name can designate a value or another structure. What is interesting in a network context is that one can add a "scheme" to the structure, such as the URL/IP address of the registry (URI Tags). This means that two people can build the same registry description and links, and yet these are different registry systems.
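
A rough sketch of that idea, assuming nothing beyond what is said above: a structure whose names designate either a value or another structure, wrapped in a registry that carries a "scheme". All names, member keys, and the example.* hosts are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Structure:
    # Each name designates either a plain value or another structure.
    members: Dict[str, Union[str, "Structure"]] = field(default_factory=dict)


@dataclass
class Registry:
    scheme: str      # URL or IP address identifying the registry (hypothetical)
    root: Structure


def build_root() -> Structure:
    """Build one and the same description, as two people might do independently."""
    return Structure({
        "engl": Structure({
            "name": "English",
            "script": Structure({"code": "Latn"}),
        }),
    })


registry_a = Registry("registry-a.example", build_root())
registry_b = Registry("registry-b.example", build_root())

print(registry_a.root == registry_b.root)  # compares description and links only
print(registry_a == registry_b)            # also compares the scheme
```

The two roots compare equal because their contents are identical, while the registries themselves differ through their scheme.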

One simple application (though the approach is general) is to consider a language description registry root (lang root) using ISO 639-6. If, instead of using entity names (for example "engl"), I use the entity ID (for example an IPv6 interface ID), I can associate thousands of namelists with the same base, one for each language. At no cost.

We can also keep the IPv6 ID grid and port it under another IPv6 address for another language. And we can replicate the documentation of the language in various languages. Defaulting to other languages when a piece of information is missing is easy, since we play only on IPv6 addresses with the same interface ID. Flexibility is total, and filtering/equivalence rules can be stored as one of the data files.
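
To make the addressing idea concrete, here is a minimal sketch using Python's standard ipaddress module. It assumes one /64 prefix per namelist language and one interface ID per entity; the prefixes (taken from the 2001:db8::/32 documentation range), the numeric IDs, and the names are all hypothetical:

```python
import ipaddress
from typing import Dict, Optional, Sequence

# Hypothetical /64 prefixes, one per namelist language.
PREFIXES = {
    "en": ipaddress.IPv6Network("2001:db8:0:1::/64"),
    "fr": ipaddress.IPv6Network("2001:db8:0:2::/64"),
}

# Hypothetical interface IDs for two entities.
ENGL = 0x1
FRAN = 0x2


def address_for(lang: str, interface_id: int) -> ipaddress.IPv6Address:
    """Combine a language prefix with an entity's interface ID."""
    return PREFIXES[lang][interface_id]


# One namelist entry per (language prefix, entity ID) pair.
NAMES: Dict[ipaddress.IPv6Address, str] = {
    address_for("en", ENGL): "English",
    address_for("fr", ENGL): "anglais",
    address_for("en", FRAN): "French",
    # No French-language name is recorded for FRAN, so lookups fall back.
}


def name_of(interface_id: int, preferred: Sequence[str]) -> Optional[str]:
    """Return the first available name, trying languages in order."""
    for lang in preferred:
        name = NAMES.get(address_for(lang, interface_id))
        if name is not None:
            return name
    return None


print(name_of(ENGL, ["fr", "en"]))
print(name_of(FRAN, ["fr", "en"]))
```

Because the entity keeps the same interface ID under every prefix, falling back to another language only means changing the prefix part of the address.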

The benefit is that the same item ID can be used as a pointer into a local database [referents] (either loaded from a CD or cached). We can build a local view of a language and related information, according to personal rules. This means that we can have billions of ISO 11179 conformant descriptions based on ISO 639-6 names/IDs [context], and update them dynamically.

The initial benefit of the database is that one can allocate IDs to the subtags and to langtags. But this is not documented and the work is huge (ISO 639-6 is to provide them).

Now, a langtag including the referent and the context will support interintelligibility between people the way they want. Supported by an OPES, people may even communicate in a "language" they do not know. But what is a language?


  Thus, while it cannot formally be said that the draft conforms
  to ISO 11179 (since no terms of conformance are defined), I think
  it *can* reasonably be said that the draft creates a registry and
  system of metadata elements that is consistent with the model
  presented in ISO 11179.

An ISO 12620 understanding. The confusion results from the Warsaw meeting. ISO 639-6 can express some points that are minor (in terms of importance) using terms from ISO 12620. That's all.

- The primary reason that the LTRU WG chose not to reference ISO
  11179 in this draft had nothing to do with whether the WG
  considered ISO 11179 appropriate or valuable in general.

Thank you for confirming that we are not interested in the same area.
Then I can only say "keep clear". Play in your own field, and we will help (I made sure your ABNF is quite foolproof). But leave others to address their own concerns.

Rather,
  it was that it was not deemed that reference to ISO 11179 would
  add significant value in the context of an IETF language subtag
  registry. Taken together, the ISO 11179 standards are long and
  complex, and have not to our knowledge been referenced in any
  other IETF metadata registry

This is why we have to create a WG in that area. But maybe it is premature?

 -- and certainly not in relation
  to RFC 1766 or RFC 3066, which specifications accomplish their
  purposes in spite of that absence of reference.

Thus, when I see Mr. Morfin citing ISO 11179 in the course of arguing
for some view that he holds, I consider that citation to have added
nothing of significance in support of his view.

see above.

> This means that this debate is only to lock a _final_ ABNF via an
> accepted RFC and a loaded operational IANA registry _before_ a simpler
> solution [ISO 639-6] is available three months from now....

This statement makes several assumptions of uncertain validity, not the
least of which is that use of alpha-4 symbols from ISO 639-6 for IETF
language tags would constitute a simpler solution.

You do not need to sell your solution. I explained again and again I support it.

But do not say that it addresses my and other people's needs. It cannot be exclusive and exclude us all.

Given the widespread
existing use of RFC 3066 tags, use of ISO 639-6 would have to go
alongside use of multi-part tags of the form permitted by RFC 3066,
which is certainly not simpler than what is specified in the draft.

Draft-centric assumption. Peter, your Draft is not the center of the world. The user is. Simplicity is not measured by your ABNF. Simplicity is measured by the user and her needs. You think (and it may very well be so) that your solution is simpler for you. I even accept that in some cases it may also be simpler for me. But your solution does not scale.

You have the same langtag capacity to document the English language of Pitcairn and of the USA. You miss half the existing scripts, do not cover fonts, and document nothing about voice and signs.
Your proposal is not able to be multilingual.
You do not know what a language is in your context (and this is not easy) ....
etc. etc.

> >Your statement doesn't contradict anything that Debbie has said,
> >provided the context is ISO 639-6 alone. If we were to talk about
> >incorporation of ISO 639-6 into a revision of RFC 3066, however, then
> >duplication would become an issue for consideration.
>
> This is the WG-ltru Charter that all the ISO codes be included.

The charter makes reference to "the underlying ISO standards"; that is,
to the ISO standards referenced in RFC 3066 or those cited in the
charter to be incorporated into the update RFC. The charter does not
cite ISO 639-6, let alone state that "all the ISO codes be included".

Having considered the old failed Draft instead of the Charter did not help ....
http://ietf.org/html.charters/ltru-charter.html
"It is also expected to provide mechanisms to support the evolution
of the underlying ISO standards, in particular ISO 639-3".

How do you read "evolution"? As far as I am concerned, we want to use, help, benefit from, etc. that evolution and do not want you to block us. "US" being all of us, and in particular my own team.


> Nice to see that ISO 11179 is accepted now. Peter Constable and the
> WG-ltru have opposed the reference to ISO 11179 model. This model
> permits to conceptualise languages and to include in their
> description an unlimited number of additional elements.

This is in no way implied by ISO 11179. The model of that standard
assumes that metadata elements designate concepts within some conceptual
system, and that the system of metadata elements includes a meta-model
that reflects that conceptual system. This would have the effect of
*constraining* the concepts represented to entities within that
conceptual model. Those entities may be an infinite set, but the set of
entities that can be represented by the tags defined by this draft would
not increase in number if the draft were changed to reference ISO 11179.

You seem now to want to tag your langtags with "ISO 11179" (soon we will learn they are "ISO 11179 inside"). Good!

But this means you will have to _change_ your draft because the set of entities that can be represented by the tags defined by this draft will dramatically increase in number and in related information .... Once you have done it, we will probably say the same.

> But ISO 11179 totally open the concept...

Clearly either Mr. Morfin does not understand ISO 11179 or, if he does,
he has totally failed to express a statement consistent with that
understanding.

At this stage the reader has probably formed his/her own opinion.

> I would then advise that the Draft is sent back to the WG-ltru, with
> the suggestion that a lexicon is provided which would define what is
> a "language", a "script", a "country", and the purpose (informative,
> descriptive, normative?) of a langtag. This might be a big step ahead.

Mr. Morfin submitted a request to the WG that these terms be defined.
The consensus of everyone else in the WG was that this was not necessary
since it would not significantly alter the ability of anyone to
implement or use the specification.

May I just quote your response in another mail ....

<quote>

> I agree that the broad question of "what is a language" is out of our
> scope. The more specific question "what is a taggable language
> distinction" is perhaps more germane.

Not an unreasonable suggestion.

</quote>

Cheers....
jfc



_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf