Re: Last Call: language root file system

Dear David,

I am afraid this debate leads nowhere, because you presuppose implemented solutions instead of considering how to implement them, saying "operating systems will start providing ... so that only an OS update ....".

Everyone would be happy if the DNS was supported by the OS (MS said they would support IDNA in Windows should the market show it to be the proper thing to do ....). Louis Pouzin defined the mail concept in 64, Tom Van Vleck developed it in 65. The first spam was in 67 (ask Tom). Since then there have been people saying they have the anti-spam solution...


However, you have a solution: Unicode. Why not? But you need to document it.

At 00:23 28/08/2005, David Hopwood wrote:

JFC (Jefsey) Morfin wrote:

At 18:11 27/08/2005, David Hopwood wrote:
JFC (Jefsey) Morfin wrote:
[...] The DNS root is updated around 60 times a year. It is likely that the langroot is currently similarly updated with new langtags.
No, that isn't likely at all.
Dear David,
your opposition is perfectly admissible. But it should be documented.


For the long-term, sustained rate of updates to the registry to be 60 a year,
there would have to be real-world changes in the status of countries or in
the classification of languages and scripts that occurred at the rate of 60
a year (i.e. every 6 days). And even in times of significant political
upheaval, that is simply implausible.

Please stop removing the responses I already gave. I documented that this rate, for a database meant to add all that is missing, is a very, very low rate. It is not by denying reality that you build credibility for your proposition. You should document the number of yearly changes in the Unicode files: consult http://www.unicode.org/versions/ - there are 20 versions online.

The order of magnitude is the same. I did not note the number of entries in the IANA file during the last months. This is something that I will certainly maintain if the registry stabilises.
Exactly; the registry has not stabilised. It will do, but until it does,
there is little point in arguing statistics on how frequently it is updated.

??? Nobody is arguing, are you? There is a problem which is to be assessed, documented and addressed. Although it is a BCP, the Draft does not document this (this was discussed, not addressed, and decided to be out of scope). The responsibility has been left with the IANA. The IANA is all of us. This is for the IESG to decide, and to bear the responsibility.

I documented that stabilisation means tens of thousands of entries. When (if) such a stabilisation occurs, the problem of size will be still more important.

The langtag resolution will be needed for every HTML, XML, email page being read.
Patent nonsense. In practice the list will be hardcoded into software that needs it, and will be updated when the software is updated.
Then? The langtag resolution is the translation of the langtag into a machine understandable information. It will happen every time a langtag is read, the same as domain name resolution is needed every time a URL is called.
The langtags would already be encoded in a form that can be interpreted
directly by each application.

I do not understand what this may mean. The Draft is about that. What is discussed here is the update of each application.

You were trying to imply that repeatedly downloading this information would impose significant logistical costs:
# Even if the user cache their 12.000 to 600.000 k zip file when they boot,
# or accept an update every week or month, we are in the logic of an
# anti-virus update.

I try to imply nothing. I document that applications (many possible applications) have the need to access data from a big database. We need the simplest, most secure, least costly, most stable, most open to innovation, fastest and most efficient way to give that access. Because I am among those who will pay for it and who will be blocked if it does not work, I feel concerned as I see no credible solution.

I am not interested in your "no"s, but in your/IESG "how"s I could be happy with.

In fact there is unlikely to be any additional cost apart from that of
upgrading software using existing mechanisms.


Like updating mail servers to anti-spam solutions, ISPs to IPv6, IE to IDNA ?

This is perfectly sufficient. After all, font or character encoding support for new scripts and languages (e.g. support for Unicode version updates) has to be handled in the same way.
I am afraid you confuse the process and the update of the necessary information. And you propose in part the solution I propose :-) .
If it is sufficient to upgrade software using existing mechanisms, then there is no problem that is not already solved.


OK. You imply the Unicode solution is your solution?

Languages, scripts, countries, etc. are not domains.
The DNS root tends to be much more stable. What counts is not the number of changes, but their frequency.
- There is no difference between ccTLDs and country codes. We probably can say that there is one change a year. At least.
What happens if the change isn't immediately picked up by all software?
Not much. Only use of that particular country code is affected.

Thank you for the "not much" for the affected country code. Let us say that some time we will have to switch from "uk" or "gb" to "en", with all the changes it would mean. Not much of a problem if only England is affected?

The same, no big deal if the DNS root is not kept updated? The OS could update it in computers every now and then (ICANN sometimes has a four-month delay)? So, let us suppress the root servers: I do not object to this, but I wish to know if this is your proposition?

I proposed to use the DNS to support that information. This was opposed. Do you think your Unicode-like solution is better?

[...]
Now, if there are updates, this means there are needs to use them, now - not in some years time.
And if they do, they will upgrade their software -- which is what they
have to do anyway to actually make use of any new localisations, scripts,
etc.

The problem is not with people upgrading. The problem is with the servers providing this upgrade. If all the current users upgraded their current langroot file once a year, spread over the year so as to avoid any peak, with no error, no DoS, etc., this would represent today 400 K a second.

In reality this would represent at least 4 Meg peaks. It is up to the IESG and to the IANA to say they can take the load and the risk. Size increase, frequency increase and DoS risks probably call for 400 Megs.
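
A minimal sketch of the arithmetic behind the figure above, assuming the numbers given in the PS of this message (one billion users, a file of 15 K today growing to 600 K) and reading "K" as kilobytes, which the message does not state. The function and variable names are illustrative only. With downloads spread perfectly evenly over one year, the average comes out on the order of the 400 K a second quoted; real peaks would be higher.

```python
# Average server output if every user fetches the langroot file once a year.
# Assumptions (from the PS of this message): 1 billion users, file of 15 K
# today and 600 K at some future date; "K" is taken to mean kilobytes.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def average_rate_kb_per_s(users, file_kb, updates_per_year=1):
    """Average KB/s when downloads are spread evenly over the year."""
    return users * file_kb * updates_per_year / SECONDS_PER_YEAR


today = average_rate_kb_per_s(1_000_000_000, 15)
future = average_rate_kb_per_s(1_000_000_000, 600)

print(f"15 K file, yearly update:  {today:,.0f} KB/s on average")
print(f"600 K file, yearly update: {future:,.0f} KB/s on average")
```

Passing a larger updates_per_year (for example 52 for a weekly update) scales the average linearly, which is the point made about frequency increase.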

Again DNS would most probably dramatically distribute the load. The CRCs (Common Reference Centers) I work on are to support this without problem, and add a lot of value. But it seems the opposition we have is here. The authors favor, like you, a Unicode oriented solution.

PS. The problem is: one way or another one billion users, with various systems and appliances, must get reasonably maintained related information which today weighs 15 K and is going to grow to 600 K at some future date,
The subset of the information needed by any particular application will
typically be much less than 600K. If there is a real issue of database size,
operating systems will start providing shared libraries to look up this
information, so that only an OS update is needed (and similarly for the
Unicode data files, which are already significantly more than 600K).

This is like saying that the subset of DNS information a user will typically need is much less than the hundreds of millions of FQDNs.

Do you mean that if there is a problem, there will be a langserver system developed in a rush outside of any standard, so the standardizers are not to bother? And that, in any case, we can copy Unicode?

Then I would be interested in the Unicode solution: in the traffic data and user support of Unicode, in the Unicode distribution system, and in the way people maintain their Unicode applications (I suppose there is a reason why they maintain 20 releases online).

But I am afraid you confuse the usage of Unicode data and of language identification data. This is not a table, this is a network cross negotiation, calling for far more information. The currently discussed Draft on filtering/matching shows the versatility users should obtain...

with a change from every week to every day (IMHO much more as people start mastering and adapting a tool currently not much adapted to cross lingual exchanges). From a single source (in exclusive case) or from hundreds of specialised sources in an open approach. This should not be multiplied by all the languages that will progressively want to support langtags, but will multiply the need by two or three. For example a Ukrainian will want langtags in Ukrainian, in Latin and Cyrillic scripts [...]
You pick one of the very few languages that are written in more than
one script, and use that example to imply that the total number of
language-script combinations used in practice is 2 to 3 times the number
of languages. Please stop exaggerating.

??? This is an odd remark! When you develop a solution, you do not target supporting needs a minima. You support a maxima. And this example is not really a big one. Just think what can be the demand for an international multilingual directory....

The question is to know whether the Unicode solution can be adapted to the IANA, or replace it if needed, and whether it scales or costs too much, without making the Internet dependent on commercial solutions. Will you warrant this?


jfc

