ietf

IANA XML registries

2006-09-20 18:46:49
At 20:38 20/09/2006, David Conrad wrote:
The implication of this is that the entire registry would need to be
fetched, right?  Given I'm told the registry under discussion is a
bit on the large-ish size, this might be somewhat problematic.  Can
you give me an idea of the anticipated scale of the number of
applications that will be wanting to fetch this registry?

David,
let us think in terms of the network for once (the IETF is about networking). You currently have 500 million users. I think a conservative figure for the ten years to come that we are discussing is a Multilingual Internet of 2 billion CPUs. These CPUs must be able to check on a regular basis whether they are up to date. Best practice would be at connect time and once every day.

The IETF registry adds complexity (extlangs, comments, changes) on top of the ISO 639 series. The ISO 639-3 table of 7,500 items will not be printed because it will be constantly maintained (Peter, can you please comment?). The same will hold (and probably more so) for the ISO 639-6 tables, which may range between 15,000 and 30,000 items (Debbie, can you please comment?). Peter commented that one or several updates a week is most probable.

This means an updating scheme comparable to a DNS root system with 40,000 TLDs. You probably have the figures for rs.internic.org to compare with, plus access to the RSSAC root servers. Except that the root file is 65 K and we are talking about a file probably 100 to 10,000 times larger. Also, the root file is supported by the DNS protocol (the root servers are only occasionally called upon by users: in the case of a TLD typo, or when their ISP nameserver does not yet have, or no longer has, information on the requested TLD).
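As a back-of-envelope check of these figures, here is a minimal sketch in Python. It assumes the worst case where every one of the 2 billion CPUs fetches the whole file once a day, and it reads "65 K" as 65,000 bytes; the constant and function names are only illustrative.

```python
# Worst-case daily transfer if every client fetched the whole registry.
CLIENTS = 2_000_000_000      # "2 billion CPUs" from the message
ROOT_FILE_BYTES = 65_000     # "65 K", read here as 65,000 bytes (assumption)
SIZE_FACTORS = (100, 10_000) # registry is "100 to 10,000" times the root file
FETCHES_PER_DAY = 1          # "once every day"; connect-time checks not counted


def daily_bytes(factor: int, clients: int = CLIENTS,
                fetches: int = FETCHES_PER_DAY) -> int:
    """Total bytes moved per day for a registry `factor` times the root file."""
    return clients * fetches * ROOT_FILE_BYTES * factor


if __name__ == "__main__":
    for factor in SIZE_FACTORS:
        file_mb = ROOT_FILE_BYTES * factor / 1e6
        total_pb = daily_bytes(factor) / 1e15
        print(f"x{factor}: file of {file_mb:.1f} MB, {total_pb:.0f} PB per day")
```

Under these assumptions the full-fetch approach lands somewhere between about 13 petabytes and 1,300 petabytes a day, which is the order of magnitude that motivates looking at something other than whole-file transfers.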

Fetching the file would be like carrying out an AXFR of the root or an FTP access to rs.internic.org. A DNS-like protocol, which I proposed in vain, could reduce, organise, and distribute that load. Langtags are probably like TLDs: some will never be queried in some geographic areas. Many will have a very long TTL (reasonably equivalent to the real-life TTL of the root). But many reasons may call for a shorter system TTL.

Obviously, caching the registry at the user end would help a lot: each user would probably need only a limited number of languages to be documented (those in his filters and those in his relational area), the others being pollution to him.

However, this must be considered in an interoperable context with other language codes. RFC 4646 does not care about interoperability, but "x-tags" are constrained enough to allow building a partial strategy based upon 8-alphanum tags + signature.

These private tags will need to be verified and validated in the same way. Either you support them, and queries go to you and your mirrors; or you do not, and queries will necessarily go to private resolvers (much as for the DNS root, so an equivalent load on the top system). I initially explained that langtags had to document referents to be of interest to computers and extended services (to the content).

Now, a simple format for private langtags can be "x-" followed by 8 alphanumeric characters (three for the language, two for the script, two for the country, and one for the referent in that context; 36 possibilities is large enough until an RFC 4646ter adapts this), plus a langtag private library signature. The central registry will be quite large, with more information. But the checking traffic can be limited to a warning on changes, through the distribution of a regex of the changed 8 characters and a compacted date. This means 12 alphanumeric characters per change announcement. This may mean a message of 100 to 2,000 characters a day at login time, obtained by the ISP from the IANA. This information could even be appended to the root file as a footer, since that file is already downloaded by ISPs.
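To make the 8 + 4 layout concrete, here is a minimal sketch in Python. Several things in it are assumptions and not part of the proposal: lowercase letters for the language, script, and country fields; a 4-character compacted date encoded as base-36 days since a hypothetical epoch of 2006-01-01; and a literal 8-character tag in each announcement where the text above speaks of a regex. The sample tag and all names are illustrative.

```python
import re
from datetime import date, timedelta
from typing import NamedTuple

# Assumed layout: "x-" + 3 (language) + 2 (script) + 2 (country)
# + 1 (referent, one of 36 alphanumeric values).
TAG_RE = re.compile(r"x-([a-z]{3})([a-z]{2})([a-z]{2})([0-9a-z])")
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
EPOCH = date(2006, 1, 1)  # hypothetical epoch for the compacted date
RECORD_LEN = 12           # 8 tag characters + 4 date characters


class PrivateTag(NamedTuple):
    language: str
    script: str
    country: str
    referent: str


def parse_tag(tag: str) -> PrivateTag:
    """Split an x-8 private tag into its four assumed fields."""
    m = TAG_RE.fullmatch(tag.lower())
    if m is None:
        raise ValueError(f"not an x-8 private tag: {tag!r}")
    return PrivateTag(*m.groups())


def encode_date(d: date) -> str:
    """Encode a date as 4 base-36 digits counting days since EPOCH."""
    n = (d - EPOCH).days
    if not 0 <= n < 36 ** 4:
        raise ValueError("date outside the 4-character range")
    out = ""
    for _ in range(4):
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out


def decode_date(s: str) -> date:
    return EPOCH + timedelta(days=int(s, 36))


def announce(tag: str, changed_on: date) -> str:
    """Build one 12-character change announcement for a valid tag."""
    parse_tag(tag)  # raises ValueError if the tag is malformed
    return tag.lower()[2:] + encode_date(changed_on)


def split_message(message: str) -> list[tuple[str, date]]:
    """Cut a daily message into (tag, change date) pairs."""
    if len(message) % RECORD_LEN:
        raise ValueError("message length is not a multiple of 12")
    records = [message[i:i + RECORD_LEN]
               for i in range(0, len(message), RECORD_LEN)]
    return [("x-" + r[:8], decode_date(r[8:])) for r in records]


if __name__ == "__main__":
    print(parse_tag("x-engltus1"))
    print(announce("x-engltus1", date(2006, 9, 20)))  # engltus1007a
```

At 12 characters per announcement, a daily message of 100 to 2,000 characters carries roughly 8 to 166 change announcements.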

We considered these avenues and others for the MDRS project, and tested them. They call for crosswalks with different standards and codes outside the Internet (languages are not restricted to the Internet).

I hope this gives you some elements to work with.
jfc

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf