Re: OPES Rules Language
2005-06-09 13:24:29
At 19:33 09/06/2005, Alex Rousskov wrote:
On Thu, 2005/06/09 (MDT), <info(_at_)utel(_dot_)net> wrote:
At 18:55 09/06/2005, Alex Rousskov wrote:
On Thu, 2005/06/09 (MDT), <info(_at_)utel(_dot_)net> wrote:
I have some difficulty understanding how you trigger the adaptation, then?
The proxy/processor configuration will have a mechanism to specify
what rules/code to apply and where to apply them. Different processors
will have different invocation points and different specification
mechanisms (e.g., access control lists or hard-coded triggers).
This does not make the language universal.
No language can be universal. Think about it. Somewhere the language's
scope ends and the language environment's scope begins. For example,
the C++ standard does not specify when C++ programs are executed, while
shell scripts do not care what language the programs they run were
written in.
The reason to remove invocation points from the rules language
is simple: all existing proxy implementations already have their own
configuration language that determines invocation points. Usually,
it comes in the form of an ACL of some sort. Apache, Cisco, NetApp,
sendmail, etc. all have that. Trying to change or replace that
language is fruitless, IMHO.
On the other hand, providing a universal language to describe what
(if anything) happens at the selected invocation point does have a
[slim] chance of being deployed, because no popular implementations
(related to HTTP and ICAP) have any good knobs for that. The SMTP world
has Sieve and Milter.
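The separation described above can be made concrete with a minimal sketch. All names here are invented for illustration; this is not any real proxy's API. The processor's own configuration (ACL-style) selects the invocation point, while a separate, portable rule describes what happens there:

```python
# Hypothetical sketch: the processor's local config picks the
# invocation point; portable rules describe the adaptation itself.

# processor-side configuration, ACL-style: invocation point -> rule name
CONFIG = {"request": "strip_cookies", "response": None}

# portable rules, written once, loadable by any cooperating processor
def strip_cookies(message: dict) -> dict:
    """Adaptation rule: drop the Cookie header from the message."""
    message["headers"].pop("Cookie", None)
    return message

RULES = {"strip_cookies": strip_cookies}

def process(point: str, message: dict) -> dict:
    """Processor hook: run whichever rule the local config selects."""
    rule = CONFIG.get(point)
    return RULES[rule](message) if rule else message
```

The point of the split is that `CONFIG` stays implementation-specific while the rule bodies could, in principle, be shared across processors.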
Let me think that over and look at the implementation.
Any objection to having both?
Obviously I can add them, but that would be bad if the language ends up
in an ISO standard, which would be great.
Sorry, I do not understand this part. Perhaps you can give a specific
ISO requirement you are trying to satisfy?
We (the Internet and the world at large) have a problem with language
identification. At the IETF, W3C and ISO there are roughly three approaches,
which are fiercely disputed because the people supporting the first two
do not (yet) have a network vision, and there is a lot of money in the first one.
- one is from the publishers' point of view (books and programs). They proceed
from characters (Unicode), to the computer (locales), to pages (XML, HTML), up to an
IANA language registration stewardship that gives the registrant a commercial
advantage. That approach uses concepts ("English", "Latin script", ccTLDs
for country/market, etc.). This is fine for classifying items in a cupboard or in
a directory. It is operational. It should extend from 400 to 7,500
languages. This should be ISO 639-3, to be okayed in August.
- another is consistent with ISO standards. ISO 12620 defines the data elements,
ISO 11179 defines the registries. We are talking about precise rules: a script
is defined by its charset, a language is tested by statistics on recurrent
words, etc. We are no longer working on concepts but on values which can be
used by computers (and by an OPES, to test the language of an unknown
message/page and massage it: translating, entering notes, classifying it,
etc.). The base used in that project includes 20,000 languages. This should
be ISO 639-6, with big work ahead.
Both use ASCII string IDs for languages (2 or 3 characters for the first approach,
4 for the second). Neither of them can support multilingualism. The first
one is deliberately ASCII- and English-oriented (the table) and wants to
replace RFC 3066. The second is fully open to multilingualism, but work on
multilingualism has not yet been carried out in ISO 11179. It is supported by
governments, R&D, etc.
- the last one (my team's) starts from the smart-user-in-a-network's point of
view. We cannot accept the first one alone. The second one is OK, but with
some conceptual additions and a lot of wording simplification; this is
complex if we want to stay fully compatible with metamodels we may have
to hook into. We will give all the language IDs a number (an additional column)
and index twice, so we can come from the ID number or from a string. We
make the numeric language ID an IPv6 Interface ID and the string language
ID a domain name. Access is therefore very fast and proven. When we call a
language (by IPv6 or by name), we call the registry of that language. The
response we get gives us all the registered details of the language (it can
be an XML page, an ASN.1 structure, or we can add the sub-address of one
element to get only one word). Addresses can retain the information, or be
recursive and point to another address if the information is common or
in case of an update. This concept should be documented, hopefully in ISO 639-4,
this year or, after having been demonstrated, next year.
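The dual-index idea above can be sketched in a few lines. Everything here is invented for illustration (the prefix is the IPv6 documentation range, the zone name and IDs are made up); it only shows the mechanics of embedding a numeric language ID in an IPv6 Interface ID and using the string ID as a DNS name:

```python
import ipaddress

# Hypothetical sketch of the dual index: numeric language ID -> IPv6
# Interface ID under an assumed registry prefix; string ID -> DNS label
# under an assumed registry zone. All values are invented.

REGISTRY_PREFIX = ipaddress.IPv6Network("2001:db8::/64")  # documentation prefix
REGISTRY_ZONE = "lang.example"  # hypothetical registry zone

# toy table: string language ID -> numeric language ID (both invented)
LANGUAGES = {"fra-x": 1041, "eng-x": 1033}

def numeric_to_ipv6(num_id: int) -> ipaddress.IPv6Address:
    """Embed the numeric language ID in the prefix's 64-bit Interface ID."""
    return ipaddress.IPv6Address(int(REGISTRY_PREFIX.network_address) + num_id)

def string_to_domain(str_id: str) -> str:
    """Use the string language ID as a label in the registry zone."""
    return f"{str_id}.{REGISTRY_ZONE}"

addr = numeric_to_ipv6(LANGUAGES["fra-x"])  # address to query the registry
name = string_to_domain("fra-x")            # name to resolve instead
```

Either handle then leads to the same registry record, which is the "access is fast and proven" property claimed above: both IPv6 routing and DNS resolution are existing, well-tested lookup infrastructures.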
My idea is to say that the Sieve/P language is a language to parse a text. Let
us stabilize its commands as a metamodel (XML fields) conforming to the ISO
11179 standard (that is OK, no problem, but it needs careful wording). This means that
we will have documented a universal way for a user in every language to
enter a script to work with his texts. This kind of thing has already been
developed for lexicons, so the concepts and tools are there. It may take time to
fill all the entries (we actually need 100 entries for each script, and then
to refine; Wikipedia could help a lot): this would be the first
multilingual computer language.
This means that any user, on any keyboard, could enter and save a script to
have an OPES working on his language, all of it documented in ISO standards.
Now you realise that the OPES will use a CRC (context reference center, the
system where we will locate the registry) as a call-out server. This CRC
can belong to a mailing list, for example. It can document a vernacular
version of the language being used (dictionary, ontology, syntax, grammar,
etc.). For example, each time we copy someone outside of the OPES
mailing list and use "OPES", the system could add, in the language of the
e-mail, a footnote explaining to the recipient what an OPES is, and possibly
correct my Franglish into readable English.
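The footnote example can be sketched as a single rule. The list domain, glossary text, and function names are all assumptions made up for this illustration, not any real OPES or CRC interface:

```python
# Hypothetical sketch of the CRC footnote rule: if someone outside the
# list is copied and the body uses a glossary term, append a footnote.
# Domain and glossary text are invented for the example.

LIST_DOMAIN = "ietf.org"  # assumed mailing-list domain
GLOSSARY = {"OPES": "OPES: Open Pluggable Edge Services."}

def add_footnotes(body: str, recipients: list[str]) -> str:
    """Append glossary footnotes when an outside recipient is copied."""
    outside = any(not r.endswith("@" + LIST_DOMAIN) for r in recipients)
    if not outside:
        return body
    notes = [text for term, text in GLOSSARY.items() if term in body]
    if notes:
        body += "\n--\n" + "\n".join(notes)
    return body
```

In the architecture described above, the rule would run on the OPES processor while `GLOSSARY` would come from the CRC acting as the call-out server, possibly localized to the language of the e-mail.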
Was I clear enough?
For that to work, I think we need a language that is simple enough and neutral
with respect to ASCII strings (both commands and processed text). This is also why, at this
stage, I would be happy to keep triggers. Enough tricky things
can happen in language logic (bidi, for example) to make one feel more
comfortable with more control.
Obviously, if it does not work I will drop the request ... but it would be
great if it did: for the service provided, for the fun, and for the first attempt.
jfc