On Fri, 2005/03/25 (MST), <info(_at_)utel(_dot_)net> wrote:
There is a "detail" which was not considered and which I focus on: the
documentation of the language of the text to which the rules are to
apply, and also of the language in which the program is to be entered.
No use having a P or Sieve language in ASCII for an Arabic, a Chinese, a
Russian, etc. e-mail exchange.
I do not know how Sieve addresses this, but if we go with P, I would argue
that P should use UTF-8 for encoding. Would using UTF-8 to write rules
address the above concern?
A language is documented by a langtag. I was among those who blocked the
second last call of the proposed RFC 3066 revamp and obtained a WG-ltru
to consider it. One of the major points of contention was the refusal of
the authors to consider the OPES requirements. Like Web Services, we
need to have a clearly defined language to filter. This concerns both
the header (lingual name, subject, etc.) and the content. The authors of
the Draft (W3C and Unicode) are interested in two main things as far as
I understand: documenting the language of the page for HTML and XML and
defining the language as part of the Unicode CLDR effort to define all
the locales of all the OSes.
Are you talking about detecting the encoding and/or language of various
message parts? If so, then I think the noble efforts above are outside of
the rules language core. There should be a mechanism to check what
encoding/language is used, but how that information is stored in a message
[part] is pretty much irrelevant to the core of the rules language, right?
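
To illustrate what I mean by such a mechanism, here is a rough sketch in
Python (not P or Sieve syntax). It assumes the encoding is declared in the
charset parameter of Content-Type and the language in a Content-Language
header; the helper name part_encoding_and_language is made up for this
example. A rule would only see the returned values, not how they were stored.

```python
from email import message_from_string


def part_encoding_and_language(part):
    """Return the (charset, language) declared by a message part.

    Either value is None when the part does not declare it.
    Hypothetical helper: a rules language could expose the same
    information through its own predicates.
    """
    charset = part.get_content_charset()     # charset parameter of Content-Type
    language = part.get("Content-Language")  # header value, if present
    return charset, language


# A made-up single-part message used only for illustration.
raw = (
    'Content-Type: text/plain; charset="UTF-8"\n'
    "Content-Language: ru\n"
    "\n"
    "hello\n"
)
charset, language = part_encoding_and_language(message_from_string(raw))

# A part that declares neither value.
bare = message_from_string("Subject: x\n\nbody\n")
bare_charset, bare_language = part_encoding_and_language(bare)
```

The rule itself could then be written against charset and language without
knowing which header or parameter carried them.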
Thanks,
Alex.