Re: Notes on draft-homme-sieve-variables-04


On Fri, 2004-10-22 at 16:11 -0400, Mark E. Mallett wrote:

These are a few comments on draft-homme-sieve-variables-04.  In another
message, I made some comments about variables in general:  specifically
that I'd like to see a time where variables are integrated into the
SIEVE language in a more fundamental way.  However I imagine that even
assuming anyone else agreed with me, that goal would take some time to
reach, and that something like this draft would be more immediately
useful.  In making allowance for that farther goal, I'd want to see:

 - This draft and its capability called something other than "variables"
   (maybe "stringvars")


if the capability is integrated, why does the name of an obsolete
extension matter?

 - The things that this draft calls "variables" be called something
   else (like, "stringvars").


I can try to add a definition of "variable" to make it clear.  The
introduction now reads in my personal copy:

        Conventions for notations are as in [SIEVE] section 1.1,
        including use of [KEYWORDS] and [ABNF].  The grammar builds on
        the grammar of [SIEVE].  In this document, "character" means a
        [UNICODE] character, which may consist of multiple octets coded
        in [UTF-8], and "variable" is a named reference to data stored
        or read back using the mechanisms of this extension.

   When a string is evaluated, substrings matching variable-ref SHALL be
   replaced by the value of variable-name.  Only one pass through the
   string SHALL be done.  Variable names are case insensitive, so "foo"
   and "FOO" refer to the same variable.  Unknown variables are replaced
   by the empty string.


I'd strongly prefer case sensitivity here.


almost all Internet standards are case insensitive, and I don't think it
is important enough to go against that flow.
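
For concreteness, the substitution rule quoted above could be sketched
roughly as follows.  This is only an illustration in Python, not text
from the draft: the VARIABLE_REF pattern is an assumed approximation of
the variable-ref grammar, and expand() is a hypothetical helper name.

```python
import re

# Assumed approximation of the draft's variable-ref syntax: "${", then an
# identifier (optionally dotted for namespaces) or a number, then "}".
VARIABLE_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_.]*|[0-9]+)\}")


def expand(text, variables):
    """Expand variable references in one left-to-right pass."""
    # Fold names to lower case so "foo" and "FOO" name the same variable.
    table = {name.lower(): value for name, value in variables.items()}
    # re.sub does not rescan replacement text, so only one pass is made.
    # Unknown variables are replaced by the empty string.
    return VARIABLE_REF.sub(lambda m: table.get(m.group(1).lower(), ""), text)
```

With this sketch, expand("${FOO}/${foo}", {"Foo": "x"}) gives "x/x", and
a value that itself contains "${...}" is inserted as-is rather than
expanded again.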

You might want to mention that "identifier" is as defined in rfc3028
(even though this is an extension, it doesn't hurt to be explicit).


the excerpt above addresses this.

There's not really a lot about namespaces in this draft, other than to
allow for future state variables associated with extensions (or so it
appears).  Maybe point out that this document specifies namespace syntax
only, without addressing anything else about namespaces.


- Future extensions may make internal state available through variables.
+ Namespaces are for future extensions which make internal state
available through variables.

The implication
seems to be that namespace-associated variables are read-only; if that's
true, might want to make that explicit.


I don't think we want to restrict them to be read-only, but there needs
to be a new action if they are to change, since SET disallows setting
variables in namespaces.  one use could be a "scope" extension, so the
user says something like:

        LET "scope.local.myvar" "true"

   The expanded string MUST use the variable values which are current
   when control reaches the statement the string is part of.


At least one problem with this has already been brought up on the list:
the interaction between match results and sequential evaluation of
multiple tests.  It's much more expressive to be able to take advantage
of the side effects (the match results) within one test statement;
furthermore it's burdensome on an implementation (especially in terms of
efficiency) to have to freeze the current match results upon entry to a
test statement so that those frozen results will be available by each
step inside that test statement.  I suspect that the goal of this
prescription is to address deferred actions such as "fileinto" -- and
that's a good thing.  However, it does introduce these other real
problems.


the main issue is whether the results are defined or not.  there are no
other natural sequence points than the complete statement.  if this is
to change, a significant amount of thought has to be put into defining
execution order.  (I'd be glad to be proved wrong, of course.)

I suppose this is as good a place as any for this comment:  I'm not all
that comfortable with all strings being automagically eligible for
interpolation (once this extension's capability is enabled).  It seems
to me that it would be better to have a syntax that specifically
commands that the string be processed.


we have that: if the string contains "${", it should be processed.  a
bytecode compiler can do this trivially, and store the two types of
strings differently.
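
a rough sketch of what I mean, in Python rather than in any real
implementation's bytecode (the function names and the "plain" /
"template" tags are made up for illustration):

```python
def compile_string(literal):
    """Classify a string literal once, when the script is compiled."""
    # Only strings containing "${" can possibly hold a variable reference.
    if "${" in literal:
        return ("template", literal)
    return ("plain", literal)


def evaluate(compiled, expand):
    """Produce the run-time value of a compiled string.

    `expand` is whatever substitution routine the implementation uses;
    it is passed in here so the sketch stays self-contained.
    """
    kind, text = compiled
    # Plain strings skip the substitution step entirely.
    return expand(text) if kind == "template" else text
```

so a string like "INBOX" never pays for interpolation at delivery time,
while "${1}" is handed to the expansion routine.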

e.g., with some character before the opening quote:

    fileinto ?"${1}";

or via other different quoting style.  A bonus would be a syntax that
allows one or more rescannings of the resulting string.


I'm not sure that is a bonus :-)

   For ":matches", the list will contain one string for each wildcard
   ("?" and "*") in the match pattern.  Each string holds what the
   corresponding wildcard expands to, possibly the empty string.  The
   wildcards expand greedily.


I've lost track of the history: what's the reason for "*" expanding
greedily?  Is it to be compatible with what regex does?


yes, I think it would be strange if :matches was non-greedy,
while :regex was greedy.  you may be right that the improved usability
is worth it, though.
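
to make the difference concrete, here is a small Python sketch that
translates a :matches pattern into a regular expression.  it is a
simplification under stated assumptions: backslash escapes and
comparators are ignored, and the function name and the `greedy` switch
are hypothetical, not something the draft defines.

```python
import re


def wildcard_captures(pattern, value, greedy=True):
    """Return what each wildcard matched, or None if there is no match."""
    star = "(.*)" if greedy else "(.*?)"
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(star)       # zero or more characters
        elif ch == "?":
            parts.append("(.)")      # exactly one character
        else:
            parts.append(re.escape(ch))
    m = re.fullmatch("".join(parts), value, re.DOTALL)
    return list(m.groups()) if m else None
```

for the pattern "[*] *" and the value "[list] [tag] hello", the greedy
form captures "list] [tag" and "hello", while the non-greedy form
captures "list" and "[tag] hello".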

incidentally, if we had a "reverse" operator, users could implement
non-greedy themselves:

        set :reverse "pattern" "[*] *";
        if header :matches "Subject" "*" {
           set :reverse "subject" "${1}";
        }
        if string :matches "${subject}" "${pattern}" {
           set :reverse "prefix" "${2}";
           set :reverse "actual_subject" "${1}";
        }
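
the same trick can be checked outside Sieve.  the following Python
sketch (helper names are invented, and the matcher again ignores
backslash escapes and comparators) reverses both pattern and value,
matches greedily, and reverses each capture back:

```python
import re


def greedy_captures(pattern, value):
    """Greedy :matches-style captures, or None if there is no match."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("(.*)")
        elif ch == "?":
            parts.append("(.)")
        else:
            parts.append(re.escape(ch))
    m = re.fullmatch("".join(parts), value, re.DOTALL)
    return list(m.groups()) if m else None


def captures_via_reverse(pattern, value):
    """Match reversed pattern against reversed value, then undo the reversal.

    Captures are returned in the order of the *reversed* pattern, which
    is why the Sieve snippet takes the prefix from ${2}.
    """
    captures = greedy_captures(pattern[::-1], value[::-1])
    if captures is None:
        return None
    return [c[::-1] for c in captures]
```

with "[*] *" and "[list] [tag] hello", the first capture is
"[tag] hello" and the second is "list", i.e. the bracketed prefix ends
at the first "]" rather than the last.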

   Numbered variables ${1} through ${9} MUST be supported.  References
   to higher indices than the implementation supports should be treated
   as a syntax error which MUST be discovered at compile-time.


I don't like the mandate that strings have to be inspected at
compile time-- a "MAY" would be preferable.


I don't like the possibility of an error spuriously occurring during
delivery.
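
the check itself is cheap.  a minimal sketch in Python, assuming the
compiler already has the list of string literals in the script (the
names MAX_INDEX, SieveSyntaxError and check_numbered_refs are
illustrative only):

```python
import re

MAX_INDEX = 9  # assumed limit; the draft only requires ${1} through ${9}
NUMBERED_REF = re.compile(r"\$\{([0-9]+)\}")


class SieveSyntaxError(Exception):
    """Raised at compile time, before any message is delivered."""


def check_numbered_refs(string_literals, max_index=MAX_INDEX):
    """Reject references to numbered variables beyond the supported range."""
    for literal in string_literals:
        for m in NUMBERED_REF.finditer(literal):
            if int(m.group(1)) > max_index:
                raise SieveSyntaxError(
                    "unsupported numbered variable %s in %r"
                    % (m.group(0), literal)
                )
```

a script using "${10}" is then refused when it is uploaded or compiled,
instead of failing while a message is being delivered.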

   The introduction of variables makes advanced decision making easier
   to write, but since no looping construct is provided, all Sieve
   scripts will terminate orderly.


"orderly" is not an adverb :-)


strangely so.

-- 
Kjetil T.