0.2 - Open Issues.
I'm curious why "fileinto" is being considered for moving into a separate
document. Maybe I missed an earlier thread on this.
"fileinto" needs to be a separate, optional facility because many
implementations will not be able to support it. It also isn't something that's
going to be at all easy to write a security analysis of (something the IETF has
gotten a lot pickier about lately).
2.7 Evaluation
The second paragraph mentions the possibility that implementations may
impose restrictions on the number of actions per message.
I think this is a bad thing. While I understand that one of the design
goals is to reduce the possibility of using the mechanism for mail-bombing,
I want to point out that there is sufficient existing practice to allow
such actions as auto-filing in conjunction with auto-forwarding.
Like it or not, this is going to have to be allowed. ISPs are not going to be
willing to deploy something that cannot impose such limits.
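To make the kind of limit under discussion concrete, here is a minimal sketch of a per-message action cap. Everything in it is an assumption for illustration: the names ActionCollector and ActionLimitExceeded, the default of 4, and the "forward" action name are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass, field


class ActionLimitExceeded(Exception):
    """Raised when a script requests more actions than the site allows."""


@dataclass
class ActionCollector:
    # Hypothetical site-configured cap; the draft does not fix a number.
    max_actions: int = 4
    actions: list = field(default_factory=list)

    def add(self, name, argument=None):
        # Refuse the action instead of silently dropping it, so the
        # caller can decide how to handle the message.
        if len(self.actions) >= self.max_actions:
            raise ActionLimitExceeded(
                f"limit of {self.max_actions} actions per message reached"
            )
        self.actions.append((name, argument))
```

With a cap of two or more, auto-filing combined with a single auto-forward still works; only a script that fans one message out to many destinations hits the limit.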
With a restriction like this, the sieve becomes significantly less useful
for many of the users that are most likely to use it in the first place -
users adept at email.
The entire point of this exercise isn't to develop something for users adept at
email. There are dozens if not hundreds of languages already available for this
purpose -- I wrote one and released it into the public domain back in 1984, and
my work was based on the documentation of similar systems that had been
around for years.
The point here is to develop something that every MTA and message store will
want to implement and that can easily be used by various automatic rule
generation utilities. This means that the language cannot require the presence
of constructs that are difficult or impossible for everyone to implement, nor
can it forbid limits that are obviously needed to combat spam.
6. Errors in Processing a Script
The stipulation that implementations SHOULD NOT try to recover from a
script with errors is a problem for me. Aborting within an 'if' clause
makes sense to me, but to totally stop filtering if any error is encountered
is the wrong thing to do. I would venture a guess that most users would
consider this a very bad characteristic of the mechanism.
The problem with trying to recover is that you cannot be sure of what the user
meant and you may end up doing the wrong thing. And in this case the wrong
thing may mean losing mail.
However, as a purely practical matter I think this is largely a moot point,
since I expect most implementations to perform syntax checking at the point
where rules are added. I know my implementation will do this.
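As a rough illustration of checking at the point where rules are added, the sketch below rejects a script before it is ever stored. It is not my implementation and not a full parser: check_script and store_rule are hypothetical names, and the only checks made are balanced parentheses, terminated strings, and matching if/endif in the draft's example syntax.

```python
import re

# Quoted string, paren, comma, identifier, or any other single character.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|[()]|,|[A-Za-z][A-Za-z0-9-]*|\S')


def check_script(text):
    """Return a list of error messages; an empty list means no errors found."""
    errors = []
    depth = 0      # currently open parentheses
    open_ifs = 0   # 'if' statements still waiting for 'endif'
    for tok in TOKEN.findall(text):
        if tok == "(":
            depth += 1
        elif tok == ")":
            if depth == 0:
                errors.append("unmatched ')'")
            else:
                depth -= 1
        elif tok == "if":
            open_ifs += 1
        elif tok == "endif":
            if open_ifs == 0:
                errors.append("'endif' without 'if'")
            else:
                open_ifs -= 1
        elif tok == '"':
            # A lone quote only appears when a string was never closed.
            errors.append("unterminated string")
    if depth:
        errors.append("unmatched '('")
    if open_ifs:
        errors.append("'if' without 'endif'")
    return errors


def store_rule(rules, text):
    """Add a script to the rule store only if it passes the checks."""
    errors = check_script(text)
    if errors:
        raise ValueError("; ".join(errors))
    rules.append(text)
```

A script rejected here never reaches delivery time, which is why the recover-or-abort question rarely arises in practice.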
On the readability front,
I would like to see an optimization made to the grammar. In the elements
of an if-clause condition that use a list as an argument, I would like to
see the ability to not necessarily have the parens for single-item lists.
I don't have a problem with this.
It might also be nice for users to not have to use quotes around words that
don't need them.
This is highly problematic, for the simple reason that it may have a very
adverse effect on future extensibility. The ability to distinguish between a
string argument and, say, a function that returns a string, is crucial if we
want to keep the parser implementable with single-token lookahead and the
language backwards-compatible.
Here's an abbreviated version of the example in 2.5 to illustrate:
if any-of (header ("from")
contains ("bart" "homer" "smithers" "burns" "lisa"),
header ("subject") contains ("URGENT")) then
fileinto "INBOX"
endif
Eliminating spurious parens:
if any-of (header "from"
contains ("bart" "homer" "smithers" "burns" "lisa"),
header "subject" contains "URGENT") then
fileinto "INBOX"
endif
Eliminating spurious quotes:
if any-of (header from contains (bart, homer, smithers, burns, lisa),
header subject contains URGENT) then
fileinto INBOX
endif
And what happens if I want to add a function bart to the language?
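The problem can be shown in a few lines. This is only a sketch under assumed rules: classify is a hypothetical helper, and treating an unknown bare word as a string is my guess at how a quote-optional grammar would have to behave.

```python
import re

# Group 1: contents of a quoted string. Group 2: a bare word.
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|([A-Za-z][A-Za-z0-9-]*)')


def classify(arg, functions):
    """Decide what a single argument token means.

    'functions' is the set of function names known to this version of
    the language (hypothetical; the draft defines no such set).
    """
    m = TOKEN.fullmatch(arg)
    if m is None:
        raise ValueError(f"not a valid argument: {arg!r}")
    if m.group(1) is not None:
        # Quoted: always a string, whatever functions exist.
        return ("string", m.group(1))
    name = m.group(2)
    # Bare word: its meaning depends on which functions are defined.
    return ("call", name) if name in functions else ("string", name)


# The quoted form means the same thing before and after 'bart' is added.
assert classify('"bart"', set()) == classify('"bart"', {"bart"})
# The bare form silently changes meaning in an existing script.
assert classify("bart", set()) != classify("bart", {"bart"})
```

With mandatory quotes the token type alone settles the question, so one token of lookahead is enough and old scripts keep their meaning.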
The rule for whether or not quotes were needed would be based on avoidance
of conflicts in the grammar (e.g., whitespace and commas).
I think this behaviour is more novice-friendly.
I disagree that this is more novice-friendly. What this does is introduce
an inconsistency into the language, and inconsistencies are the single
worst enemy of anyone unfamiliar with the language.
It is instructive to look at past language designs in this regard. In the
Praxis language (and to a lesser extent Pascal), for example, the use of
semicolons to delimit statements was deemed to be unfriendly to novices. But
they couldn't be eliminated entirely. The resulting rules for when they were
needed and when they weren't were so complex that nobody could figure them out,
and were found to be a serious hindrance to learning and using the language.
Eventually the optionality of semicolons was effectively abandoned by
language users.
But again this is largely moot. This language design from the outset is focused
on ease of use by generating utilities, not on ease of use by novices. If
it were engineered primarily for direct use by novices it would need to
change in some fairly substantial ways from what we have now.
Ned