Re: Questions regarding RFC 5228

Stephan Bosch writes:
> Hello,
>
> I am finishing up a first release of my Sieve implementation, and one
> of the TODO items that yet remains is getting some answers to
> questions that arose during development. I've collected these into a
> file and now I submit them to this list to get some clarification. Any
> help is greatly appreciated.
>
> * RFC 5228 (Sieve) : 5.1.  Test address:
> "Implementations MUST restrict the address test to headers that
> contain addresses, but MUST include at least From, To, Cc, Bcc,
> Sender, Resent-From, and Resent-To, and it SHOULD include any other
> header that utilizes an "address-list" structured header body."
>   -> Will this cause a compile error, or are the disallowed headers
>  simply ignored? My implementation currently considers this to be a
>  compile error.

So does mine.


Our implementation does not. We allow address tests on all possible header
fields. The test fails if the field does not contain actual addresses.

IMNSHO what you are doing here is quite simply wrong. It might, and I emphasize
MIGHT, be OK to throw an error if an address test is done on a really well
defined field that is either unstructured or structured as something other than
an address list. (Although I see no point in doing that.) But for an unknown
field, you cannot assume that it isn't defined by something somewhere as an
address list. Even if you assume you can keep up with the fields people bother
to register (and if you think you can I have to wonder how you get all the
sites using your software updated to the new definitions in a timely way),
people define and use their own private fields all the time. And nothing we've
done or can do in the standards process is going to change this.

And what about fields that have address syntax but which aren't exactly an
address-list? The obvious one is message-id, which I believe is a proper subset
of address syntax. I've seen quite a few scripts written that use address tests
on the message-id header. Heck, the way this is written fields defined with
Sender: or From: syntax would be excluded, and that's nothing short of absurd.

I will also point out that the standard doesn't say anything about throwing
errors in this case. "Restricting" a test could mean that, but I think having
the test fail is more than sufficient.
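
A minimal sketch of the "test simply fails" behaviour described above, in Python purely for illustration. The function name `address_test`, the `(name, body)` header representation, and the whole-address match are assumptions made for this example, not code from any implementation discussed here; `email.utils.getaddresses` is the only library call used. The test accepts any field name and evaluates to false when the field holds nothing that parses as an address.

```python
from email.utils import getaddresses


def address_test(headers, field_names, key):
    """Hypothetical Sieve-style address test (":is" match on the whole address).

    headers     -- list of (field-name, field-body) pairs from one message
    field_names -- header fields the script asked to test; any name is allowed
    key         -- address to compare against, case-insensitively
    """
    wanted = {name.lower() for name in field_names}
    bodies = [body for name, body in headers if name.lower() in wanted]
    for _display_name, addr in getaddresses(bodies):
        # Anything without an "@" is treated as "not an address": the test
        # quietly fails for it instead of raising an error.
        if "@" in addr and addr.lower() == key.lower():
            return True
    return False


# Sample message headers (made up for this example).
headers = [
    ("From", "Stephan <stephan@example.org>"),
    ("X-Private-Contact", "ned@example.com"),
    ("Subject", "Questions regarding RFC 5228"),
]
```

With these sample headers, a test on the private `X-Private-Contact` field can match, while a test on `Subject` or on a field that is absent from the message simply returns false; nothing is rejected up front.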

But let's suppose the specification did say an error should occur. I have to
say I would happily violate the specification in that case. The entire point of
Sieve is to give users a rich set of tools they can use to process the messages
they receive, including some seriously messed up stuff. I'm sorry, but when the
need to be able to deal with the broken dreck people actually send around
conflicts with implementation purity, effective handling of the dreck wins.
Every. Single. Time.

> -> Given the variables extension, sometimes the specified header names
> aren't known until runtime. If the previous answer was to cause a
> compile error, should this abort the script at runtime?

I don't have variables (yet?). I expect that I would try to give an
error at compile time and to avoid runtime errors.

> * RFC 5228 (Sieve) : 5.4.  Test envelope:
> "The "envelope" test is true if the specified part of the [SMTP] (or
> equivalent) envelope matches the specified key.  This specification
> defines the interpretation of the (case insensitive) "from" and "to"
> envelope-parts.  Additional envelope-parts may be defined by other
> extensions; implementations SHOULD consider unknown envelope parts an
> error."

Again, I give an error at compile time, none at runtime.


We check this at runtime. Changing things to perform a compile time check
wouldn't be difficult, at least in the common case of no variable
substitutions, but since this can't be done when ihave is turned on and we're
headed towards a situation where the vast majority of scripts we process use
ihave, I don't see any point in spending the time to implement this.

>   -> Given the variables extension, sometimes the specified envelope
>  parts aren't known until runtime. Should invalid ones abort the
>  script or is ignoring them a better practice?


A script that manages to do an envelope test on a nonexistent envelope part is
broken in a fairly fundamental way. This needs to be noted, and the way you do
that is to throw an error.
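
The runtime behaviour argued for here might look roughly like the following Python sketch. `SieveRuntimeError`, `envelope_test`, and the dictionary-shaped envelope are hypothetical names chosen for illustration; the only details taken from RFC 5228 are the two envelope parts it defines, "from" and "to", and the fact that their names are case-insensitive.

```python
class SieveRuntimeError(Exception):
    """Hypothetical error type that aborts script execution."""


# RFC 5228 itself defines only these two envelope parts (case-insensitive).
KNOWN_ENVELOPE_PARTS = {"from", "to"}


def envelope_test(envelope, part, key):
    """Hypothetical ":is" envelope test evaluated at runtime.

    envelope -- dict mapping "from"/"to" to a list of address strings
    part     -- envelope-part name, possibly produced by variable expansion
    key      -- address to compare against, case-insensitively
    """
    name = part.lower()
    if name not in KNOWN_ENVELOPE_PARTS:
        # An unknown part means the script is broken: report it loudly
        # rather than silently evaluating the test to false.
        raise SieveRuntimeError(f"unknown envelope part: {part!r}")
    return any(addr.lower() == key.lower() for addr in envelope.get(name, []))


# Sample envelope (made up for this example).
envelope = {"from": ["stephan@example.org"], "to": ["ned@example.com"]}
```

Note the contrast with an address test on an unfamiliar header field: there a quiet failure is harmless, whereas a misspelled envelope part such as "frm" can never be meaningful, so the sketch raises instead.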

Sieve is a simple language... All commands in a script can be arranged
into basic blocks, and the basic blocks form a DAG. (I don't remember
whether this remains true with Ned's MIME/looping extension.)


It's not my extension, but since it doesn't change the syntax or structure of
Sieve I don't see how it could impact the blocking. But it certainly can make
static analysis more difficult if not impossible.

I wonder whether it is possible to walk along the DAG and say "this
assignment is invalid since it leads to an error when the variable is
used".


Sure, that's possible in a lot of cases, but not all. The envelope part name
could come from the message data itself (sounds like a really bad idea, but you
can certainly do it), so unless your compilation step is done in a
message-dependent way that's a fundamental limit on analysis. (We compile in a
message-independent way so the same compiled script can be applied to multiple
messages. We even have the ability to store compiled scripts on disk and read
them back.)
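
To make the limit concrete, here is a small Python sketch of what a message-independent compile step can and cannot decide about an envelope-part argument. Everything in it is an assumption for illustration: the names `check_envelope_part` and `SieveCompileError`, the "ok"/"deferred" return values, and the simplified regular expression standing in for a `${name}` variable reference from the variables extension.

```python
import re

# The two envelope parts defined by RFC 5228 (names are case-insensitive).
KNOWN_ENVELOPE_PARTS = {"from", "to"}

# Simplified pattern for a "${name}" variable reference; the real grammar
# in the variables extension is stricter than this.
VARIABLE_REF = re.compile(r"\$\{[^}]+\}")


class SieveCompileError(Exception):
    """Hypothetical error type reported when a script is compiled."""


def check_envelope_part(part):
    """Classify an envelope-part argument without looking at any message.

    Returns "ok" if the literal name was validated statically, or
    "deferred" if the value depends on a variable and can only be
    checked once the script runs against a message.
    """
    if VARIABLE_REF.search(part):
        # The final value may come from message data, so a compile step
        # that never sees the message cannot validate it.
        return "deferred"
    if part.lower() not in KNOWN_ENVELOPE_PARTS:
        raise SieveCompileError(f"unknown envelope part: {part!r}")
    return "ok"
```

A literal such as "from" is settled once at compile time, while "${part}" has to be left for the runtime check; deciding more than that would require tracing how the variable was assigned along each path through the script.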

But even if message data isn't involved or can be built into the compilation
process, I'm pretty sure doing it in full generality is at least NP-hard.

The real question, then, is whether this check is worth the bother. Since I
don't think even checking static string arguments to envelope is worth doing at
compile time, it goes without saying where I stand on doing even more analysis
of this case.

                                Ned