> Whitespace is used to separate tokens. Whitespace is made up of
> tabs, newlines (CRLF, never just CR or LF), and the space character.
> The amount of whitespace used is not significant.
This is specifically about whitespace in the Sieve language. So... how
many implementations violate this? I.e., does everyone generate an
error if a script's whitespace contains a naked CR or LF?
I do. Actually, the Exim implementation does not accept CRLF by default,
but uses just LF or whatever \n is. You can decide to change that by
recompiling Exim with a different #define or by translating the Sieve
script from CRLF to LF at runtime, which isn't too hard with Exim.
Whether LF or CRLF is configured, any line break other than the
specified terminator causes a syntax error. So far, everybody has been
happy with that.
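To make the strict behaviour concrete, here is a minimal Python sketch
of the check described above. It is not Exim's code; the function names
and the `terminator` parameter are hypothetical stand-ins for the
compile-time choice between CRLF and LF.

```python
import re


def find_bad_line_breaks(script: str, terminator: str = "\r\n") -> list[int]:
    """Return offsets of CR/LF characters that are not part of `terminator`.

    Hypothetical helper: `terminator` models the configured line ending.
    """
    if terminator == "\r\n":
        # A CR not followed by LF, or an LF not preceded by CR, is "naked".
        pattern = re.compile(r"\r(?!\n)|(?<!\r)\n")
    elif terminator == "\n":
        # LF is the terminator, so every CR is an error (including in CRLF).
        pattern = re.compile(r"\r")
    else:
        raise ValueError("terminator must be CRLF or LF")
    return [m.start() for m in pattern.finditer(script)]


def crlf_to_lf(script: str) -> str:
    """Translate a CRLF script for an implementation that expects LF."""
    return script.replace("\r\n", "\n")
```

An empty result means the script's line breaks are acceptable; any
offset in the list would be reported as a syntax error. Running
`crlf_to_lf` first corresponds to translating the script at runtime
instead of recompiling.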
3028bis gives a detailed formal specification of lexical tokens.
Anything not covered by that causes a lexical error. There is no room
for interpretation on that.
Ignoring this lexical error violates the specification, because it changes
the recognised language. So does changing the specification of CRLF,
like I did.
For a change, I see no need to change the specification. :)
2.4.2.4. MIME Parts
> In a few places, [MIME] body parts are represented as strings. These
> parts include MIME headers and the body. This provides a way of
> embedding typed data within a Sieve script so that, among other
> things, character sets other than UTF-8 can be used for output
> messages.
This still confuses me. Where is it referenced?
The vacation extension uses it, and I could imagine more extensions
doing so. It confused me, too. Perhaps something should be said about
why the base spec defines it?
2.10.6. Errors
> When an error happens, implementations MUST notify the user that an
> error occurred, which actions (if any) were taken, and do an implicit
> keep.
Probably extremely nit-picky, but I've always wondered when I read
this "what user, and how do we notify them?" Some users have no
access to anything other than their mailboxes. I suspect that many
implementations will do some system-wide logging, which notifies the
admin, but not the user.
That's still on my TODO list. Since messages are filed to inbox, and
I don't want to potentially double the number of messages, I thought of
adding a header that tells why the message went to the inbox.
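The header idea could look roughly like the following Python sketch. It
is only an illustration: the header name `X-Sieve-Error` and the
function name are hypothetical and are not defined by any spec or by
Exim.

```python
from email import message_from_string
from email.message import Message


def mark_implicit_keep(msg: Message, reason: str) -> Message:
    """Record why a message was kept in the inbox after a Sieve error.

    `X-Sieve-Error` is a made-up header name used for illustration only.
    """
    # Collapse all whitespace (including line breaks) so the reason
    # cannot inject additional header lines.
    msg["X-Sieve-Error"] = " ".join(reason.split())
    return msg


msg = message_from_string("From: a@example.org\nSubject: hi\n\nbody\n")
mark_implicit_keep(msg, "line 3:\n unknown command")
```

This notifies the user inside the one message that is delivered anyway,
so the number of messages in the inbox does not grow.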
The spec provides no way to access the "From_" line (the "From<sp>"
line with no colon that is added by some mail software).
While it's not part of RFC2822, and admittedly a minor concern at
best, that 'header' is often there, but inaccessible. I have no
solution, other than perhaps making "From_" a special header name.
Storage specific data should go into storage specific extensions.
With Exim, the BSD mailbox postmark is generated when storing a message,
i.e. as a result of a Sieve fileinto, so you could not access it in
Sieve, but that's an implementation detail. Other implementations might
work
the same. Accessing the current date and time of the sieve execution,
which is the closest you could get, clearly belongs to a different
extension, too.
There was some discussion about adding character escapes (\u etc);
this probably falls under the "substantive changes or additions"
prohibition, but maybe not.
I am afraid it does. The base spec could say that extensions might
define a new meaning for quoted-other, in order to explain why it
SHOULD NOT be used.
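To illustrate why new escapes would be a substantive change, here is a
small Python sketch of how I read the current quoted-string rules: a
backslash before any character other than `"` or `\` (quoted-other)
just yields that character. The function name and the returned offset
list are hypothetical, and the reading of the grammar is my assumption,
not a quotation from the draft.

```python
def unescape_quoted(body: str) -> tuple[str, list[int]]:
    """Decode the text between the double quotes of a Sieve quoted string.

    Returns the decoded text and the offsets of quoted-other escapes,
    which a strict implementation might want to warn about.
    """
    out: list[str] = []
    quoted_other: list[int] = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash at end of string")
            nxt = body[i + 1]
            if nxt not in ('"', "\\"):
                # quoted-other: legal today, but discouraged.
                quoted_other.append(i)
            # Both escape forms represent just the escaped character.
            out.append(nxt)
            i += 2
        elif ch == '"':
            raise ValueError("unescaped double quote inside string")
        else:
            out.append(ch)
            i += 1
    return "".join(out), quoted_other
```

Under this reading, `\u0041` inside a quoted string currently decodes to
the five characters `u0041`, so giving `\u` a new meaning would change
the value of scripts that are valid today.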
Michael