Sieve-Notify and potential associative arrays.


Looking at the long-since-expired Sieve Notify extension, it seems somewhat 
ugly next to the much newer variables extension, and I would suggest it needs 
a lot of work.  I've had a number of thoughts on the subject, some of which may 
have an impact on the variables extension, although they are possibly 
sufficiently separate that they would make an extension of their own.

It seems to me that the notify extension is trying to do too much in supporting 
notification by email, SMS or indeed any arbitrary notification mechanism.  It 
was pointed out quickly how many complex internationalization issues you have 
to deal with when composing emails, but you have completely different concerns 
when dealing with SMS messages, so I'm not sure it makes sense to bundle them.  
I think we should therefore have something more along these lines:

Syntax:   sms [":recipient" <recipient-numbers: string-list>]
              [":limit" <number>] <message: string>

The :recipient tag specifies the target phone numbers to send this SMS to.  If 
it is not present, the implementation should try to send the SMS to the owner 
of the script, using a number held by the Sieve implementation, i.e. a mailbox 
property.  If it is present, it gives the target number, or a list of numbers 
if more than one recipient is desired.  Each phone number must begin with a + 
and include the country code, to ensure that the script will work regardless 
of the location of the server or script.
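
As a rough sketch of how an implementation might check the :recipient 
values, here is some Python.  The function name and the exact pattern are my 
own assumptions: a "+", a non-zero first digit, and at most 15 digits in total, 
which is the E.164 length limit.  Nothing here comes from the notify draft.

```python
import re

# Assumed shape of an acceptable number: "+", a non-zero first digit, then up
# to 14 further digits (E.164 allows at most 15 digits in total).
_INTERNATIONAL_NUMBER = re.compile(r"\+[1-9][0-9]{1,14}")


def invalid_recipients(recipients):
    """Return the entries that are not in international "+<country code>..." form."""
    return [r for r in recipients if not _INTERNATIONAL_NUMBER.fullmatch(r)]
```

A script validator could reject the sms action, or skip the offending 
numbers, whenever this returns a non-empty list.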

The message is a string in which variable expansion is also permitted.  The 
limit is the maximum number of SMS messages that the server may send for this 
action.  Given that each message typically has a cost associated with it, the 
default limit will be whatever produces the least cost, which in today's terms 
is 1 message (140 bytes) but may change in the future.  Script authors can use 
this optional tagged argument to override the default so that they receive a 
multipart SMS containing more content.  To be honest I'm not sure how multipart 
SMS messages work, but I did find a reference stating that the standard is 140 
bytes: http://en.wikipedia.org/wiki/Short_message_service#Technical_details

Messages are of course therefore in UTF-8, and again I'm not sure how you do 
internationalization in SMS, but it does appear to be possible.  If we have our 
UTF-8 message and our limit on the number of SMS messages we may send, then 
that should hopefully be enough for the implementation to transform the 
message into the encoding/character set that the SMS gateway needs.
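
To illustrate how the limit and the 140-byte figure might combine, here is a 
minimal Python sketch that trims a UTF-8 message to the byte budget without 
cutting a character in half.  The names are hypothetical, and it assumes the 
payload stays in UTF-8 with a flat 140 bytes per message.  A real gateway 
would re-encode the text and multipart messages carry per-part overhead, so 
treat this as an illustration of the budget only.

```python
SMS_PAYLOAD_BYTES = 140  # per-message size quoted above (assumed flat budget)


def fit_to_limit(message: str, limit: int = 1) -> bytes:
    """Trim a message so its UTF-8 form fits in `limit` messages."""
    budget = SMS_PAYLOAD_BYTES * limit
    encoded = message.encode("utf-8")
    if len(encoded) <= budget:
        return encoded
    # Cut at the budget, then drop any incomplete trailing UTF-8 sequence
    # so the result is still valid UTF-8.
    return encoded[:budget].decode("utf-8", errors="ignore").encode("utf-8")
```

With the default limit of 1, a long ASCII message is cut to 140 bytes; with 
":limit" 2 the budget becomes 280 bytes.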

The proposal in draft-ietf-sieve-notify-02.txt is this:
 
    Syntax:   notify [":method" string]
               [":id" string]
               [":options" 1*(string-list / number)]
               [<":low" / ":normal" / ":high">]
               [":message" string]

I'm thinking it's got too many things in it which are trying to be 
super-generic to cover all uses (:method/:id/:options), but in actual fact 
we'll quickly regret this and prefer specific extensions for notifying through 
different channels with well defined and well documented arguments.

Which brings me on to variables.  Each of these different notification types, 
and also the vacation extension to a certain degree, has the need to author 
messages, and likely include sections of the triggering message.  So suppose I 
want to author what I think is a fairly reasonable SMS which looks like this:
           [To:<recipient-address>From:<sender-address>] <Subject>\r\n<body>

One way to get this is using the proposed $name$ variables, which seem pretty 
ugly next to what we've worked so hard on with the variables extension, and 
are also pretty inflexible.  If we use the variables extension as is, then we 
could do this:

if header :matches "To" "*" {
  set "Recipient" "${1}";
}
if header :matches "From" "*" {
  set "Sender" "${1}";
}
if header :matches "Subject" "*" {
  set "Subject" "${1}";
}
set "Message" "[To: ${Recipient} From: ${Sender}] ${Subject}";

(I note that my intuitive use of * won't work cos * in matches according to 
variables is non-greedy, yet * with regex is greedy, but I thought I'd do that 
deliberately to make you think...)

This seems like a lot of work to do something pretty "standard".

I wonder if it would make sense for us to add associative arrays containing 
entries of specific interest to the sieve script author.  So for example a 
$HEADERS array which contained all the header values.  So suppose my headers 
were:

Received: from p01m168.mxlogic.net [bla bla]
Received: from unknown [208.184.76.39] [bla bla]
Received: from above.proper.com  [bla bla]
Received: (from majordom(_at_)localhost) 
        by above.proper.com (8.12.11/8.12.9/Submit) id j0RHuuq8038984;
        Thu, 27 Jan 2005 09:56:56 -0800 (PST)
Date: Thu, 27 Jan 2005 09:56:56 -0800 (PST)
To: Nigel(_dot_)Swinson(_at_)rockliffe(_dot_)com
From: majordomo(_at_)vpnc(_dot_)org
Subject: Welcome to ietf-mta-filters

Then I would end up with an array with these entries:
$HEADERS['Received'][0] = "from p01m168.mxlogic.net [bla bla] "
$HEADERS['Received'][1] = "from unknown [208.184.76.39] [bla bla] "
$HEADERS['Received'][2] = "from above.proper.com  [bla bla] "
$HEADERS['Received'][3] = "(from majordom(_at_)localhost) "
        . "by above.proper.com (8.12.11/8.12.9/Submit) id j0RHuuq8038984;"
        . "Thu, 27 Jan 2005 09:56:56 -0800 (PST)"
$HEADERS['Date'][0] = "Thu, 27 Jan 2005 09:56:56 -0800 (PST)"
$HEADERS['To'][0] = "Nigel(_dot_)Swinson(_at_)rockliffe(_dot_)com"
$HEADERS['From'][0] = "majordomo(_at_)vpnc(_dot_)org"
$HEADERS['Subject'][0] = "Welcome to ietf-mta-filters"

The entries in the array would have their RFC 2047 encoding removed and be 
stored in UTF-8, and header wrapping would be removed so that the values 
contain no new lines.  The \r\n at the end would be removed also.  (Note that 
the "." string concatenation operator is not available in Sieve; it is used 
above only to demonstrate the removal of header wrapping.)  The implementation 
would be able to evaluate the entries of the associative array lazily, to 
avoid unnecessary processing.
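
Here is a rough Python sketch of how an implementation might build such an 
array, using the standard email package to unfold the headers and remove RFC 
2047 encoding.  The function name build_headers is hypothetical, and for 
brevity it builds everything eagerly rather than lazily.  It keys entries by 
the header name exactly as it appears in the message, which a real 
implementation would probably want to make case-insensitive.

```python
import re
from email import message_from_string
from email.header import decode_header, make_header

# A folded header continues on the next line after a line break plus whitespace.
_FOLD = re.compile(r"\r?\n[ \t]+")


def build_headers(raw_message: str) -> dict:
    """Map each header name to the list of its unfolded, decoded values."""
    headers = {}
    for name, value in message_from_string(raw_message).items():
        unfolded = _FOLD.sub(" ", value)  # remove header wrapping
        decoded = str(make_header(decode_header(unfolded)))  # undo RFC 2047
        headers.setdefault(name, []).append(decoded)
    return headers
```

For the message above, build_headers(raw)["Received"] would be a list of four 
strings and build_headers(raw)["Subject"][0] would be the subject text.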

When doing string expansion, if an array is used, the entries in the array 
would be appended together.  The result would of course not include the \r\n 
that was in the original headers, nor the header wrapping, which means it would 
be entirely unsuitable for use as email headers.  If in future we need a way of 
turning an array into a set of headers, then we can offer a parameter to the 
action that would take the array, and the implementation would turn it into a 
valid header set.

With the above strategy, we now have this:
set "Message" "[To: ${HEADERS['To']} From: ${HEADERS['From']}] ${HEADERS['Subject']}";
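
To make the expansion rule concrete, here is a small Python sketch of how 
references of the form ${NAME['key']} might be expanded, with the entries of 
the selected array appended together.  The reference syntax handled is only 
the one used in this message, the function name is hypothetical, and I've 
assumed that an unknown array or key expands to the empty string.

```python
import re

# Matches references such as ${HEADERS['Subject']} (assumed syntax).
_REFERENCE = re.compile(r"\$\{(\w+)\['([^']*)'\]\}")


def expand(template: str, arrays: dict) -> str:
    """Replace each array reference with its entries appended together."""

    def replace(match):
        array_name, key = match.groups()
        # Assumption: an unknown array or key yields the empty string.
        entries = arrays.get(array_name, {}).get(key, [])
        return "".join(entries)

    return _REFERENCE.sub(replace, template)
```

Note that a header with several entries, such as Received, expands to all of 
them run together with no separator, which is the behaviour described above.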

I then wonder if we should have other arrays like $ENV or $MESSAGE or 
$ENVELOPE, which could hold information like the name of the mail server, the 
name of the mailbox, the domain, the time zone, the spam score, or virus score.  
It could be the way that extensions provide their variables to us, rather than 
what is suggested in the variables draft:
  
4.  Interpretation of strings
   Namespaces are meant for future extensions which make internal state
   available through variables.  These variables SHOULD be put in a
   namespace with the same name as its capability string.  Notice that
   the user can not specify a namespace when setting variables with SET.

This does mean that we could do header tests by using the string test, and the 
$HEADERS array, and I guess that's not a bad thing, but I'm not sure if it's a 
good thing either...

Ah, it feels better to get all that down on paper :o)

Nigel