ietf-mta-filters

Re: Notify extension to Sieve

2001-07-14 14:40:44

Simple solution to text[n] possibly not ending in a CRLF:

        It includes the first n-2 characters followed by a CRLF.
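A rough sketch of that rule in Python, just to make it concrete. The function name text_n_with_crlf is made up for illustration, and dropping a dangling CR when the cut lands inside an existing CRLF is my own assumption, not something the draft says.

```python
def text_n_with_crlf(body: str, n: int) -> str:
    """Return the first n-2 characters of body followed by CRLF (hypothetical helper)."""
    if n < 2:
        raise ValueError("n must be at least 2 to leave room for the CRLF")
    head = body[: n - 2]
    # Assumption: if the cut split an existing CRLF pair, drop the bare CR
    # so the result never ends in CR CR LF.
    if head.endswith("\r"):
        head = head[:-1]
    return head + "\r\n"
```

The result is never longer than n characters and always ends in exactly one line terminator.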

Also, I think most SMS device length maximums are defined in terms of
bytes, not characters.

        Tony Hansen
        tony(_at_)att(_dot_)com

Nigel Swinson wrote:

The text[n] construct is also fairly problematic. For example, what
happens when the cutoff falls in the middle of a line terminator? What
happens when the number falls in the middle of a multibyte character?

A line-oriented construct would avoid these issues, however, it wouldn't
necessarily work right with message services that have limited text
capacity.


This is a good point. Trying to deal with byte boundaries across different
character sets is a major pain. I agree the output should be UTF-8 and the
'n' in text[n] should refer to n Unicode characters, not n octets (as
currently implied).

btw Thank you to everyone for the comments in the last couple messages.
We'll try to get them integrated and a new version out in the near future.


If there are going to be message services that have a limited text capacity,
and we treat n as "n unicode characters" but then convert those characters
to UTF-8, we have no accurate way of determining in advance the size of the
UTF-8 string.

My UTF-8 is a little rusty, but a character can take anywhere from 1 to 6
bytes, so our output buffer may need to be anything from n to 6n bytes.
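To illustrate how the byte count drifts away from the character count, here is a small Python sketch. The helper name utf8_lengths is hypothetical. Note that Python's encoder never emits more than 4 bytes per character, so this shows the spread but not the 6-byte worst case mentioned above.

```python
def utf8_lengths(text: str) -> list:
    """Return the number of UTF-8 bytes each character of text encodes to."""
    return [len(ch.encode("utf-8")) for ch in text]


# Four characters, but ten bytes once encoded.
sample = "a\u00e9\u6f22\U0001F600"  # 'a', e-acute, a CJK ideograph, an emoji
per_char = utf8_lengths(sample)
total_bytes = sum(per_char)
```

So a buffer sized for n characters has to allow for several times n bytes, depending on what the message contains.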

If we are to use UTF8, which I agree is a good idea, then perhaps we need
some kind of clause that protects the output buffer from being greater than
a certain limit while maintaining character boundaries?

Perhaps:

$text[n]$  - as many characters of the first text/* part, encoded as UTF8,
that will fit in an n byte buffer.

(Yuck?!)
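For what it's worth, the byte-buffer variant is not hard to implement. A minimal Python sketch, assuming the text has already been decoded to a string (fit_utf8 is a made-up name):

```python
def fit_utf8(text: str, n: int) -> bytes:
    """Encode text as UTF-8 and keep as many whole characters as fit in n bytes."""
    if n <= 0:
        return b""
    encoded = text.encode("utf-8")
    if len(encoded) <= n:
        return encoded
    cut = n
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx. If the byte just
    # past the cut is one of those, the cut is inside a character, so back up
    # until the cut sits at the start of a character.
    while cut > 0 and (encoded[cut] & 0xC0) == 0x80:
        cut -= 1
    return encoded[:cut]
```

The output may be shorter than n bytes, but it always decodes cleanly.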

Or maybe change the text[n] parameter to max-message-size[n], that is, the
maximum size that the COMPLETE message can be, where the message size
includes $from$, $env-from$, $text$ etc.  This way you would specify how much
data you wanted to receive, then just use the $text$ parameter and the system
will put as much UTF8 message data into the output message buffer as
possible.  The system MUST ensure that truncated messages are truncated at
the start of a character boundary.
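A sketch of how that might look, in Python. Everything here is hypothetical: the function name, the idea that variables are expanded by plain string replacement, and the assumption that $text$ appears at most once in the template. If the fixed parts alone exceed the budget, this sketch does not shorten them.

```python
def render_with_budget(template: str, fields: dict, text: str, max_bytes: int) -> bytes:
    """Expand $name$ variables, then fill $text$ with whatever still fits."""
    out = template
    for name, value in fields.items():
        out = out.replace(f"${name}$", value)
    # Bytes used by everything except the $text$ placeholder itself.
    fixed = len(out.replace("$text$", "").encode("utf-8"))
    room = max(max_bytes - fixed, 0)
    # Cut at the byte budget; errors="ignore" drops a trailing partial
    # character, so the result ends on a character boundary.
    body = text.encode("utf-8")[:room].decode("utf-8", errors="ignore")
    return out.replace("$text$", body).encode("utf-8")
```

With the template "$from$: $text$", from set to "boss" and a budget of 8 bytes, a body starting with a two-byte character in second position gets cut before that character rather than through it.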

On the other hand if we want to just rely on "implementations MAY shorten
the message for technical or aesthetic reasons" then I think that $text[n]$
would be better as n lines, as lines are always \n, and therefore easy to
find and convert in one go, as opposed to finding character boundaries in
variable byte-to-character charsets like UTF8.
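The line-based version can work directly on the raw bytes, because every byte of a multibyte UTF-8 character is 0x80 or above and so can never be mistaken for \n. A minimal Python sketch (first_n_lines is a made-up name, and it assumes lines are separated by a bare \n as described above):

```python
def first_n_lines(data: bytes, n: int) -> bytes:
    """Return the first n lines of UTF-8 encoded data, without decoding it."""
    if n <= 0:
        return b""
    # At most n splits gives at most n+1 parts; the last part is the remainder.
    parts = data.split(b"\n", n)
    return b"\n".join(parts[:n])
```

No knowledge of character boundaries is needed, though the result has no upper bound on its size in bytes.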

Incidentally if "implementations MAY shorten the message for technical or
aesthetic reasons", then we ought to say "Where they do, they MUST truncate
the message before a character boundary".

Just some ideas, I have little experience of the kinds of systems that are
going to receive these notifications...

As a side note, aren't the following examples from section 3.1 missing a
":message"?

            require ["notify","fileinto"];

            if header :contains "from" "boss@example.org" {
                notify :high "This is probably very important";
            }

            if header :contains "to" "sievemailinglist@example.org" {
                notify :low "[SIEVE] $from$: $subject$";
                fileinto "INBOX.sieve";
            }

And also in section 3.1 shouldn't it be ":message" not "message:" ?

    Syntax:   notify [":method" string]
               [":id" string]
               [":options" 1*(string-list / number)]
               [<":low" / ":normal" / ":high">]
               ["message:" string]

Nigel
