
Re: draft-ietf-sieve-3028bis-08.txt

2006-07-26 03:49:05

On Tue, Jul 25, 2006 at 06:17:42PM +0200, Kjetil Torgrim Homme wrote:
> speaking of CRLF, I'd like a clarification of multi-line strings in
> section 2.4.2 (some of its text duplicates the above, I'm not sure
> that's good).  something like:
>
>    Any CRLF before the final period are considered part of the string.
>
> to make it a little more clear that implementations should NOT change
> the CRLF into its local line delimiter sequence.

I second that.  Although obvious, once you think about it, it might
nevertheless save implementors some time.  CRLF line breaks in string
literals cause the string to contain CRLF, nothing else.
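
To illustrate, here is a minimal Python sketch of how a parser might read
a multi-line string while keeping every CRLF verbatim.  The name
read_multiline is hypothetical, the input is assumed to be the octets
that follow the "text:" line, and dot-unstuffing is simplified; none of
this is text from the draft.

```python
def read_multiline(body: bytes) -> bytes:
    """Return the content of a multi-line string literal.

    Assumption: `body` holds the octets following the "text:" line.
    Hypothetical helper for illustration only.
    """
    out = []
    for line in body.split(b"\r\n"):
        if line == b".":
            # A line holding only "." ends the string; it is not content.
            # (Simplification: a final "." without CRLF is accepted too.)
            return b"".join(out)
        if line.startswith(b".."):
            # Undo dot-stuffing: drop one leading dot.
            line = line[1:]
        # Keep the CRLF as CRLF; never swap in the local line delimiter.
        out.append(line + b"\r\n")
    raise ValueError("unterminated multi-line string")
```

With this sketch, read_multiline(b"Hello\r\nworld\r\n.\r\n") yields
b"Hello\r\nworld\r\n": both CRLFs before the final period are part of
the string.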

> I don't like this at all.  keep it simple, force the scripts to be
> encoded in UTF-8, it saves us a lot of grief and edge cases.  to be able
> to express arbitrary octets, add an extension for \x -- I think someone
> volunteered to write text?  if not, I'll be happy to.  note that the
> contents of a string during execution is potentially arbitrary octets
> (even NUL, as made clear in 2.7.2).

If a script contains arbitrary octets in string literals, UTF-8 aware text
processing tools will fail to decode it.  Additionally, even though
that would break a lot, Sieve still couldn't match NUL characters,
so we need an extension for matching truly arbitrary octets anyway.
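
As a rough illustration of the decoding failure, the following Python
sketch checks whether a script is valid UTF-8.  The helper name
is_valid_utf8 and the sample scripts are made up for this example.

```python
def is_valid_utf8(octets: bytes) -> bool:
    """Report whether `octets` decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A script whose string literal is proper UTF-8 text.
clean_script = 'fileinto "Gr\u00fc\u00dfe";'.encode("utf-8")

# The same kind of script with raw octets 0xFF 0xFE in the literal.
raw_script = b'fileinto "\xff\xfe";'
```

Here is_valid_utf8(clean_script) is true and is_valid_utf8(raw_script)
is false, which is the point at which a UTF-8 aware tool would give up.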

I am not so sure something like \x is a good idea.  How should
a comparator working on UTF-8 characters (multibyte sequences) act
when it hits a string containing a code violation? How should "*" and
"?" work in that case? Right now, that can't happen.  Oops.  Actually,
it can: when "?" matches a single octet of a multibyte sequence under
an octet comparator, and you build a new string from the matched part.
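
A small Python sketch of that last case, under the assumption that an
octet-wise match of the pattern "?*" behaves like the bytes regular
expression (.)(.*) below.  The function name first_wildcard_octets is
hypothetical, and the regex translation is mine, not anything a Sieve
implementation is known to do.

```python
import re


def first_wildcard_octets(subject: bytes) -> bytes:
    """Return what "?" captures when "?*" is matched octet by octet.

    Assumption: "?" is one octet and "*" is any run of octets.
    """
    m = re.fullmatch(rb"(.)(.*)", subject, re.DOTALL)
    if m is None:
        raise ValueError("pattern needs at least one octet")
    return m.group(1)


# U+00E9 is encoded in UTF-8 as the two octets 0xC3 0xA9.
captured = first_wildcard_octets("\u00e9".encode("utf-8"))
```

The captured part is the lone lead octet 0xC3, so any new string built
from it is no longer valid UTF-8.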

In past discussions, we spoke of a new match that could take a hex
string as argument, and agreed the first person needing it badly will
write the draft.  As with matching raw headers, nobody did. :)

