ietf-mta-filters

Re: Matching NUL characters

2003-04-04 01:11:56

[michael(_at_)freenet-ag(_dot_)de]:

  as defined by RFC 3028, Sieve scripts may not contain NUL
  characters.  MIME encoded headers may contain them (encoded),
  though, and there is currently no way to match them with Sieve.
  For that reason I suggest changing paragraph 2.4.2 of the RFC to
  define \0 as a representation of the NUL character.
  
  I know, that's quite a change, but it's quite a bug, too. ;-) Btw,
  neither Mutt nor Outlook Express shows anything after a MIME
  encoded NUL character, so they simply decode it and, given C
  strings, terminate the string that way, because nobody ever
  thought about how to handle them.

since the wording is quite explicit in the Sieve standard, I'd rather
say that this was intentional.  it complicates implementation to
support the NUL character, for little benefit.

  If a sieve implementation is not subject to this bug, then most
  likely it could support \0 pretty easily.  Otherwise some work is
  required anyway to fix the bug.

I suggest you propose an extension.  an implementation that follows
the current RFC should not be rendered incompatible.

  While extending the meaning of \, I suggest using the Java \u
  notation to specify unicode characters.  Sieve scripts are written
  in UTF-8 for good reasons, but being able to use non-ASCII
  characters without having to use UTF-8 would be great.  That's
  just an idea, though, and not really required like \0 is.

in that case I suggest an arbitrary number of hexadecimal digits,
e.g., \u0 == \u00000000.  otherwise we'll get a mess when characters
outside the basic plane are needed.  (encode to UTF-16 yourself, let
the Sieve implementation convert that to UTF-8 -- ugly!)  note that
you can terminate the hex sequence with a backslash.  this never
creates ambiguity, since the only escaped characters are backslash,
doublequote and "u".  e.g.,

  \u0\foo => NUL f o o
  
if you also add the "\0" escape, you can no longer express U+ef
followed by the digit zero, because the sequence would then mean:

  \uef\0 => U+ef NUL

"\u0" is so short that "\0" is superfluous.  the number of escapes
should be kept to a minimum.
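
to make the proposal concrete, here is a minimal sketch in Python of how a
decoder for such strings might behave.  it is an illustration only, not taken
from any Sieve implementation: the name decode_escapes is hypothetical, and it
assumes the RFC 3028 rule that a backslash before any other character simply
yields that character, which is what lets a backslash end the hex run.  treating
a "\u" with no hex digits as a literal "u" is likewise an assumption.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_escapes(text: str) -> str:
    """Decode backslash escapes inside a quoted string (hypothetical helper).

    Handles \\\\, \\" and the proposed \\u followed by one or more hex digits.
    Any other escaped character is passed through unchanged.
    """
    out = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= n:
            raise ValueError("dangling backslash at end of string")
        nxt = text[i + 1]
        i += 2
        if nxt == "u":
            # Consume as many hex digits as follow; the run ends at the
            # first non-hex character, which may be a backslash.
            start = i
            while i < n and text[i] in HEX_DIGITS:
                i += 1
            if i == start:
                # Assumption: "\u" without digits is an undefined escape.
                out.append("u")
            else:
                out.append(chr(int(text[start:i], 16)))
        else:
            # Covers \\ and \" as well as undefined escapes such as \f or \0.
            out.append(nxt)
    return "".join(out)
```

with this sketch, decode_escapes(r"\u0\foo") gives NUL followed by "foo", and
decode_escapes(r"\uef\0") gives U+ef followed by the digit "0", since no "\0"
escape is defined.  a character outside the basic plane needs no surrogates:
decode_escapes(r"\u1d11e") is the single code point U+1D11E.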

-- 
Kjetil T.
