
Re: Proposal for escaping on non-UTF-8 sequences in Sieve

2006-10-20 03:06:35

Do implementations have to support encodings of NUL à la ${hex:0}?  If so, 
that will be a barrier to this extension being broadly supported.
(I think it should be a "MAY support".)

Note that =00 in quoted-printable has to be dealt with already.
An implementation that uses NUL-terminated strings is already broken,
and allowing an encoded NUL character in a string will not make things worse.
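
To illustrate the =00 point, here is a minimal Python sketch. The sample
input and the names `decoded` and `c_view` are made up for illustration;
`c_view` only imitates what code that stops at the first NUL would see.

```python
import quopri

# Quoted-printable input containing an encoded NUL (=00).
decoded = quopri.decodestring(b"abc=00def")

# A length-counted byte string keeps everything after the NUL.
full_length = len(decoded)

# Imitation of a NUL-terminated view: everything after the first NUL is lost.
c_view = decoded.split(b"\x00", 1)[0]
```

The same truncation would hit an implementation that stores the result of
${hex:00} in a NUL-terminated buffer, which is why the encoded form adds no
new problem.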

Which comes first, encoding replacement or variable expansion?  Or are 
they concurrent?  Whatever the answer, the variables I-D will need to make 
that clear.
(I think encoding replacement should come before variable expansion.)

I think they should be processed inside-out, but not in separate
passes.  Variables introduce the concept of strings (aka test
and action arguments) not being literals, but string expressions that
are evaluated.  It makes sense that arguments of string functions
turn from literals to string expressions, too.

As a result, without variables, "${hex:${hex:4646}}" is a syntax error.

With variables, it is "FF".
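
A rough sketch of inside-out evaluation in a single recursive pass, in
Python.  This is not taken from any draft: the function names `evaluate` and
`expand` are hypothetical, only ${hex:...} is handled, and it assumes that
the result "FF" above denotes the single octet 0xFF produced once the outer
${hex:...} decodes the text "FF" yielded by the inner one.

```python
def evaluate(s, i=0, nested=False):
    """Evaluate s starting at index i; return (octets, next_index)."""
    out = bytearray()
    while i < len(s):
        if s.startswith("${hex:", i):
            # Evaluate the argument first (inside-out), then decode it.
            arg, i = evaluate(s, i + len("${hex:"), nested=True)
            out += bytes.fromhex(arg.decode("ascii"))
        elif nested and s[i] == "}":
            # End of the argument we were asked to evaluate.
            return bytes(out), i + 1
        else:
            out += s[i].encode("utf-8")
            i += 1
    if nested:
        raise ValueError("unterminated ${hex:...}")
    return bytes(out), i


def expand(s):
    """Expand all ${hex:...} sequences in s and return the octets."""
    return evaluate(s)[0]
```

Here expand("${hex:4646}") gives the two octets "FF", and
expand("${hex:${hex:4646}}") gives the one octet 0xFF.  A parser that only
accepts hex digits as the argument would reject the nested form, which
matches the syntax error in the no-variables case.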

The next rev of the base-spec includes this:
      Extensions MUST NOT change the behavior of the "require"
      control command.
I believe that means that this extension can't be used in the capability 
argument to 'require', so there's at least one place where Unicode 
characters can't be encoded using ${unicode:...}.

That's right.  The require argument cannot contain variables either,
and I consider that a good thing [tm].

I'll note that while a script that uses this extension for all non-UTF-8 
octets may be displayable without munging, the result may still be 
incomprehensible if, for example, an 8bit ISO-2022-JP MIME part is 
included in a string.  Indeed, such encoding will probably make it more 
difficult to display that MIME part readably: currently, if a user is 
viewing the script with raw ISO-2022-JP in it they can probably display 
that MIME part by overriding their browser's charset encoding for the 

Indeed, a hack like that is not as useful any more.  But it was a hack
to begin with, because overriding the charset encoding renders contained
UTF-8 text useless.  Convert the ISO-2022-JP part to UTF-8.
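
The conversion itself is short.  A minimal Python sketch, assuming the part
is available as raw octets; the helper name `to_utf8` and the sample text
are made up for illustration:

```python
def to_utf8(part, charset="iso2022_jp"):
    """Re-encode a MIME part body from the given charset to UTF-8."""
    return part.decode(charset).encode("utf-8")


# Sample ISO-2022-JP octets, produced here only to have something to convert.
sample = "日本語".encode("iso2022_jp")
converted = to_utf8(sample)
```

After that, the string in the script is plain UTF-8 and displays together
with the rest of the script without any charset override.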

