Proposal for escaping of non-UTF-8 sequences in Sieve

2006-09-18 02:22:16

Here is a strawman syntax for new quoted strings that would only allow
valid UTF-8 sequences, but would also allow for escaped non-UTF-8
sequences:

  new-quoted-string  = "~" DQUOTE new-quoted-text DQUOTE

  new-quoted-text    = *(utf8-quoted-safe / quoted-special /
                         quoted-non-utf8)

  not-qspecial       = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B / %x5D-7F /
                       UTF8-2 / UTF8-3 / UTF8-4
                         ; a single Unicode character other than NUL,
                         ; CR, LF, double-quote, or backslash,
                         ; represented in UTF-8

  old-quoted-string  = DQUOTE quoted-text DQUOTE

  quoted-non-utf8    = "\" "x" 2*HEXDIG
                         ; represents a hex encoded octet

  quoted-string      = old-quoted-string / new-quoted-string
                         ; "new-quoted-string" is not allowed as
                         ; a parameter to the "require" action.
                         ; "new-quoted-string" is only allowed after
                         ; 'require "utf8-strings";'

  utf8-quoted-safe   = CRLF / not-qspecial
                         ; either a CRLF pair, OR a single UTF-8
                         ; encoded character other than NUL, CR, LF,
                         ; double-quote, or backslash
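
To illustrate how a parser might decode the strawman syntax, here is a
minimal Python sketch. It is not a reference implementation and rests on
several assumptions: that new-quoted-text admits quoted-non-utf8 as its
final alternative; that quoted-special is a backslash followed by a
double-quote or a backslash, as in RFC 3028; and that each \x escape
carries exactly two hex digits, since the comment speaks of a single octet
(the rule as written, 2*HEXDIG, would permit more). The function name
decode_new_quoted_string is hypothetical.

```python
HEXDIG = b"0123456789ABCDEFabcdef"  # ABNF literals are case-insensitive


def decode_new_quoted_string(token: bytes) -> bytes:
    """Return the octets denoted by a ~"..." token (hypothetical helper)."""
    if len(token) < 3 or not token.startswith(b'~"') or not token.endswith(b'"'):
        raise ValueError('expected a token of the form ~"..."')
    body = token[2:-1]
    out = bytearray()
    literal = bytearray()  # current run of unescaped octets

    def flush() -> None:
        # Unescaped text must be valid UTF-8 on its own; decode() raises
        # UnicodeDecodeError (a ValueError subclass) if it is not.
        literal.decode("utf-8")
        out.extend(literal)
        literal.clear()

    i = 0
    while i < len(body):
        octet = body[i]
        nxt = body[i + 1:i + 2]
        if octet == 0x5C:  # backslash starts an escape
            flush()
            if nxt in (b'"', b"\\"):  # quoted-special
                out += nxt
                i += 2
            elif nxt in (b"x", b"X"):  # quoted-non-utf8, assumed 2 hex digits
                pair = body[i + 2:i + 4]
                if len(pair) != 2 or any(c not in HEXDIG for c in pair):
                    raise ValueError("\\x must be followed by two hex digits")
                out.append(int(pair, 16))
                i += 4
            else:
                raise ValueError("unknown escape")
        elif octet == 0x0D and nxt == b"\n":  # CRLF pair
            literal += b"\r\n"
            i += 2
        elif octet in (0x00, 0x0A, 0x0D, 0x22):  # NUL, bare LF/CR, bare DQUOTE
            raise ValueError("octet not allowed unescaped")
        else:
            literal.append(octet)
            i += 1
    flush()
    return bytes(out)
```

With this sketch, the Latin-1 octet E9 has to be written as an escape,
~"caf\xE9", while the same octet appearing raw inside the quotes is
rejected as invalid UTF-8. Validating each unescaped run separately means
an escaped octet cannot complete a partial UTF-8 sequence.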

As quoted strings can already contain multiple lines, I don't think we
need to introduce a new variant of the "multi-line" string. Please let
me know if you disagree.