Per discussion during the meeting yesterday, I'm fleshing out the
options on how to provide character escapes here on the list:
0) do nothing; just repeat that \x means x, period.
PRO: - no changes to the spec
- breaks no implementations...unless they don't conform already
CON: - scripts have no way to get strings that contain NUL or
invalid UTF-8
1) change the base spec to define \xFF, \uXXXX, etc.
PRO: - simplest option to specify that adds escape support
- support is easy and consistent; no need to change string
interpreters in mid-stream
CON: - breaks all implementations
- breaks scripts that use superfluous backslashes
- escapes only usable in quoted strings
- scripts that need escapes can't guarantee they're getting them
(scripts would not be portable between versions)
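
To make the second CON of (1) concrete, here is a minimal Python sketch that contrasts the current "\x means x" rule with one possible reading of option (1). The function names are hypothetical, and the exact escape forms (two hex digits after \x, four after \u) are assumptions, since the proposal only says "\xFF, \uXXXX, etc".

```python
import re

# Assumed escape forms: \xHH (one octet) and \uHHHH (one codepoint).
# Any other backslash pair keeps the current "\x means x" meaning.
_ESCAPE = re.compile(r"\\(?:x([0-9A-Fa-f]{2})|u([0-9A-Fa-f]{4})|(.))", re.DOTALL)


def unescape_base(body: str) -> bytes:
    # Option (0): a backslash followed by any character yields that character.
    return re.sub(r"\\(.)", lambda m: m.group(1), body, flags=re.DOTALL).encode("utf-8")


def unescape_option1(body: str) -> bytes:
    # Option (1), as assumed here: \xHH and \uHHHH are decoded,
    # everything else falls back to the base rule.
    out = bytearray()
    pos = 0
    for m in _ESCAPE.finditer(body):
        out += body[pos:m.start()].encode("utf-8")
        if m.group(1) is not None:
            out.append(int(m.group(1), 16))                  # raw octet, may be invalid UTF-8
        elif m.group(2) is not None:
            out += chr(int(m.group(2), 16)).encode("utf-8")  # surrogates would raise here
        else:
            out += m.group(3).encode("utf-8")
        pos = m.end()
    out += body[pos:].encode("utf-8")
    return bytes(out)


# A superfluous backslash changes meaning between the two rules:
print(unescape_base(r"\x64"))     # b'x64'
print(unescape_option1(r"\x64"))  # b'd'
```

The same quoted-string body decodes to different octets under the two rules, which is the portability problem noted in the last CON.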
2) change the base spec to say that \x maps to x unless overridden
by an extension; extensions may redefine any \x except \\ and
\". Scripts SHOULD NOT contain extraneous escapes. Then, create
an extension which defines \xFF, \uXXXX, etc
PRO: - neither implementations nor scripts broken by the change
- script that needs escapes is guaranteed they're getting them
if they're supported
- implementation is similar to that of variables (or is that a CON?)
CON: - more complicated to specify
- another extension has to be defined and used when needed
- escapes only usable in quoted strings
- does there need to be a registry for the redefinitions
to prevent conflicts between such extensions?
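
The registry question in the last CON of (2) can be sketched as follows. This is only an illustration of the conflict being asked about, not a proposed design; the class name and the extension name "hex-escapes" are hypothetical.

```python
# Escapes that, per option (2), no extension may redefine.
PROTECTED = {"\\", '"'}


class EscapeRegistry:
    """Tracks which (hypothetical) extension has redefined which \\x escape."""

    def __init__(self):
        self._owners = {}  # escape character -> extension name

    def register(self, extension: str, letters: str) -> None:
        # Validate everything first so a failed call leaves no partial state.
        for letter in letters:
            if letter in PROTECTED:
                raise ValueError(f"\\{letter} may not be redefined")
            owner = self._owners.get(letter)
            if owner is not None and owner != extension:
                raise ValueError(f"\\{letter} is already redefined by {owner}")
        for letter in letters:
            self._owners[letter] = extension

    def owner(self, letter: str):
        # None means the base rule applies: \x maps to x.
        return self._owners.get(letter)


registry = EscapeRegistry()
registry.register("hex-escapes", "xu")
print(registry.owner("x"))  # hex-escapes
print(registry.owner("n"))  # None
```

Without some shared table of this kind, two extensions could each claim the same escape character, and a script requiring both would be ambiguous.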
3) define an extension to variables that implicitly creates variables
(in a namespace) for each Unicode codepoint and octet value whose
values are the named codepoints/octets (e.g., ${unicode.00bf}
would contain the UTF-8 representation of U+00BF (inverted
question mark); ${octet.ff} would be the octet with value
255, which is not valid UTF-8)
PRO: - neither implementations nor scripts broken by the change
- script that needs escapes is guaranteed they're getting them
if they're supported
- usable in both quoted strings and multiline literals
- avoids introducing another area of extension (cf. last
CON of (2))
CON: - more complicated to specify
- more annoying/noisy to use
- another extension has to be defined and used when needed
- requires support for and use of variables
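
As a rough illustration of (3), the Python sketch below expands the two example namespaces into octets. The function name is hypothetical, and the accepted digit counts and the handling of out-of-range values are assumptions that the proposal does not settle.

```python
import re

# Matches ${unicode.HEX} and ${octet.HEX}; other ${...} references are left alone.
_VAR = re.compile(r"\$\{(unicode|octet)\.([0-9A-Fa-f]+)\}")


def expand_codepoint_vars(text: str) -> bytes:
    out = bytearray()
    pos = 0
    for m in _VAR.finditer(text):
        out += text[pos:m.start()].encode("utf-8")
        namespace, value = m.group(1), int(m.group(2), 16)
        if namespace == "unicode":
            # UTF-8 representation of the codepoint; chr() rejects values above U+10FFFF.
            out += chr(value).encode("utf-8")
        else:
            if value > 0xFF:
                raise ValueError("octet value out of range")
            out.append(value)  # raw octet, possibly not valid UTF-8
        pos = m.end()
    out += text[pos:].encode("utf-8")
    return bytes(out)


print(expand_codepoint_vars("a${unicode.00bf}b"))  # b'a\xc2\xbfb'
print(expand_codepoint_vars("${octet.ff}"))        # b'\xff'
```

Because expansion happens at the variable layer, the same references work wherever variables are substituted, which is the third PRO above.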
4) define an extension that covers all the changes in the base spec
that are incompatible with RFC 3028. Option (1) would be done
under that extension. If there are no other incompatible changes
then this reduces to (2)
PRO: - neither implementations nor scripts broken by the change
- script that needs escapes is guaranteed they're getting them
if they're supported
CON: - medium complexity?
- another extension has to be defined and used when needed
- escapes only usable in quoted strings
- negative experience with version numbers in IMAP
- no longer revising Sieve base spec but rather defining Sieve v2
Are there other options? Did I miss (or misstate) any PROs or CONs?
Which of these PROs and CONs should be considered important and why?
Philip Guenther
editor