Do implementations have to support encodings of NUL, a la ${hex:0}? If so,
that will be a barrier to this extension being broadly supported.
(I think it should be a "MAY support".)
Note that =00 in quoted printable has to be dealt with already.
An implementation that uses NUL-terminated strings is already broken,
and allowing an encoded NUL character in a string will not make things worse.
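
As a rough illustration of that point (a sketch only, not taken from any
Sieve implementation): quoted-printable "=00" already decodes to a real NUL
octet, so a string type that carries its length handles it the same way it
would handle a hex-encoded NUL. The helper c_string_view is hypothetical and
only mimics what a NUL-terminated representation would keep.

```python
import quopri

# Quoted-printable "=00" decodes to a real NUL octet in the middle of the data.
decoded = quopri.decodestring(b"a=00b")

# The same three octets written as hex digits (two digits per octet is an
# assumption of this sketch, not something the draft text above specifies).
from_hex = bytes.fromhex("61 00 62")


def c_string_view(data: bytes) -> bytes:
    """Hypothetical helper: what a NUL-terminated representation would retain."""
    return data.split(b"\x00", 1)[0]


# A length-carrying string keeps all three octets either way.
same_octets = decoded == from_hex
kept_by_c_string = c_string_view(decoded)
```

Here both decoders yield the same three octets, while the NUL-terminated
view keeps only the first one, regardless of which encoding produced the NUL.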
Which comes first, encoding replacement or variable expansion? Or are
they concurrent? Whatever the answer, the variables I-D will need to make
that clear.
(I think encoding replacement should come before variable expansion.)
I think they should be processed inside-out, but not in separate
passes. Variables introduce the concept of strings (aka test
and action arguments) not being literals, but string expressions that
are evaluated. It makes sense that the arguments of string functions
turn from literals into string expressions, too.
As a result, without variables, "${hex:${hex:4646}}" is a syntax error.
With variables, it is "FF".
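
A minimal sketch of inside-out evaluation in a single recursive pass, under
stated assumptions: only ${hex:...} is handled, two hex digits encode one
octet, and the names evaluate, _eval and the nested flag are invented for
this illustration. Under this reading the inner expression yields the text
FF, which the outer expression then decodes as the single octet 0xFF.

```python
PREFIX = "${hex:"


def _eval(s: str, i: int, nested: bool, inside: bool):
    """Evaluate s from index i and return (text, next_index).

    When inside is True we are within a ${hex:...} argument and stop at
    the closing brace.
    """
    out = []
    while i < len(s):
        if s.startswith(PREFIX, i):
            if inside and not nested:
                # Without string expressions the argument must be a literal.
                raise ValueError("nested ${hex:...} is a syntax error")
            # Evaluate the argument first (inside-out), then decode it.
            arg, i = _eval(s, i + len(PREFIX), nested, True)
            out.append(bytes.fromhex(arg).decode("latin-1"))
        elif inside and s[i] == "}":
            return "".join(out), i + 1
        else:
            out.append(s[i])
            i += 1
    if inside:
        raise ValueError("unterminated ${hex:...}")
    return "".join(out), i


def evaluate(s: str, nested: bool = True) -> str:
    """Expand ${hex:...}; nested=False models the case without variables."""
    return _eval(s, 0, nested, False)[0]
```

Calling evaluate("${hex:${hex:4646}}") returns a one-character string holding
octet 0xFF, while evaluate("${hex:${hex:4646}}", nested=False) raises
ValueError, mirroring the syntax error described above.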
The next rev of the base-spec includes this:
Extensions MUST NOT change the behavior of the "require"
control command.
I believe that means that this extension can't be used in the capability
argument to 'require', so there's at least one place where Unicode
characters can't be encoded using ${unicode:...}.
That's right. The require argument cannot contain variables either,
and I consider that a good thing [tm].
I'll note that while a script that uses this extension for all non-UTF8
octets may be displayable without munging, the result may still be
incomprehensible if, for example, an 8bit ISO-2022-JP MIME part is
included in a string. Indeed, such encoding will probably make it more
difficult to display that MIME part readably: currently, if a user is
viewing the script with raw ISO-2022-JP in it they can probably display
that MIME part by overriding their browser's charset encoding for the
page.
Indeed, a hack like that is not as useful any more. But it was a hack
to begin with, because overriding the charset encoding renders any
UTF-8 text on the same page useless. Convert the ISO-2022-JP part to
UTF-8 instead.
Michael