On Sat, 21 Oct 2006, Kjetil Torgrim Homme wrote:
here's my amended 2.4.2.4, which attempts to fix it. the wording is a
bit weasely, but I think it will work in practice. if anyone can fix it
more formally in ABNF, please help out.
...
+ encoded-seq = "${" enc-method ":" enc-argument "}"
+ enc-method = "hex" / "unicode"
...
+ Values for enc-method or enc-argument which don't match the above
+ syntax SHOULD cause a syntax error.
Hmm, that won't work, as there isn't a defined meaning for something to
not match _part_ of a syntax. The only sensible interpretation I can see
would be to match *anything* for enc-method and enc-argument, and then
compare them to their expected forms. The problem with that is that it
would render this string a syntax error:
"${name}: ${value}"
because it's an attempt to use encoded-seq with an enc-method of "name}"
and an enc-argument of " ${value". That's obviously not the desired
result.
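To illustrate that reading, here is a rough sketch in Python. It is not
part of the draft; the regular expression and the helper name
split_anything are made up for this example, and it assumes that "match
anything" means any run of characters, taken as short as possible.

```python
import re

# Hypothetical "match anything" reading: enc-method and enc-argument
# may contain any characters at all, including '}' and '$'.
ANYTHING = re.compile(r"\$\{(.*?):(.*?)\}")

def split_anything(text):
    """Return (enc-method, enc-argument) for the first match, or None."""
    m = ANYTHING.search(text)
    return m.groups() if m else None

# The first '${' pairs up with the colon that follows "${name}".
print(split_anything("${name}: ${value}"))
```

Under those assumptions the enc-method comes out as "name}" and the
enc-argument as " ${value", neither of which matches an expected form.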
To obtain the desired result, we have to give the syntax for all the
sequences that should be covered by the encoded-character extension, both
those that have a defined expansion and those that should be treated as a
syntax error. How broad do we want to make that? The broadest would
cover any sequence matching this:
encoded-seq = "${" enc-method ":" enc-argument "}"
enc-method = *(%x01-39 / %x3b-7c / %x7e-ff)
; zero or more characters other than ':' or '}'
enc-argument = *(%x01-7c / %x7e-ff)
; zero or more characters other than '}'
I.e., any sequence that starts with '${', followed by zero or more
octets other than ':' and '}', followed by a ':', then zero or more
octets other than '}', then finally a '}', would be considered a use of
the encoded-seq syntax.
To put it another way, if you find a '${', and there's at least one colon
between that and the next '}', it's an encoded-seq.
That would leave
"${name}: ${value}"
with its expected value, but make this:
"${name: sdlkfjs}"
or this:
"${a.44:}"
a syntax error. On the other hand, this:
"${hex.ff}"
would _not_ be a syntax error, because it doesn't contain a colon between
the braces. That's correct, of course, because we _want_ it to be a
variable reference instead.
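The four cases above can be checked with a small Python sketch. It is
an illustration only: the regular expression is a hand translation of
the broad ABNF, the names ENCODED_SEQ, KNOWN_METHODS and
has_syntax_error are invented here, and it works on Python strings
rather than octets, so characters above U+00FF are accepted as well.
It only checks the enc-method; validating the enc-argument is left out.

```python
import re

# enc-method:   zero or more characters other than NUL, ':' or '}'
# enc-argument: zero or more characters other than NUL or '}'
ENCODED_SEQ = re.compile(r"\$\{([^\x00:}]*):([^\x00}]*)\}")

# Methods from the quoted enc-method rule. ABNF string literals are
# case-insensitive, so the comparison is done in lower case.
KNOWN_METHODS = {"hex", "unicode"}

def has_syntax_error(text):
    """True if text contains an encoded-seq with an unknown enc-method."""
    return any(
        m.group(1).lower() not in KNOWN_METHODS
        for m in ENCODED_SEQ.finditer(text)
    )

for s in ["${name}: ${value}", "${name: sdlkfjs}", "${a.44:}", "${hex.ff}"]:
    print(repr(s), has_syntax_error(s))
```

In this sketch only "${name: sdlkfjs}" and "${a.44:}" are flagged; the
other two strings contain no encoded-seq at all and are left alone.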
The above is the broadest syntax that makes sense. We could tighten it up
and only cover sequences where the enc-method is, say, alphanumeric. If
we did that, then this:
"${a.44:}"
would not be a syntax error. Opinions?
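For comparison, a tightened variant might look like the sketch below.
It assumes, purely for illustration, that "alphanumeric" means one or
more ASCII letters or digits; TIGHT_SEQ and has_syntax_error_tight are
hypothetical names, and as before only the enc-method is checked.

```python
import re

# Hypothetical tightening: enc-method = 1*(ALPHA / DIGIT)
# enc-argument is unchanged: anything other than NUL or '}'
TIGHT_SEQ = re.compile(r"\$\{([A-Za-z0-9]+):([^\x00}]*)\}")

KNOWN_METHODS = {"hex", "unicode"}

def has_syntax_error_tight(text):
    """True if text contains a tightened encoded-seq with an unknown method."""
    return any(
        m.group(1).lower() not in KNOWN_METHODS
        for m in TIGHT_SEQ.finditer(text)
    )

for s in ["${a.44:}", "${name: sdlkfjs}", "${hex.ff}"]:
    print(repr(s), has_syntax_error_tight(s))
```

With this variant "${a.44:}" is no longer recognized as an encoded-seq,
because of the '.' in "a.44", while "${name: sdlkfjs}" is still flagged.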
Philip Guenther