Re: Implementing encoded-character

Hmm, right, variables contain no arguments and we don't have functions
yet.  Thinking about string expressions, I certainly would like to
have CRLF as white space, but I also would like embedded comments in
that case.  Just looking at encoded-character, I see no need for CRLF
and even have an odd feeling about it, but considering it as a syntactic
prototype for string expressions, both CRLF and comments sound useful.


I kind of like the idea of things that look like variables but are
functions operating on the right side of the colon.

We had a bit of discussion in Prague about list expansions that access
external data sources. This would certainly be one way to handle it,
though we'd have to be careful about strict vs. lazy evaluation. Anyhow,
that should probably be the subject of a separate thread.


I suggest looking at Exim and the Exim filter, and their string
expressions, as a live, working example that's very similar.

"${unicode:200000}" -> error
"${unicode:2000000}" -> "${unicode:2000000}"


Ugh, if it looks like encoded-char and walks like encoded-char...


That's the point.

My test implementation left-shifts the current value of the encoded
character by four bits, then adds the next hex digit. When it hits
whitespace, it checks whether the value is within appropriate bounds;
if so, it stores the character and loops; if not, it stores '?' and
loops. Would we really rather be very strict about this? I'm in favor
of some flexibility.
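
For reference, the loop described above might look roughly like the
sketch below. It is not the actual test implementation: the names
decode_lenient, hex_value and in_bounds are made up for illustration,
the output is an array of code points rather than encoded bytes, and
"appropriate bounds" is assumed to mean a Unicode scalar value.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Assumed bounds: at most 0x10FFFF and not a surrogate. */
static int in_bounds(uint32_t v)
{
    return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
}

/*
 * Lenient decoder: reads whitespace-separated hex numbers from s and
 * writes one code point per number to out (at most max entries),
 * substituting '?' for values that are out of bounds.
 * Returns the number of code points written, or -1 on a character that
 * is neither a hex digit nor whitespace, or if out is too small.
 */
static int decode_lenient(const char *s, uint32_t *out, size_t max)
{
    size_t n = 0;
    uint32_t value = 0;
    int have_digits = 0;

    assert(s != NULL && out != NULL);
    for (;; s++) {
        int d = hex_value((unsigned char)*s);
        if (d >= 0) {
            /* Shift in the next nibble; this wraps silently once more
             * than eight digits have been read. */
            value = (value << 4) + (uint32_t)d;
            have_digits = 1;
            continue;
        }
        if (*s != '\0' && !isspace((unsigned char)*s))
            return -1;
        if (have_digits) {
            if (n == max)
                return -1;
            out[n++] = in_bounds(value) ? value : (uint32_t)'?';
            value = 0;
            have_digits = 0;
        }
        if (*s == '\0')
            return (int)n;
    }
}
```

With this sketch, "41 200000" yields 'A' followed by '?'. Note that a
nine-digit input such as "100000041" wraps around in the 32-bit
accumulator and also comes out as 'A'.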


Given that you use C and unsigned integers, or signed ones on a common
architecture where overflows are ignored, you already have a problem:
a long enough digit string wraps around and can land back inside the
valid range.  You could stop accumulating after the 6th nibble, keep
parsing the remaining digits, and then generate an overflow error if
there really are more.
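
A minimal sketch of that idea, under some assumptions of mine: the
names parse_hex_capped, nibble_value and the status codes are invented
for illustration, leading zeros are not counted as nibbles, and the
range check assumes Unicode scalar values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch. */
enum parse_status { PARSE_OK, PARSE_SYNTAX, PARSE_RANGE, PARSE_OVERFLOW };

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int nibble_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Parses a NUL-terminated run of hex digits.  At most six significant
 * nibbles are accumulated, so value can never exceed 0xFFFFFF and the
 * accumulator cannot wrap.  Further digits are still consumed, but
 * only recorded as an overflow.
 */
static enum parse_status parse_hex_capped(const char *s, uint32_t *out)
{
    uint32_t value = 0;
    int significant = 0;    /* nibbles accumulated so far */
    int overflow = 0;
    int d;

    assert(s != NULL && out != NULL);
    if (nibble_value((unsigned char)*s) < 0)
        return PARSE_SYNTAX;            /* need at least one digit */
    for (; (d = nibble_value((unsigned char)*s)) >= 0; s++) {
        if (significant == 0 && d == 0)
            continue;                   /* assumption: skip leading zeros */
        if (significant == 6) {
            overflow = 1;               /* keep parsing, stop accumulating */
            continue;
        }
        value = (value << 4) | (uint32_t)d;
        significant++;
    }
    if (*s != '\0')
        return PARSE_SYNTAX;            /* trailing non-hex character */
    if (overflow)
        return PARSE_OVERFLOW;
    if (value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF))
        return PARSE_RANGE;
    *out = value;
    return PARSE_OK;
}
```

Here "200000" is reported as out of range, while "100000041" is
reported as an overflow instead of silently wrapping to 0x41.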

I suggest making the specification a bit more flexible and having
implementations obey it strictly.

Michael