Re: Poll: consensus to change the encoded-character extension


On Tue, 2007-04-10 at 19:12 +0000, Aaron Stone wrote:

Something like this:

   encoded-character    = "${" encoded-char-scheme ":" encoded-char-seq "}"
   encoded-char-scheme  = hex / unicode
   encoded-char-seq     = *(LWSP WSP 1*HEXDIG) LWSP


If we allow ${hex:100} in the grammar, we need to say something in the
text about the valid range.  I would prefer to stick to separate
productions for encoded-arb-octets and encoded-unicode-char to keep the
text simple and to minimise the change to the text.

Note that LWSP is optional by definition,


ouch, good catch!

so we have to include SP or WSP
to force some kind of separator between 1*HEXDIG's. Note that this is not
valid according to the syntax above,

${unicode:
123
ABC
}

...because 123 and ABC do not have WSP between them. Use WSP / CR / LF? Is
there some variant of LWSP that mandates at least one character of
something be present?
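
To make the problem concrete, here is a rough Python sketch that transcribes
the encoded-char-seq production above into a regular expression. It assumes
the core ABNF definitions WSP = SP / HTAB and LWSP = *(WSP / CRLF WSP); the
name old_seq_matches is made up for illustration and is not part of any
draft.

```python
import re

# Core ABNF rules, transcribed as regex fragments (assumed definitions).
HEXDIG = r"[0-9A-Fa-f]"
WSP = r"[ \t]"
LWSP = rf"(?:{WSP}|\r\n{WSP})*"   # LWSP = *(WSP / CRLF WSP)

# encoded-char-seq = *(LWSP WSP 1*HEXDIG) LWSP
ENCODED_CHAR_SEQ = re.compile(rf"(?:{LWSP}{WSP}{HEXDIG}+)*{LWSP}")


def old_seq_matches(body: str) -> bool:
    """Return True if body (the text between ':' and '}') fits the production."""
    return ENCODED_CHAR_SEQ.fullmatch(body) is not None


# Numbers separated by plain spaces are accepted.
assert old_seq_matches(" 123 ABC ")
# One number per line, with no WSP after each CRLF, is rejected.
assert not old_seq_matches("\r\n123\r\nABC\r\n")
```

Under this reading, a line break is only accepted when the following line
starts with at least two WSP characters before the number (one consumed by
LWSP, one by the mandatory WSP), and a bare CRLF before the closing "}" is
rejected as well.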


LWSP requires WSP after CRLF, too, so it's simply not what we want. We
need to add another basic terminal, perhaps:

   blank = WSP / CRLF

I suggest we stick to the poll question from Alexey, but with "1*blank"
replacing LWSP in his suggested new text.
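
As a rough illustration of how "1*blank" behaves as a separator, here is a
small Python sketch. The exact surrounding production is not quoted in this
message, so the shape used here (optional blanks, then hex numbers separated
by 1*blank, then optional blanks) is an assumption for illustration only,
and parse_blank_separated is a made-up name.

```python
import re
from typing import List, Optional

HEXDIG = r"[0-9A-Fa-f]"
BLANK = r"(?:[ \t]|\r\n)"   # blank = WSP / CRLF

# Assumed shape: *blank 1*HEXDIG *(1*blank 1*HEXDIG) *blank
SEQ = re.compile(rf"{BLANK}*{HEXDIG}+(?:{BLANK}+{HEXDIG}+)*{BLANK}*")


def parse_blank_separated(body: str) -> Optional[List[int]]:
    """Return the hex numbers in body, or None if body does not match."""
    if SEQ.fullmatch(body) is None:
        return None
    # After validation, every run of hex digits is one number.
    return [int(h, 16) for h in re.findall(rf"{HEXDIG}+", body)]


# One number per line is now accepted, because CRLF alone counts as a blank.
assert parse_blank_separated("\r\n123\r\nABC\r\n") == [0x123, 0xABC]
# A non-hex character is still rejected.
assert parse_blank_separated("12G") is None
```

Note that "123ABC" without any blank parses as a single number, which is
why the separator has to be mandatory rather than optional.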

I think there are three options for values that are out of range:

 1. Throw an error and reject the script.
 2. Ignore the offending value.
 3. Insert some placeholder like ' ' or '?'.
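
The three options above can be sketched in Python as follows. The notion of
"out of range" used here (anything that is not a Unicode scalar value, i.e.
surrogates or values above 10FFFF) is an assumption made for the example,
and the names Policy and decode_unicode are hypothetical.

```python
from enum import Enum
from typing import Iterable


class Policy(Enum):
    REJECT = 1        # option 1: throw an error and reject the script
    IGNORE = 2        # option 2: ignore the offending value
    PLACEHOLDER = 3   # option 3: insert a placeholder such as '?'


def is_scalar_value(cp: int) -> bool:
    """Assumed validity rule: Unicode scalar values only."""
    return 0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


def decode_unicode(values: Iterable[int], policy: Policy,
                   placeholder: str = "?") -> str:
    out = []
    for cp in values:
        if is_scalar_value(cp):
            out.append(chr(cp))
        elif policy is Policy.REJECT:
            raise ValueError(f"code point out of range: {cp:X}")
        elif policy is Policy.PLACEHOLDER:
            out.append(placeholder)
        # Policy.IGNORE: silently drop the value.
    return "".join(out)


values = [0x41, 0x110000, 0x42]   # 'A', an out-of-range value, 'B'
assert decode_unicode(values, Policy.IGNORE) == "AB"
assert decode_unicode(values, Policy.PLACEHOLDER) == "A?B"
```

With Policy.REJECT the same input raises ValueError, which corresponds to
failing the whole script rather than producing a modified string.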


I don't think we need to revisit this question.

I concur that comments should not be allowed.


-- 
Kjetil T.
