Re: status of 3028bis

2005-10-23 17:27:03

On Sun, 2005-10-23 at 23:29 +0100, Dave Cridland wrote:
On Sun Oct 23 22:02:38 2005, Ned Freed wrote:

(1) An octet-based comparator.
(2) A single ? used in isolation with no adjacent *s or ?s.
(3) Well formed UTF-8 as input.

The somewhat surprising result is that ? can only match an ASCII
character. Of course something like ???? can get really interesting and
match anything that encodes down to four octets.

I think you intended to say that "?" can only match a character if it 
is within ASCII - or more generally, if it happens to encode to a 
single octet in UTF-8. But it'll match any octet, of course, whatever 
character it might happen to be part of the encoding for.

Yes, but since the argument to :matches is implicitly anchored, another
wildcard has to follow the ? before it can match part of a multi-octet
character. E.g., with "foo?", only US-ASCII can be matched, since every
non-ASCII character encodes to a multi-octet UTF-8 sequence.
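
To make the octet-level behaviour concrete, here is a minimal Python
sketch of an anchored, octet-based :matches. It is an illustration only:
octet_matches is a hypothetical helper, not taken from any Sieve
implementation, and it ignores backslash quoting of * and ? in patterns.

```python
import re


def octet_matches(pattern: str, value: str) -> bool:
    """Anchored wildcard match over UTF-8 octets (hypothetical helper).

    '*' matches zero or more octets, '?' matches exactly one octet.
    Assumption: backslash quoting of wildcards is not handled.
    """
    parts = []
    for octet in pattern.encode("utf-8"):
        ch = bytes([octet])
        if ch == b"*":
            parts.append(b".*")
        elif ch == b"?":
            parts.append(b".")
        else:
            parts.append(re.escape(ch))
    regex = re.compile(b"".join(parts), re.DOTALL)
    # fullmatch models the implicit anchors at both ends of the pattern.
    return regex.fullmatch(value.encode("utf-8")) is not None


print(octet_matches("foo?", "foox"))        # ASCII final character
print(octet_matches("foo?", "foo\u00e9"))   # e-acute is two octets
print(octet_matches("foo??", "foo\u00e9"))  # two ?s cover both octets
print(octet_matches("foo?*", "foo\u00e9"))  # ? takes the lead octet
```

Under these assumptions the second call fails while the third and fourth
succeed, which is the case where ? ends up matching only part of a
character.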

A construct like:

    require "variables";
    if header :matches "subject" "*" {set "subject" "${1}";}
    else {set "subject" "";}

ends up storing the subject in all caps, which likely isn't what 
was intended.

I think that's a matter of interpretation.

Variables says, in section 3.2, that the list variables expand to 
what the wildcard matched.

I see nothing saying that this must be in the internal transformation 
of the string by a comparator (if such a thing exists), nor that it 
should be those matching portions of the original string, but my gut 
feeling is that a comparator should be essentially a black box - that 
is, the internal transformations of the comparator shouldn't be 
visible to the script.
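
The two readings can be sketched in Python. This is only a model under
stated assumptions: casemap stands in for i;ascii-casemap by upper-casing
ASCII letters, wildcard_captures and its black_box flag are hypothetical
names, and non-greedy wildcards reflect my reading of the variables
draft rather than anything settled in this thread.

```python
import re


def casemap(text: str) -> str:
    """Model of i;ascii-casemap: upper-case ASCII a-z only (assumption)."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)


def wildcard_captures(pattern: str, value: str, black_box: bool = True):
    """Return what each wildcard matched, or None if there is no match.

    black_box=True  -> captures are slices of the original string.
    black_box=False -> captures expose the comparator's internal form.
    """
    regex = "".join(
        "(.*?)" if c == "*" else "(.)" if c == "?" else re.escape(c)
        for c in casemap(pattern)
    )
    m = re.fullmatch(regex, casemap(value), re.DOTALL)
    if m is None:
        return None
    # ASCII case mapping preserves length, so match offsets found in the
    # normalized string are also valid offsets into the original string.
    source = value if black_box else casemap(value)
    return [source[m.start(i):m.end(i)] for i in range(1, m.re.groups + 1)]


print(wildcard_captures("*", "Hello World"))
print(wildcard_captures("*", "Hello World", black_box=False))
```

With black_box=True the capture keeps the subject's original case; with
black_box=False it comes back in all caps, which is the behaviour the
example script above runs into.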

Yes, the behaviour of wildcard matching is under-specified (you might
say "unspecified"), and trying to extrapolate it from the matching
algorithm is probably wrong, especially since no one will want that.

Kjetil T.