procmail

Re: invalid regexp?

2001-12-13 21:56:51
Paul asked,

| but it's giving me an error:
|
| procmail: Invalid regexp "^\/[a-z0-9-]+[^(-[0-9]+)]"
|
| What's wrong with it?  It doesn't look particularly wrong to me....

You're trying to nest brackets inside other brackets, and that is going to
give you unexpected results; by messing around with parentheses in the same
place, you actually lucked out and made the regexp invalid.  If you hadn't
done that, you'd have had a valid one that meant something other than what
you thought, and you'd have gone nuts figuring out why procmail was happily
doing the wrong thing.  The truth, Paul, is that you dodged a bullet.

Let's look at the problem part:

[^(-[0-9]+)]

You open a range, negate it, include (actually, exclude, because it's
negated) the range from left parenthesis (which is literal inside brackets,
though I doubt you meant it to be) through left bracket (which is also
literal inside brackets, and that isn't what you expected); then you exclude
(because the range is negated) the range from zero to nine -- redundantly,
because you already covered everything from left parenthesis through left
bracket.  Then you close the range with a right bracket, which you apparently
thought would somehow be parallel to the second left bracket, but it isn't.
The plus sign after it is a quantifier: one or more characters from outside
the range.

Then you have an unmatched right parenthesis.  Bingo.  The left parenthesis
is taken literally because it is between brackets (the first left bracket
and the first, not the second, right bracket), so the unescaped right
parenthesis that is outside brackets must act as a grouper, and it has no
left parenthesis to match it.  That's why the regexp is invalid.

Finally you have a right bracket with no left bracket, which is taken
literally, but by then things are so messed up that it no longer matters.
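
To see that parse mechanically, here is a rough Python sketch.  It is not
procmail's code: split_fragment is a made-up helper that applies only the one
rule described above (inside brackets, the first right bracket closes the
set), and it ignores backslash escapes and a right bracket placed first in a
set.  The cross-check at the end uses Python's re module, which is a
different engine from procmail's but happens to read this fragment the same
way.

```python
import re


def split_fragment(pattern):
    # Hypothetical helper: split a regexp into bracket expressions and
    # single characters.  Simplification: the first ']' after a '['
    # always closes the set; escapes are not handled.
    tokens = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            close = pattern.find("]", i + 1)
            if close != -1:
                tokens.append(pattern[i:close + 1])
                i = close + 1
                continue
        # Anything outside brackets (or an unclosed '[') is one token.
        tokens.append(pattern[i])
        i += 1
    return tokens


tokens = split_fragment("[^(-[0-9]+)]")
print(tokens)

# '(' is 0x28 and '[' is 0x5B in ASCII, so the range "(-[" already
# spans every digit; the later "0-9" adds nothing.
digits_covered = all("(" <= d <= "[" for d in "0123456789")
print(digits_covered)

# Cross-check with Python's re (an assumption-laden stand-in, not procmail):
# the stray ')' outside the brackets should make compilation fail.
try:
    re.compile("[^(-[0-9]+)]")
    python_accepts = True
except re.error:
    python_accepts = False
print(python_accepts)
```

The token list comes out as one bracket expression, then the plus sign, the
unmatched right parenthesis, and the leftover right bracket, in that order.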

So what should you have instead?  You said,

| I have a recipe with a condition that's supposed to strip /-[0-9]+/ off
| the end of MATCH.

That's ambiguous.  Are you trying to remove all hyphens, plus signs, and
digits from the right end of $MATCH and make sure it ends with a character
not among those twelve ...

 # lose everything after the last character that is neither a hyphen,
 # a digit, nor a plus sign
 * MATCH ?? ^^\/.*[^-0-9+]

or are you trying to remove a string from the end that consists specifically
of a hyphen followed by one or more numerals ... even if the last character
before that hyphen is itself a digit or a plus sign?

 # lose everything after the last hyphen
 * MATCH ?? ^^\/.*-
 # lose everything after the last remaining character that isn't a hyphen
 * MATCH ?? ^^\/.*[^-]

If the last character before the hyphen to be stripped is another hyphen and
you want to keep it, I can't think of a way to do it within procmail --
well, maybe, but it would be really messy and we'd have to know in advance
how many consecutive closing hyphens there could be.
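
To compare the two readings on sample data, the recipes above can be imitated
in Python.  This is only an approximation under assumptions: Python's
re.match stands in for ^^\/ (anchor at the start, keep what matched), the
function names and sample strings are made up, and the values are assumed to
be single lines.

```python
import re


def strip_trailing_set(value):
    # Imitates the single condition with .*[^-0-9+] : keep everything up to
    # the last character that is not a hyphen, digit, or plus sign.
    m = re.match(r".*[^-0-9+]", value)
    return m.group(0) if m else None  # None: the condition would fail


def strip_after_last_hyphen(value):
    # Imitates the two chained conditions: first keep through the last
    # hyphen, then keep through the last character that is not a hyphen.
    m = re.match(r".*-", value)
    if m is None:
        return None
    m = re.match(r".*[^-]", m.group(0))
    return m.group(0) if m else None


samples = ["list-42", "v2-7", "foo--12"]
results = {s: (strip_trailing_set(s), strip_after_last_hyphen(s))
           for s in samples}
for s, pair in results.items():
    print(s, pair)
```

On these samples the two readings agree for list-42 but differ for v2-7 (v
versus v2), and the second reading loses both hyphens of foo--12, which is
the doubled-hyphen case mentioned above.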

I'm assuming that your version of procmail is recent enough to allow
rematching on the contents of $MATCH.



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
