procmail

Re: invalid regexp?

2001-12-13 23:10:35

Whew.  Damn, you're good.  ;-)

On Thu, Dec 13, 2001 at 10:41:54PM -0600, David W. Tamkin wrote:

> I'm assuming that your version of procmail is recent enough to allow
> rematching on the contents of $MATCH.

Yup.

> Let's look at the problem part:
>
>   [^(-[0-9]+)]

Here was my take on it.

I open a range, then negate its contents.  The contents are then an atom
(in parentheses).  The atom consists of a hyphen, followed by a run of
digits.  This sprang from

        * MATCH ?? ^\/[a-z0-9-]+[^@]

which ... well, you know.  The idea was to replace [^@] with something
that would match "what I don't want to see".  I would rebuild MATCH as
consisting of everything UP TO the last run of hyphen-followed-by-run-
of-digits.  So as I said, "foo-bar-123" would become "foo-bar", and
"bleh-987" would be converted to "bleh".

As you say, the whole thing is perturbed by the fact that the first
hyphen is interpreted as denoting a range from "(" to "[".  I hereby
acknowledge that I got lucky by creating a regexp that was actually
*broken* rather than merely misinterpretable.  :-)
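
To see how that bracket expression gets read, here is a minimal sketch.
It assumes POSIX grep in the C locale as a stand-in for procmail's own
regexp engine, which may differ in detail; is_outside_set is a
hypothetical helper name.  It tests only the bracket expression
[^(-[0-9], which is where the set closes (at the first "]").

```shell
# Assumption: POSIX grep in the C locale stands in for procmail's engine.
# is_outside_set is a hypothetical helper, not part of procmail.
is_outside_set() {
    # Succeeds when the single character $1 is matched by [^(-[0-9]
    printf '%s\n' "$1" | LC_ALL=C grep -q '^[^(-[0-9]$'
}

for ch in a A 5 - '('; do
    if is_outside_set "$ch"; then
        echo "'$ch' is matched by the negated set"
    else
        echo "'$ch' is excluded (inside the ( to [ range, or a digit)"
    fi
done
```

In ASCII the range from "(" to "[" already covers the hyphen, the digits
and the uppercase letters, so of the characters tried here only the
lowercase "a" is matched by the negated set.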

Which leaves me where?

> | I have a recipe with a condition that's supposed to strip /-[0-9]+/ off
> | the end of MATCH.

> That's ambiguous.  Are you trying to remove all hyphens, plus signs, and
> digits from the right end of $MATCH and make sure it ends with a character
> not among those twelve ...

Sorry, I thought the examples would mitigate the ambiguity.  What I want
could be achieved with `sed 's/-[0-9][0-9]*$//'`.  But I'd obviously
rather not launch sed if I can help it.
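
For reference, a small sketch of that sed command applied to the two
examples from earlier in this message.  strip_numeric_suffix is a
hypothetical wrapper name used only for illustration.

```shell
# strip_numeric_suffix is a hypothetical wrapper around the sed command above.
strip_numeric_suffix() {
    printf '%s\n' "$1" | sed 's/-[0-9][0-9]*$//'
}

strip_numeric_suffix "foo-bar-123"   # prints foo-bar
strip_numeric_suffix "bleh-987"      # prints bleh
strip_numeric_suffix "foo-bar"       # no numeric suffix, printed unchanged
```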

I guess we've sort of come back to my query of about a week ago with the
subject of "stripping MATCH's right hand side", which Martin McCarthy
replied to with "I can't think of a pure procmail solution off the top
of my head."  I ended up on the silly tangent of trying to set SHELL=sed.

> or are you trying to remove a string from the end that consists specifically
> of a hyphen, a numeral, and a plus sign ... even if the last character
> before that hyphen is itself a digit or a plus sign?

Well, remove a string from the end that consists specifically of a
hyphen and one or more numerals.  The plus sign was supposed to be
part of the regexp.  ;)

>  # lose everything after the last hyphen
>  * MATCH ?? ^^\/.*-
>  # lose everything after the last remaining character that isn't a hyphen
>  * MATCH ?? ^^\/.*[^-]

Yup, that's about it.
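
The effect of those two conditions can be sketched outside procmail.
This assumes sed's greedy ".*" behaves like the right-hand side of
procmail's "\/" (which takes the longest match); the function name
strip_after_last_hyphen is hypothetical.

```shell
# Assumption: sed's greedy .* approximates procmail's \/ extraction.
# strip_after_last_hyphen is a hypothetical name for illustration.
strip_after_last_hyphen() {
    printf '%s\n' "$1" |
        sed -e 's/^\(.*-\).*$/\1/' \
            -e 's/^\(.*[^-]\).*$/\1/'
    # first -e:  keep everything up to and including the last hyphen
    # second -e: keep everything up to the last non-hyphen character
}

strip_after_last_hyphen "foo-bar-123"   # prints foo-bar
strip_after_last_hyphen "bleh-987"      # prints bleh
strip_after_last_hyphen "foo-bar"       # prints foo
```

Unlike the sed one-liner, this does not check that the stripped tail is
numeric, so "foo-bar" loses "-bar" as well.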

> If the last character before the hyphen to be stripped is another hyphen and
> you want to keep it, I can't think of a way to do it within procmail --
> well, maybe, but it would be really messy and we'd have to know in advance
> how many consecutive closing hyphens there could be.

Not an issue.  As long as multiple MATCH recalculations are less costly
than a subshell to run sed, then that's the way I'll go.

Thanks again!  :)

-- 
  Paul Chvostek                                  <paul(_at_)it(_dot_)ca>
  Operations / Development / Abuse / Whatever       vox: +1 416 598-0000
  IT Canada                                            http://www.it.ca/
