At 11:14 2011-08-18, LuKreme wrote:
> Given Sean's CLEAN_FROM and FROM_DOMAIN, I tried:
> :0
> * FROM_DOMAIN ?? .*\/([^\.]+)
Break it down:
.*        match zero or more of anything
\/        start match capture
([^\.]+)  match one or more of anything NOT a dot
That regexp is intended to grab the first domain token.
Since the front isn't anchored, there's no real need for .* before
the match operator (though if you drop it, owing to some parsing
issues in procmail, you should put a () before the match
trigger). Now, if you KNOW there's a hostname (or, going by your
presumption that you won't be dealing with domain.co.uk style
domains), you could count dots.
Try:
* FROM_DOMAIN ?? ()\/[^\.]+\.[^\.]+$
That'll capture the last two nodes of a domain specification:
mail.domain.tld     -> domain.tld
domain.tld          -> domain.tld
foo.mail.domain.tld -> domain.tld
host.demon.co.uk    -> co.uk (!!!)
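
As a rough sketch only, that condition could sit in a complete recipe
like the one below. It assumes FROM_DOMAIN has already been assigned
earlier in the rcfile; ROOT_DOMAIN is simply the variable name used in
the original attempt.

```procmail
# Assumes FROM_DOMAIN was assigned earlier in the rcfile.
# Everything matched to the right of \/ ends up in $MATCH,
# which here is the last two dot-separated nodes of the domain.
:0
* FROM_DOMAIN ?? ()\/[^\.]+\.[^\.]+$
{ ROOT_DOMAIN = $MATCH }
```

As the last mapping shows, host.demon.co.uk still comes out as co.uk,
so this only suits setups where such domains aren't expected.
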
> which works if the from domain is "domain.tld" but fails if the
> domain is "mail.domain.tld" (I get "mail")
> So, I tried to anchor it to the end:
> :0
> * FROM_DOMAIN ?? .*\/([^\.]+)\....?$
> { ROOT_DOMAIN = $MATCH }
> but that always gives me "domain.tld" which confuses me because I
> thought the () match gave that portion to $MATCH
No, EVERYTHING matching the regexp after \/ is stored in
$MATCH. Parens merely group (say, to make something optional or
repeatable, or to hold a series of alternatives):
(mail\.)?
(foo|bar|baz)\.
(as examples of syntax, not as examples of anything you should be
employing for your present task).
If you anchor to the end of the string, then you're ensuring you'll
match to the end (assuming the prior regexps match), so $MATCH runs
from the \/ all the way to the end of the string.
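
Since parens can't limit what lands in $MATCH, one way to end up with
only the "domain" portion is a second match against the value already
captured. This is a sketch rather than a tested recipe: LAST_TWO and
ROOT_LABEL are made-up variable names, and FROM_DOMAIN is assumed to
have been set earlier in the rcfile.

```procmail
# Step 1: capture the last two nodes (e.g. domain.tld) into $MATCH
# and keep a copy in LAST_TWO (hypothetical name).
:0
* FROM_DOMAIN ?? ()\/[^\.]+\.[^\.]+$
{
  LAST_TWO = $MATCH

  # Step 2: anchor to the start of LAST_TWO and capture everything
  # up to the first dot, leaving just "domain" in $MATCH.
  :0
  * LAST_TWO ?? ^^\/[^\.]+
  { ROOT_LABEL = $MATCH }
}
```

The same caveat applies as before: for a domain.co.uk style address
the first step yields co.uk, so ROOT_LABEL would come out as "co".
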
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.