procmail

Re: Domain based sorting

2011-08-18 14:17:36
At 11:14 2011-08-18, LuKreme wrote:
given Sean's CLEAN_FROM and FROM_DOMAIN I tried:

:0
* FROM_DOMAIN ?? .*\/([^\.]+)

break it down:
        .*              match zero or more of anything
        \/              start match capture
        ([^\.]+)        match one or more of anything NOT a dot

that regexp is intended to grab the first domain token.

Since the front isn't anchored, there's no real need for the .* before the match operator (though if you drop it, owing to some parsing issues in procmail, you should put an empty () before the match trigger). Now, if you KNOW there's a hostname (or you're going by your presumption that you won't be dealing with domain.co.uk style domains), you could count dots.

Try:

* FROM_DOMAIN ?? ()\/[^\.]+\.[^\.]+$

That'll capture the last two nodes of a domain specification:

        mail.domain.tld -> domain.tld
        domain.tld -> domain.tld
        foo.mail.domain.tld -> domain.tld
        host.demon.co.uk -> co.uk                       (!!!)
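
Put together as a complete recipe, that condition might look like the sketch below. It assumes FROM_DOMAIN has already been set earlier in the rcfile; ROOT_DOMAIN is simply the variable name used elsewhere in this thread, and the recipe inherits the co.uk problem shown above.

```procmail
# Sketch only: assumes FROM_DOMAIN was set earlier in the rcfile,
# e.g. to "mail.domain.tld".
:0
* FROM_DOMAIN ?? ()\/[^\.]+\.[^\.]+$
{ ROOT_DOMAIN=$MATCH }
# With the example above, ROOT_DOMAIN should end up as "domain.tld".
# A two-part suffix such as "host.demon.co.uk" would yield "co.uk".
```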

which works if the from domain is "domain.tld" but fails if the domain is "mail.domain.tld" (I get "mail")

So, I tried to anchor it to the end:

:0
* FROM_DOMAIN ?? .*\/([^\.]+)\....?$
{ ROOT_DOMAIN = $MATCH }

but that always gives me "domain.tld" which confuses me because I thought the () match gave that portion to $MATCH

No, EVERYTHING matching the regexp after \/ is stored in $MATCH. Parens merely group (say for a zero or more, or a series of or's):
        (mail\.)?
        (foo|bar|baz)\.
(as examples of syntax, not as examples of anything you should be employing for your present task).

If you anchor to the end of the string, then you're ensuring that the match (and so $MATCH) runs all the way to the end, assuming the earlier parts of the regexp match.
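
Since $MATCH always holds everything after \/, one way to get just the label in front of the TLD is a second pass over the value you already captured. This is a rough sketch rather than a tested recipe: it assumes ROOT_DOMAIN already holds something like "domain.tld", and DOMAIN_LABEL is a made-up variable name for illustration.

```procmail
# Sketch only: assumes ROOT_DOMAIN already holds "domain.tld".
# ^^ anchors to the very start of the variable; the part after \/
# is matched greedily, so [^.]+ stops at the first dot.
:0
* ROOT_DOMAIN ?? ^^\/[^.]+
{ DOMAIN_LABEL=$MATCH }
# DOMAIN_LABEL (hypothetical name) should now be "domain".
```

The point is that the trimming comes from where the regexp stops matching, not from the parentheses.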

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail
