procmail

Re: Domain based sorting

2011-08-18 13:14:58
LuKreme <kremels(_at_)kreme(_dot_)com> squawked out on Thursday 
18-Aug-2011@04:57:46
One thing I will have to fix is that the FROM_DOMAIN will contain, for 
example, mx3.domain.tld and I want it to contain just “domain”. That’s 
trivial though (and in fact, I may have to check the procmailrc, but it might 
already be grabbed into a variable I’ve forgotten about).

OK, this should have been trivial enough, but my fu has failed me.

Given Sean’s CLEAN_FROM and FROM_DOMAIN, I tried:

:0
* FROM_DOMAIN ?? .*\/([^\.]+)
{ ROOT_DOMAIN = $MATCH }

which works if the from domain is “domain.tld” but fails if the domain is 
“mail.domain.tld” (I get “mail”)

So, I tried to anchor it to the end:

:0
* FROM_DOMAIN ?? .*\/([^\.]+)\....?$
{ ROOT_DOMAIN = $MATCH }

but that always gives me “domain.tld”, which confuses me because I thought the 
() group assigned only that portion to $MATCH.

So, I started to think (dangerous, I know), searched, and found Sean’s post 
from a few years ago about extracting domains in domain.co.uk 
sorts of situations:

Professional Software Engineering 
<PSE-L(_at_)mail(_dot_)professional(_dot_)org> squawked out on Sunday 
25-Jan-2009@14:30:44
# first, match the domain down to JUST the rightmost two tokens
:0
* FROMDOMAIN ?? [@.]?\/[^@.]+\.([^.]+|[^.][^.]\.[^.][^.])$

Again, that implies to me that my second aforementioned recipe should be 
working.

{
       TOPDOMAIN=$MATCH

       # next, get the domain portion - this is everything up to,
       # but not including the first dot.
       :0
       * MATCH ?? ^\/[^.]+
       {
               DOMAIN=$MATCH

And it looks to me like this would result in ‘mail’ as well.

       }

       # we need to fall back to the saved TOPDOMAIN and get the
       # TLD portion - this is everything AFTER the domain and a dot.
       # this implementation allows for two-part TLDs (co.uk for example)
       # because the RHS of this condition includes a variable which
       # needs to be expanded, we use the $ flag on the condition.
       :0
       * $ TOPDOMAIN ?? ^$DOMAIN\.\/.*$
       {
               TLD=$MATCH
       }
}

Now, in this specific case the emails we are trying to match are all in the 
domain.tld format and not the domain.xx.yy format (these are catalog sales and 
she’s not buying anything from the UK or Canada, or if she is, they are using 
.com addresses), but it would be nice to get this right.
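
A minimal sketch of the two-step approach adapted to FROM_DOMAIN, on the understanding that procmail assigns everything matched to the right of \/ to MATCH and that parentheses only group (they do not capture). It assumes FROM_DOMAIN holds a bare host name such as mail.domain.tld, reuses the variable names from the recipes above purely for illustration, and handles only the domain.tld form, not domain.co.uk:

```procmail
# Step 1: keep only the rightmost two dot-separated tokens,
# e.g. mail.domain.tld -> domain.tld
:0
* FROM_DOMAIN ?? [@.]?\/[^@.]+\.[^.]+$
{
        TOPDOMAIN=$MATCH

        # Step 2: match against the saved TOPDOMAIN (not FROM_DOMAIN)
        # and keep everything up to, but not including, the first dot.
        :0
        * TOPDOMAIN ?? ^\/[^.]+
        {
                ROOT_DOMAIN=$MATCH
        }
}
```

With FROM_DOMAIN set to mail.domain.tld this should leave TOPDOMAIN as domain.tld and ROOT_DOMAIN as domain, because the second condition is tested against the already-trimmed value rather than the full host name.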


-- 
It would be a pretty good bet that the gods of a world like this
probably do not play chess and indeed this is the case. In fact no gods
anywhere play chess. They haven't got the imagination. Gods prefer
simple, vicious games, where you Do Not Achieve Transcendence but Go
Straight To Oblivion; a key to the understanding of all religions is
that a god's idea of amusement is Snakes and Ladders with greased rungs.


____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail
