procmail

Re: Separate incoming mail into 4 categories

2007-01-04 14:10:03
At 22:43 2007-01-04 +0800, DR. Lee - NS3 wrote:

> I am trying to develop a simple mechanism to separate the incoming mail
> into four categories by using a from_list and a to_list. The from_list
> contains the fully qualified e-mail addresses of the known senders and
> the to_list contains the addresses of the known recipients.
>
> The logic is as follows:

The Procmail mantra, please recite it with me:

         Procmail is not an MTA.

If someone addresses something with a BCC, you're hosed because you won't 
see the contents of that header.  If the path to you involved the message 
being sent twice (but with the same cleartext addressees), you'll end up 
distributing it twice to BOTH sets of recipients.

It certainly appears that you're trying to manage a mailing list in a 
convoluted fashion.

> If it is in to_list, then
>    it is forwarded to the match_both mailbox if it also matches from_list;
>    otherwise it goes to the match_to mailbox.
> If it is not in to_list but matches from_list, it goes to match_from;
>    otherwise it goes to match_neither.

> Since the To: address sometimes has <> and sometimes does not, the program
> checks both types.

Note that the To: address may contain multiple addresses.  If your recipe 
as written encounters such a header, you're only going to grab the first token.
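
To make the multiple-address point concrete, here is a rough shell sketch of 
pulling every address out of a header value, one per line.  The function name 
split_addrs and the example.com addresses are made up for illustration, and 
it only copes with the simple "Name" <addr> and bare-address forms, not 
everything RFC 2822 allows:

```shell
# split_addrs: print each address found in a header value, one per line.
# Hypothetical helper; handles only quoted display names, <addr> and
# bare addresses.
split_addrs() {
    printf '%s\n' "$1" |
        sed -e 's/"[^"]*"//g' \
            -e 's/<\([^>]*\)>/\1/g' |
        tr -s ' \t,' '\n\n\n' |
        grep '@'
}

split_addrs '"Lee, K." <kflee@example.com>, bob@example.com'
```

The quoted display name is removed first so that the comma inside it is not 
mistaken for an address separator.  The final grep keeps only tokens that 
contain an @, which drops empty lines and leftover words from unquoted names.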

> This program superficially appears to work, but not exactly; the
> problem seems to be with how grep -f works. After using gawk to yank out
> the mail address, like kflee(_at_)penit(_dot_)com, the match of this against the
> list seems to behave inconsistently.

Check my sandbox config - in there, I have various header address 
extractions.  See if they work for what you're trying to accomplish.

You should try extracting the addresses SEPARATE from the grep, then emit 
them to the VERBOSE log with a character delimiter around them to ensure 
you're getting the string you believe you're getting.  Since the string is 
part of a pipeline, you don't see it as a passed argument anywhere, even if 
you're running with VERBOSE logs.  i.e. you THINK you have an address like 
"kflee(_at_)penit(_dot_)com" but you very probably do not.

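As a small illustration of the delimiter trick, the shell sketch below wraps 
a value in brackets before printing it.  The name show_delimited and the 
sample address are assumptions for the example; in an rc file you could get 
the same effect by assigning the bracketed string to procmail's LOG variable.

```shell
# show_delimited: wrap a value in brackets so that stray leading or
# trailing whitespace becomes visible in the output.
show_delimited() {
    printf '[%s]\n' "$1"
}

# Simulate an extraction that leaves a trailing space behind.
addr=$(printf 'kflee@example.com \n')
show_delimited "$addr"    # prints [kflee@example.com ]
```

The space in front of the closing bracket is exactly the kind of thing that 
makes a later lookup fail while the logged string still looks right at a 
glance.
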
There's another very good reason to do the header extractions separately 
and assign them to variables: then you're only doing that expensive awk 
pipeline ONCE for each of the two headers, instead of TWICE.  Get the from, 
massage it and save it.  Get the to, massage it and save it.  Then do your 
lookups (also only once each).  File mail based on the saved results of the 
two lookup operations.
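
Outside of procmail, the same "look up once, then decide" flow can be 
sketched in plain shell.  The function names lookup and classify are 
hypothetical, and the list files are passed in as arguments rather than 
assumed to live anywhere in particular:

```shell
# lookup: succeed if any address in $1 (one per line) occurs in file $2.
# -F treats the addresses as fixed strings, -i ignores case, -q only
# sets the exit status.  Note this is a substring match on each list line.
lookup() {
    [ -n "$1" ] && grep -qiF -e "$1" "$2"
}

# classify FROM TO FROM_LIST TO_LIST: print the category name.
# Each list is consulted exactly once; the decision uses the saved results.
classify() {
    from_hit=no
    to_hit=no
    lookup "$1" "$3" && from_hit=yes
    lookup "$2" "$4" && to_hit=yes
    case "$from_hit$to_hit" in
        yesyes) echo match_both ;;
        yesno)  echo match_from ;;
        noyes)  echo match_to ;;
        *)      echo match_neither ;;
    esac
}
```

Called as classify "$from" "$to" from_list to_list, it prints one of the four 
category names.  An empty address never matches, which mirrors guarding the 
lookups against empty variables.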

What follows is a wholly untested rewrite which might help to get you on the 
right track:


LOGFILE='/home/kfl_root/log'
VERBOSE=yes
MATCH_TO='match_to(_at_)penit(_dot_)com'
MATCH_FROM='match_from(_at_)penit(_dot_)com'
MATCH_BOTH='match_both(_at_)penit(_dot_)com'
MATCH_NEITHER='match_neither(_at_)penit(_dot_)com'

# (pulled from my sandbox)
# get the From: address as an address component ONLY (no comments)
:0 h
CLEANFROM=|formail -IReply-To: -rtzxTo:

# You'll want to do some further scrubbing of this (esp if there are
# multiple addresses), but doing this extraction spares you having to
# perform a grep at the start of your pipeline.
:0
* ^To:[         ]*\/[^  ].*
{
         TO=$MATCH
}

# this should strip standard comments, and address bracketing,
# then put each result on a separate line (the TR is for that)
MY_TO=`echo ${TO}| sed -e "s/\"[^\"]*\"//" \
         -e "s/\(<\([^>]*\)>\)/\2/g" -e "s/^[    ,]*//" \
         | tr -s "       ," "\n"`

# (do lookups)
# the file lookups are performed this way because you then have the
# actual match string in the variable, which is a lot more useful as
# a diagnostic.  You can simplify later if you choose.
:0
* ! MY_TO ?? ^^^^
{
         TO_MATCHED=`grep -iF -e "${MY_TO}" to_list`
}

:0
* ! CLEANFROM ?? ^^^^
{
         FROM_MATCHED=`grep -iF "${CLEANFROM}" from_list`
}


# (take action)
:0
* ! FROM_MATCHED ?? ^^^^
{
         # from matched
         :0
         * ! TO_MATCHED ?? ^^^^
         ! ${MATCH_BOTH}

         :0
         ! ${MATCH_FROM}
}

:0
* ! TO_MATCHED ?? ^^^^
! ${MATCH_TO}

:0
! ${MATCH_NEITHER}

---
  Sean B. Straw / Professional Software Engineering

  Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
  Please DO NOT carbon me on list replies.  I'll get my copy from the list.


____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
