procmail

Re: procmail blues: half-shod solution...

2002-12-29 05:29:02
Oh dear - rather disappointed that my from tags are so horrible :(

Interestingly, after I sent that email I figured out a way to cope with
addresses like yours, with name <address>

Is there a built-in procmail tag which will extract the address only and
not the name if there is one?

Well, here is the (well, was the) final solution:

####
#### poff's procmail runtime configuration
####
#### updated: Sat Dec 28 22:32:48 CET 2002
####

####               #####
#### CONFIGURATION #####
####               #####

DEFAULT=$HOME/mail/inbox
MAILDIR=$HOME/mail

LOGFILE=$HOME/.proclog
LOG="
"

VERBOSE=yes

#### grab some variables to use later on in recipes
####

FROM=`formail -x"From:" | sed -e 's/[   ]*//'` #strip leading whitespace from the From: value

:0
* ^From:.*[<]
{
        FROM=`echo "$FROM" | awk -F\< '{print $2}' | sed -e 's/>//'`

        # reduce "Name <name@address>" to name@address for both the
        # user check and the address check
}

DOMAIN=`formail -x"From:" | awk -F@ '{print $2}' | sed -e 's/>//'`

#####         #####
##### RECIPES #####
#####         #####

### mailing list rules
###

## grab majordomo stuff with no headers

:0:
* ^From:.*majordomo.*
lists

## grab stuff with list headers (X-List, X-Loop, Precedence: list, List-Id)

:0:
* ? egrep -i "X-List|X-Loop|Precedence: list|List-Id:"
lists

### miscellaneous rules
###

## move school friends to "school-friends"

:0:
* ? grep -i "$FROM" .school-friends
school-friends

## move mad-sci to mad-sci

:0:
* ^From:.*@www\.madsci\.org
mad-sci

### re-route sdf users
###

:0:
* ? egrep -i "$DOMAIN" .sdfdoms
"sdf'ers"

:0:
* ? egrep -i "$FROM" ".sdf'ers"
"sdf'ers"

:0:
* ? egrep -i "^$FROM:" /etc/passwd
"sdf'ers"

It deals fine with addresses like yours, but it wouldn't cope with the JOE
@ AOL <joe(_at_)aol(_dot_)com> address.

You know I rehash the procmailrc every time I get a mail from someone
which doesn't fit the rules - so who knows - maybe in a few months it will
work out ok!!
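
One thing the file lookups above may trip over: egrep treats $DOMAIN as a
regular expression, so the dots match any character, and an empty $DOMAIN
matches every line of the file. A minimal sketch of a stricter check follows,
assuming the lookup file holds one domain per line (the post does not say so).
The name domain_listed is made up for illustration.

```shell
# Hypothetical helper: succeed only if $1 is non-empty and equals a whole
# line of the file $2, ignoring case and without regex interpretation.
domain_listed() {
    # An empty pattern would match every line, so reject it first.
    [ -n "$1" ] || return 1
    # -F: fixed string, -x: whole line, -i: ignore case, -q: status only
    grep -F -x -i -q -- "$1" "$2"
}
```

In a recipe the same idea might be written as a condition along the lines of
    * ? test -n "$DOMAIN" && grep -Fxiq -- "$DOMAIN" .sdfdoms
though that line is only a guess at how it would fit this particular setup.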

Thanks again for your help!

Poff

On Sat, 28 Dec 2002, Professional Software Engineering wrote:

Date: Sat, 28 Dec 2002 15:05:08 -0800
From: Professional Software Engineering 
<PSE-L(_at_)mail(_dot_)professional(_dot_)org>
Reply-To: procmail(_at_)Lists(_dot_)RWTH-Aachen(_dot_)DE
To: procmail(_at_)Lists(_dot_)RWTH-Aachen(_dot_)DE
Subject: Re: procmail blues: grepping from /etc/passwd for "SEND" users...

At 22:47 2002-12-28 +0100, poff(_at_)sixbit(_dot_)org did say:

FROM=`formail -x"From:" | sed -e 's/[   ]*//'
MUSER=`formail -x"From:" | awk -F\< '{print $2}' | sed -e 's/>//'
DOMAIN=`formail -x"From:" | awk -F@ '{print $2}' | sed -e 's/>//'

FTR, you need the _closing_ tic marks on those lines!

You shouldn't always assume that the address component will be the second
portion of the From: line.  What about emails where it's _just_ an
address?  How about _my_ mail (address, then parenthesised name)?  There
are WAY too many variations on email address lines to make such an
assumption.  The address may not be the _only_ component with an @ symbol
in it either - I've seen people with name text like: "Joe @ AOL" to
differentiate their own various email accounts.  My From: line is not the
only one which your parse fails with (to the extent that your DOMAIN=
assignment actually includes the name text).
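
The variations listed above can be sketched in plain sh and sed. This is an
illustration only, not a full RFC 2822 parser: it assumes at most one
angle-bracketed address, no nested parenthesised comments, and no '<' or '>'
inside quoted name text. The names extract_addr and addr_domain are
hypothetical.

```shell
# Hypothetical helper: print only the address part of a From: value.
extract_addr() {
    # 1. if there is a <...> part, keep only what is inside it
    # 2. otherwise drop any parenthesised name text
    # 3. trim leading and trailing whitespace
    printf '%s\n' "$1" | sed \
        -e 's/^.*<\([^<>]*\)>.*$/\1/' \
        -e 's/[[:space:]]*([^()]*)[[:space:]]*/ /g' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//'
}

# Hypothetical helper: print the domain of the extracted address,
# i.e. everything after the last @ in the address itself.
addr_domain() {
    addr=$(extract_addr "$1")
    case $addr in
        *@*) printf '%s\n' "${addr##*@}" ;;
        *)   return 1 ;;
    esac
}

# Sample From: values (made up) covering the forms discussed above.
for from in \
    'Some Name <user@example.org>' \
    'user@example.org' \
    'user@example.org (Some Name)' \
    '"Joe @ AOL" <joe@aol.com>'
do
    printf '%s -> %s / %s\n' "$from" "$(extract_addr "$from")" "$(addr_domain "$from")"
done
```

Because the domain is taken from the already-extracted address rather than
from the whole line, an @ inside the name text no longer leaks into it.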

Also, you'll save a few cycles if you just extract the From: within
procmail _once_ then echo that through to the other commands:

# my standard From: line extraction - you could roll YOUR from invocation
# into the operation performed here if you felt like it.
:0
* ^From:[       ]*\/[^  ].*
{
         FROM=$MATCH
}

xFROM=`echo "${FROM}" | sed -e 's/[     ]*//'`
MUSER=`echo "${FROM}" | awk -F\< '{print $2}' | sed -e 's/>//'`
DOMAIN=`echo "${FROM}" | awk -F@ '{print $2}' | sed -e 's/>//'`


The timing difference between your method and this one across only a few
hundred messages (okay, 1515 of them conveniently supplied by my current
procmail archive, starting 01 SEP 2002) can add up:

formail on EACH invocation:
         186.87 user 213.31 system 7:12.68 elapsed 92%CPU

procmail extraction, then echo for each invocation:
         143.20 user 182.19 system 6:02.37 elapsed 89%CPU

There are other overheads at work - this is run from my sandbox setup.  An
earlier test with a smaller archive - which had the formail invocation as
the second run (and therefore benefiting from caching) - also demonstrated
a slight speed difference in favour of using the procmail $MATCH construct
instead of formail.

:0:
* grep -i "${DOMAIN}" .sdfdoms
"sdf'ers"

You're missing the '?' at the beginning of the condition line that says
"run this command and pay attention to the return code".  I see from a
subsequent post that you figured out the bug, but don't know about 'man
procmailrc', which explains these flags.
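
What "pay attention to the return code" means can be sketched outside
procmail: a '?' condition is met when the command exits with status 0 and
not met otherwise, whatever the command prints. The helper name
condition_result is made up for this illustration.

```shell
# Hypothetical helper: run a command the way a "* ? command" condition
# would be judged, looking only at its exit status.
condition_result() {
    if "$@" >/dev/null 2>&1; then
        echo "condition met"
    else
        echo "condition not met"
    fi
}

condition_result true                                # exit status 0
condition_result false                               # non-zero exit status
condition_result grep -i root /nonexistent/file      # grep error, also non-zero
```

A grep that finds nothing and a grep that cannot open its file both count
as "not met", which is worth remembering if a lookup file goes missing.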

[snip - the full text of the previous post again]

---
  Sean B. Straw / Professional Software Engineering

  Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
  Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail


poff(_at_)sixbit(_dot_)org
SDF Public Access UNIX System - http://sdf.lonestar.org

