procmail

Re: Extracting Email Address In From: Field

1997-05-26 11:43:00
On Mon, 26 May 1997, era eriksson wrote:
--
 >> > to come up with a way to do this with formail and figure sed would
 >> > have to be called. Sed is a bit beyond my comprehension. Any ideas?
 >> >         FROM=`formail -ztxFrom:`
 >> You see this used a lot;
 >> FROM=`formail -rtzxTo:`
 > I use this in my working listserv recipe because it does look for the 
 > real reply to address, minus the colon: so procmail extracts from the To 
 > pseudo header field. There is a special case, read further into my
 > original

Huh? There is no "pseudo header field" without a colon. If you leave
off the colon, you still get the normal To: field, plus potentially
Tomato:, Toast:, and Toenails:. 

My mistake. I've been staring at this thing for a couple of days now and 
started hallucinating. Originally, I used:

        `formail -zrtxTo`



 > post, where I need to match the address in the From: field. I'm just 
 > trying to save a bit of fingerwork on my part. If I bounce a message to 
 > the list, I don't get a cc but the original poster does; which I am 
 > trying to avoid. I want to strip the name and <> brackets so I can keep a 
 > file of the bare email addys. 

If you really want to get rid of the brackets, sed -e 's/[<>]//g'
should do that (but probably not be worth it -- I'd keep the brokets
in the file instead). 
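
For illustration, here is a minimal sketch of that bracket-stripping step on its own. The address is made up, and the sample value only assumes that the extracted field arrives wrapped in angle brackets; it does not call formail itself.

```shell
# Made-up sample value, assumed to arrive wrapped in angle brackets.
addr='<walrus@example.com>'

# Delete every < and > character, leaving the bare address.
bare=$(printf '%s\n' "$addr" | sed -e 's/[<>]//g')

printf '%s\n' "$bare"    # prints walrus@example.com
```
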

At a later date, I was thinking about using formail to automatically 
extract the member's address from the 'From' field, since it is so 
proficient at it, and add it to the memberlist file. The only reason I 
want to match the email address in the 'From:' field, is so I can bounce 
messages to the list and the sender will not receive a cc. Otherwise, if 
I match the 'From' field, I am the sender for the bounced messages and 
the original submitter receives a cc.


If you really insist that you want the contents of the From: field
under all circumstances, you run into a bit of trouble because parsing
that is not trivial under all circumstances. But if you're content
with an approximative method, try running the following on the output
of formail -zxFrom: 
  sed -e 's/ *([^)]*) *//g' -e 's/.*<\([^>]*\)>.*/\1/g'

I'll insert this into the recipe and see what pops up.


The sed substitution command is probably worth getting acquainted with
if you spend time thinking about these things. Let's take the first
one apart:

 -e       What follows is a line of sed script:
--

Thanks for breaking the sed substitution commands down. They really fry 
my brain cells.


This can and will break with elaborate parenthesized comments in the
From: header -- RFC822 permits quite complicated expressions, although
in practice you rarely see anything other than the following three
variants:
  From: address@host.domain.com
  From: address@host.domain.com (I am the Walrus)
  From: The Walrus <address@host.domain.com>
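
As a rough check, the two sed expressions quoted earlier can be run against the three variants, plus one nested comment to show how it breaks. The helper name extract_addr and the example.com address are hypothetical, and the input strings stand in for what formail -zxFrom: would print.

```shell
# Hypothetical helper wrapping the two sed expressions from this thread:
# first drop (parenthesized comments), then keep only what is inside <...>.
extract_addr() {
  sed -e 's/ *([^)]*) *//g' -e 's/.*<\([^>]*\)>.*/\1/g'
}

# The three common variants; each one prints walrus@example.com
for from in \
  'walrus@example.com' \
  'walrus@example.com (I am the Walrus)' \
  'The Walrus <walrus@example.com>'
do
  printf '%s\n' "$from" | extract_addr
done

# A nested comment defeats the first expression, because [^)]* stops at
# the first closing parenthesis. This prints: walrus@example.comthe Walrus)
printf '%s\n' 'walrus@example.com (I am (really) the Walrus)' | extract_addr
```

The nested case is the kind of "elaborate parenthesized comment" that this approximation does not handle.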

The list is so small, I doubt I will run into anything weird.


In the regexps, you should probably use [     ]* (tab or space)
instead of just spaces, but I left that out out of laziness :-)
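
To make that concrete: the bracket expression holds one literal space and one literal tab, and it goes wherever the pattern currently has a plain space before a *. A sketch, assuming a POSIX shell; the variable name tab and the helper name extract_addr are only for illustration.

```shell
# Put a literal tab into a variable so it stays visible in the script.
tab=$(printf '\t')

# Same two expressions as before, but each " *" becomes "[ <tab>]*",
# so tabs around the comment are removed along with spaces.
extract_addr() {
  sed -e "s/[ $tab]*([^)]*)[ $tab]*//g" -e 's/.*<\([^>]*\)>.*/\1/g'
}

# A tab separates the address from the comment here.
printf 'walrus@example.com\t(I am the Walrus)\n' | extract_addr
# prints walrus@example.com, with no tab left behind
```

With the spaces-only pattern the comment would still be removed, but the tab in front of it would stay attached to the address.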

You've lost me. Should the brackets [] be inserted instead of just a 
whitespace in all instances?

Regards,

Dave/Webmaster

    Sick And Tired Of SPAM? | Enjoy Great Humor?
         Join www.cauce.org | Join Joke-L
Hit SPAMMERS Where It Hurts | Listserv <listserv(_at_)ddave(_dot_)com>

           "So many SPAMMERS; so few comets."