
Re: getting email addresses

1996-10-02 03:31:47
On Tue, 1 Oct 1996 15:42:36 -0800,
"Simeon ben Nevel" <Simeon(_dot_)Nevel(_at_)Schwab(_dot_)Com> wrote:
On  1 Oct 96, procmail(_at_)Informatik(_dot_)RWTH-Aach  wrote:
<... Simeon, your quoting seems a bit screwed up here ...>
>> I find myself in need of scanning all of my elm mail folders for
>> valid email addresses, and compiling a list of them in an ascii
> concatenated) you can probably use a tool like grep or sed to much greater
> effect than procmail.

Well, I can see how one might want the added assurance that anything
culled really does come from actual headers.

  formail -s formail -rzx"To:" < Mail/inbox | sort -u

should get you started. The first ("outer") formail just splits the
mailbox into individual messages, and runs another formail -rzxTo: on
each extracted message. 
  You will want to manually inspect the output and/or write some kind
of script to alert you to very similar user IDs, so as to avoid
sending to the same person at slightly different addresses.

As an example, my inbox holds a few mails from myself sent from
different domains (it is in RMAIL aka BABYL format, which is why I
need formail -B):

 % formail -Bs formail -rzx"To:" < ~/Mail/RMAIL | sort -u | \
        cut -d'@' -f1 | uniq -d
 reriksso
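
A minimal sketch of the "alert" script idea, assuming one address per
line on stdin (as the formail pipeline above produces). The function
name similar_localparts is made up for illustration; it lists each
local part that occurs in more than one distinct address, together
with the full addresses, so you can see what the duplicates actually
are rather than just the user ID:

```shell
# Hypothetical helper: read one address per line on stdin and report
# every local part (the bit before the @) seen in more than one
# distinct address. Local parts are compared case-insensitively.
similar_localparts() {
    sort -u | awk -F'@' '
        NF == 2 {
            key = tolower($1)
            count[key]++
            list[key] = list[key] " " $0
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print key ":" list[key]
        }' | sort
}
```

It would be used at the end of the pipeline in place of the
cut/uniq pair, for example:

  formail -s formail -rzx"To:" < Mail/inbox | similar_localparts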

You could construct something very similar to look for people from the
same domain (for instance, fnl(_at_)some(_dot_)domain(_dot_)foo and
First(_dot_)N(_dot_)Lastname(_at_)domain(_dot_)foo might perhaps be the
same person).

 % formail -Bs formail -rzx"To:" < ~/Mail/RMAIL | rev | sort -u | \
        cut -d'.' -f1-2 | uniq -d | rev
 helsinki.fi
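
The same check can be sketched without rev, again assuming one address
per line on stdin. The function name shared_domains is hypothetical.
It keeps only the last two labels of the domain part and prints each
such domain that occurs in more than one distinct address, with a
count. Taking the last two labels is a rough heuristic: it will lump
together unrelated sites under suffixes like co.uk.

```shell
# Hypothetical helper: read one address per line on stdin and count
# distinct addresses per domain, where "domain" means the last two
# dot-separated labels after the @. Only domains seen more than once
# are printed, most frequent first.
shared_domains() {
    sort -u | awk -F'@' '
        NF == 2 {
            d = tolower($2)
            n = split(d, label, ".")
            dom = (n >= 2) ? (label[n-1] "." label[n]) : d
            count[dom]++
        }
        END {
            for (dom in count)
                if (count[dom] > 1)
                    print count[dom], dom
        }' | sort -rn
}
```

For example:

  formail -s formail -rzx"To:" < Mail/inbox | shared_domains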

Hope this helps,

/* era */

(I do believe Elm folders use standard Unix mbox format, so formail
should be able to cope with them directly without any hassle.)

-- 
See <http://www.ling.helsinki.fi/~reriksso/> for mantra, disclaimer, etc.
* If you enjoy getting spam, I'd appreciate it if you'd register yourself
  at the following URL:  <http://www.ling.helsinki.fi/~reriksso/spam.html>
