procmail

Re: improvements on recipe solicited

1998-06-07 06:19:42
At 10:04 PM 6/6/98 +0100, mark david mcCreary wrote:
I would like to improve the following recipe


:0 hw
* ^Subject:.*unsubscribe \/.*
{
  LOCAL=$MATCH
  :0
  * LOCAL ?? .*@\/.*

Since you don't seem to use $MATCH again, why not just:
  * LOCAL ?? @

  {
     SENDER=$LOCAL
  }
}

And if you have something without "@" in it, what is $SENDER?
What do you *want* it to be for "unsubscribe fred"?


This checks to see if the Subject line is attempting to unsubscribe some
email address, presumably a different email address than in the from line.
It puts the email address in the variable SENDER, where it is then used in
SENDMAIL -f $SENDER.

Assuming the input is

      Subject: unsubscribe joe@aol.com

this works just fine.

If they get fancy, and try to put in two email addresses, the whole string
after unsubscribe gets put in SENDER.

      Subject: unsubscribe joe@aol.com sam@compuserve.com

And SENDER = joe@aol.com sam@compuserve.com, which then bombs SENDMAIL.

I would be happy just picking up the first email address, and ignoring
anything else.

If I try changing the regular expression to end with a space, it only works
for the second case.

:0 hw
* ^Subject:.*unsubscribe \/.* ()

and bombs on the more normal unsubscribe joe@aol.com.

This try was innovative, but fails if there is more than one space, as the
regular expression is greedy.

Try:
:0 hw
* ^Subject:[    ]*unsubscribe[  ].*\/[^         <]+@[^     >]+

(untested).  This should extract the first "word" containing an
embedded "@", stripping <>, so should also work with such as:
        Subject: unsubscribe Joe Schmoe <joe@aol.com>
Note each [] contain a space and a tab (and sometimes other chars).
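
To try the same idea outside procmail, here is a minimal shell sketch. It is
not the recipe itself: it uses awk rather than procmail's \/ matching, and the
function name first_address is made up for illustration. It prints the first
whitespace-delimited word that contains "@", with any <> stripped.

```shell
# Hypothetical helper (not part of procmail): print the first word of the
# argument that contains "@", with any angle brackets removed.
first_address() {
  printf '%s\n' "$1" | awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /@/) {
        gsub(/[<>]/, "", $i)   # drop surrounding <>
        print $i
        exit                   # stop at the first address
      }
    }
  }'
}
```

Called as first_address "unsubscribe Joe Schmoe <joe@aol.com>", this should
print joe@aol.com; for input with no "@" it prints nothing.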

[snip]

Awk seems like a good choice, as it can easily parse between spaces, and
return the first word of a phrase. However, my attempt at invoking awk has
failed.

:0 hw
* ^Subject:.*unsubscribe \/.*
{
  saved = $SHELLMETAS
  SHELLMETAS
  LOCAL= `echo "$MATCH" | awk 'print {$1}'`

Well, the awk program is incorrect; that should be
        awk '{print $1}'
but (see above) you shouldn't need awk here.
  :0
  * LOCAL ?? .*@\/.*
  {
     SENDER=$LOCAL
  }
  SHELLMETAS = $saved
}
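
If you do want to see what the corrected awk call returns, it can be checked
from a shell prompt without involving procmail at all. A small sketch (the
wrapper name first_word is only for illustration):

```shell
# Hypothetical helper: print the first whitespace-delimited word of the
# argument, using the corrected awk program '{print $1}'.
first_word() {
  printf '%s\n' "$1" | awk '{ print $1 }'
}
```

For "joe@aol.com sam@compuserve.com" this gives joe@aol.com, but for
"Joe Schmoe <joe@aol.com>" it gives Joe, which is one more reason to match
on the "@" rather than on word position.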


So, I am all eyes, if somebody can point out how to improve on my original
recipe to handle the .0001% of the time when people put more than one
address in the Subject line. My guess is that a really good regular
expression, instead of just .*,  would negate the need for any follow up
programs to pull out the first address.

You could also handle multiple addresses by preceding the above recipe
with something like:
:0h
* ^Subject:.*unsubscribe.*@.*@
{ issue error for multiple addresses here }
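
The "two @ signs" test is easy to try on sample Subject lines from a shell
before putting it in an rcfile. A rough sketch, assuming a made-up function
name has_multiple_addresses and a plain glob match instead of a procmail
condition:

```shell
# Hypothetical helper: succeed (exit status 0) if the argument contains
# at least two "@" characters, i.e. probably more than one address.
has_multiple_addresses() {
  case $1 in
    *@*@*) return 0 ;;
    *)     return 1 ;;
  esac
}
```

So has_multiple_addresses "unsubscribe joe@aol.com sam@compuserve.com"
succeeds, while a Subject with a single address (or none) does not.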

Hope this helps,
Stan
