procmail

Re: Problem extracting from the from_ header

2002-07-08 20:51:47
Alan Clifford <lists(_at_)clifford(_dot_)ac> writes:
Having looked at RFC 2822 and, as Pine allows me to do it, I think it is
legal, I send myself an email from my test account, from "@ @"@clifford.ac

So the actual from_ header is:

 From "@ @"@clifford.ac  Mon Jul  8 23:28:43 2002

I assume it is the space that is the cause of my problems.  As I
understand it, the quoted local part of an address could include CRLF as
well.  Problem 1 below, concerning the date, is the most important to me.
And I am totally at a loss with problem 4 if the local part contained a
newline.

Before worrying about that, first figure out whether your MTA accepts an
SMTP MAIL From: command with an address that contains a CR or LF.  I doubt
it does, in which case you'll never have to worry about seeing one.


Problem 1:

To extract the date, I am using:

#### The From_ date ###################
# remove the variable
FROM_SECS
: 0
* ^^From [^     ]+[     ]+\/.*
{
  FROM_SECS=`date --date "$MATCH" +%s`
}
# check for missing or bad date outside the recipe
FROM_SECS=${FROM_SECS:-"Bad date"}
######################################

The procmail log shows:

 date: invalid date `@"@clifford.ac  Mon Jul  8 23:28:43 2002'

So fix the regexp to explicitly match the valid date syntax, anchored
against the end of the line.  Untested:

        FROM_SECS
        :0
        * ^^From .*\/[MTWFS][a-z][a-z] [JFMASOND][a-z][a-z] [ 123][0-9] \
                        [0-2][0-9]:[0-5][0-9]:[0-6][0-9] 2[0-9][0-9][0-9]$
        {
                FROM_SECS=`date --date "$MATCH" +%s`
        }

That's actually more restrictive than needed.  You could even just
extract the last 24 or so characters on the line and feed that to date:

        * ^^From .*\/.........................$
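
To check the "last 24 characters" idea outside procmail, here is a minimal
shell sketch.  It is only an illustration: the helper name from_date_of is
made up, and sed stands in for procmail's \/ extraction.

```shell
# Hypothetical helper (not part of procmail): print the last 24
# characters of a From_ line, which is where the date lives.
from_date_of() {
    # The greedy .* swallows the address, whatever it contains;
    # \(.\{24\}\)$ keeps exactly the final 24 characters.
    printf '%s\n' "$1" | sed -n 's/^From .*\(.\{24\}\)$/\1/p'
}

from_line='From "@ @"@clifford.ac  Mon Jul  8 23:28:43 2002'
from_date=$(from_date_of "$from_line")
printf '%s\n' "$from_date"    # prints: Mon Jul  8 23:28:43 2002
```

The extracted string can then be handed to date --date "$from_date" +%s
as in the recipe above (that option is GNU date syntax).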


Problem 2

I am not using the address from the from_ header at the moment, but using:

#### The From_ header ###################
: 0
* ^^From[      ]+\/[^  ]+
{ FROM_HEADER=$MATCH }

LOG="${NL}From_: ${FROM_HEADER}${NL}"
#########################################


I get a log of

 From_: "@

Same basic problem, but a slightly trickier solution is required because
you can't trim characters off the right side as easily.  Probably the
simplest is to count on the fact that procmail puts two spaces between
the address and the date, and that the date never contains two spaces
followed by a non-digit.  So:

        :0
        * ^^From \/.*  [^0-9]
        * MATCH ?? ^^\/.*  ()
        * MATCH ?? ^^\/.*[^ ]
        { FROM_ADDR = $MATCH }

The second condition strips off the trailing non-digit matched by the
first condition.  The third condition then strips off the trailing
spaces.  (There's an old version of procmail that doesn't properly
support extracting from MATCH itself, so don't try this with anything
older than 3.11pre4 or so.)
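
The same "two spaces followed by a non-digit" trick can be tried from the
shell before putting it in an rcfile.  This is a rough sketch with sed
rather than procmail's matcher, and the helper name from_addr_of is
invented for the example.

```shell
# Hypothetical helper (not part of procmail): print the address part of
# a From_ line, relying on the two spaces that precede the date.
from_addr_of() {
    printf '%s\n' "$1" | sed -n '/^From .*  [^0-9]/{
s/^From \(.*\)  [^0-9].*$/\1/
s/ *$//
p
}'
}

from_addr=$(from_addr_of 'From "@ @"@clifford.ac  Mon Jul  8 23:28:43 2002')
printf '%s\n' "$from_addr"    # prints: "@ @"@clifford.ac
```

The first substitution cuts at the last "two spaces plus non-digit" (the
"  8" before the day of the month is skipped because 8 is a digit), and
the second removes any spaces left on the end.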

HOWEVER, depending on what you need the address for, you may be better off
using the value of the Return-Path: header field, if your MTA provides it.

        :0
        * ^Return-Path: \/.*
        { FROM_ADDR = $MATCH }


Problem 3

Formail fails, presumably when it tries to create the reply header.  Or
maybe it is sendmail.  But it seems to recover from this and sends off the
autoreply, so not too much of a problem.
...
procmail: Program failure (67) of " (formail -rtz -A"X-Loop:
...

The exit code of a pipeline is the exit code of the final program in
the pipeline, so the EX_NOUSER (67, "addressee unknown") error is from
sendmail.
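
A quick way to convince yourself of the pipeline rule, using stand-in
commands instead of formail and sendmail (this assumes a plain POSIX
shell without the non-default pipefail option):

```shell
# Last command exits 67: the pipeline as a whole reports 67.
status_last=0
true | sh -c 'exit 67' || status_last=$?

# First command exits 67 but the last one succeeds: the pipeline
# reports 0, so the earlier failure goes unnoticed.
status_first=0
sh -c 'exit 67' | true || status_first=$?

echo "last fails: $status_last, first fails: $status_first"
```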


Problem 4

Before the autoreply is sent, a check is made against the "grey" list to
see if an autoreply has been sent recently.

The greylist is a text file in this form:

custserv(_at_)alldomains(_dot_)com
"@ @"@clifford.ac
"@ @"@clifford.ac

The check for the grey list is:

* $ ! ? echo ${FROMHEADER} | grep -F -isx -f $PMDIR/list.grey

DO NOT PUT '$' BEFORE '?' IN COMMAND CONDITIONS.  The variables are
being expanded by procmail because of the '$' special on the condition
and then the shell sees the results and parses that.  So, procmail
expands the variables to get

        echo "@ @"@clifford.ac | grep -F -isx -f /path/to/that/file/list.grey

the shell then treats those double quotes specially and the echo command
only sees the single argument (ignoring the leading tab)
        @ @@clifford.ac


The correct condition drops the '$' and puts the variable in double
quotes so that echo only sees a single argument:

        * ! ? echo "${FROMHEADER}" | grep -F -isx -f $PMDIR/list.grey
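
The difference is easy to reproduce in a shell.  The sketch below is only
an illustration: the temporary file stands in for list.grey, the
example.com entry is made up, and printf is used in place of echo.

```shell
FROMHEADER='"@ @"@clifford.ac'

# Stand-in for $PMDIR/list.grey.
list=$(mktemp)
printf '%s\n' 'someone@example.com' '"@ @"@clifford.ac' > "$list"

# What happens when the value is pasted into the command text before the
# shell parses it: the double quotes are consumed as shell quoting.
mangled=$(sh -c 'printf "%s\n" "@ @"@clifford.ac')

# What happens when the shell expands a quoted variable itself.
intact=$(printf '%s\n' "${FROMHEADER}")

if printf '%s\n' "$mangled" | grep -F -isx -f "$list" >/dev/null
then listed_mangled=yes; else listed_mangled=no; fi

if printf '%s\n' "$intact" | grep -F -isx -f "$list" >/dev/null
then listed_intact=yes; else listed_intact=no; fi

rm -f "$list"
echo "mangled: $mangled (listed: $listed_mangled)"
echo "intact:  $intact (listed: $listed_intact)"
```

Only the intact value matches a whole line of the list, which is what
grep -x requires.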


Philip Guenther
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
