Re: (Parsing dates and) persistent lock files that never go away

2004-07-24 19:38:47
On Fri, 23 Jul 2004 19:02:33 -0600, Justin Gombos 
<mindfuq(_at_)zianet(_dot_)com> wrote:
> But much of the old email that I'm refiltering has
> corrupt From_ fields.  Some of it has no From_ field at all (because
> it may have been delivered as a digest or my old MUA didn't bother to
> save it).  Then I also have a stack of messages that have gone through
> my defective scripts, and the From_ field has been stomped on with
> LINEBUF overflows, and I deleted the original mailboxes before I
> discovered this was happening.

I have the following date_fix.inc file, which was a collaborative
effort from several people here, including some minor fixes/features
by me.  I think you can find the threads by searching for "WHICHRECVD".
I think the original 'bones' of the script were written by Don
Hammond.

##START----
WEEKDAYS = '(S(un|at)|Mon|T(ue|hu)|Wed|Fri)'
MONTHS = '(J(an|u[ln])|Feb|Ma[ry]|A(pr|ug)|Sep|Oct|Nov|Dec)'
WHICHRECVD = 'by [^     ]*(mail.)?covisp.net'
YEARS = '(199[0-9]|20[0-9][0-9])'
TIMESTAMP = '([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]'
# Dates from 1990 through 2099 are accepted. Adjust YEARS as needed.
# NB: DATE is only trusted from the WHICHRECVD server.

# proper Received header, should work 100% of the time.
#RCVD_STAMP = "$WEEKDAYS, [0-9]+ $MONTHS $YEARS $TIMESTAMP"

# improper Received header, handles "Tue, 01" and "Tue,  1"
# ONLY use this version if your mailspool includes messages
# with improper headers on the WHICHRECVD line
RCVD_STAMP = "$WEEKDAYS, +[0-9]+ $MONTHS $YEARS $TIMESTAMP"

# Get the received header I want the date from
# (when the message hit my mailserver)
:0
* $ ^Received:.*$WHICHRECVD.*\/$RCVD_STAMP
{ xDATE = "$MATCH" }

# short-circuit w/o data
:0fw
* xDATE ?? ^^^^
| formail -i"X-Date-Munge: FAILED"

# If we didn't bail out early...
:0 E
{
   # Build the new "From " header by extracting the date in the
   # right order from xDATE

   # From: Sun, 02 Mar 2003 08:29:12
   # To:   Sun Mar 02 08:29:12 2003
   :0
   * $ xDATE ?? ^^\/$WEEKDAYS
   { ENVELOPE = "<$CLEANFROM>  $MATCH" }
   :0
   * $ xDATE ?? ^^$WEEKDAYS, +[0-9]+ \/$MONTHS
   { ENVELOPE = "$ENVELOPE $MATCH"
     MYMONTH=$MATCH
   }
   :0
   * $ xDATE ?? ^^$WEEKDAYS, +\/[0-9]+
#   { ENVELOPE = "$ENVELOPE $MATCH" }
# The commented-out one-line action above is enough with the proper
# RCVD_STAMP. With the improper RCVD_STAMP (the active one), use the
# block below instead; it zero-pads a one-digit day.
   {
         PADD = "0"$MATCH
         :0
         * PADD ?? ...
         {
            :0
            * PADD ?? ^^.\/..
            { PADD = $MATCH }
         }
      ENVELOPE = "$ENVELOPE $PADD"
   }

   :0
   * $ xDATE ?? ()\/$TIMESTAMP
   { ENVELOPE = "$ENVELOPE $MATCH" }

   :0
   * $ xDATE ?? $MONTHS \/$YEARS
   { ENVELOPE = "$ENVELOPE $MATCH"
     MYYEAR=$MATCH
   }

   # Make sure the $ENVELOPE matches the desired format
   # If it does, rewrite the From_
   :0 fhw
   * $ ENVELOPE ?? $WEEKDAYS $MONTHS [0-9]+ $TIMESTAMP $YEARS^^
   | sed "s,^\(From \).*,\1$ENVELOPE,"
       # And add a header showing we've altered the message
       :0 afw
       | formail -i"X-Date-Munge: SUCCESS"

       # Otherwise, show that the attempt failed.
       :0 efw
       | formail -i"X-Date-Munge: FAILED"


   MONTHSTRING=Jan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12

   :0
   * $ MONTHSTRING ?? $MYMONTH\/..
   { MYMONTH=$MATCH }

   MYDATE=$MYYEAR-$MYMONTH

   # If the message is not from the current month and year, mark it read
   # (CYEAR and CMONTH are not set in this file; define them before
   # including it.)
   :0fw
   * $ 9876543210^0 ! MYYEAR ?? $CYEAR
   * $ 9876543210^0 ! MYMONTH ?? $CMONTH
   | formail -I"Status: RO"

}
##END----
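
For anyone who wants to check the reordering outside procmail, here is a
rough Python sketch of what the middle of the recipe does to xDATE. It is
not part of date_fix.inc; the function name received_to_from_date is made
up for illustration, and the regexes are copied from the variables above
(using the lenient RCVD_STAMP).

```python
import re

# Same alternations as WEEKDAYS, MONTHS, YEARS and TIMESTAMP in the recipe.
WEEKDAYS = r"(?:S(?:un|at)|Mon|T(?:ue|hu)|Wed|Fri)"
MONTHS = r"(?:J(?:an|u[ln])|Feb|Ma[ry]|A(?:pr|ug)|Sep|Oct|Nov|Dec)"
YEARS = r"(?:199[0-9]|20[0-9][0-9])"
TIMESTAMP = r"(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]"

# Lenient form: one or more spaces after the comma, so "Tue,  1" is accepted.
RCVD_STAMP = re.compile(
    rf"({WEEKDAYS}), +([0-9]+) ({MONTHS}) ({YEARS}) ({TIMESTAMP})"
)


def received_to_from_date(stamp):
    """Reorder 'Sun, 02 Mar 2003 08:29:12' into 'Sun Mar 02 08:29:12 2003'.

    Returns None when the stamp does not match, which corresponds to the
    X-Date-Munge: FAILED branch of the recipe.
    """
    m = RCVD_STAMP.fullmatch(stamp)
    if m is None:
        return None
    weekday, day, month, year, time = m.groups()
    # Mirror the PADD block: prepend "0", and if that gives three or more
    # characters, drop the first and keep the next two.
    padded = "0" + day
    if len(padded) >= 3:
        padded = padded[1:3]
    return f"{weekday} {month} {padded} {time} {year}"


if __name__ == "__main__":
    print(received_to_from_date("Sun, 02 Mar 2003 08:29:12"))
    print(received_to_from_date("Tue,  1 Jul 2003 23:59:59"))
```

The sketch only covers the date part; in the recipe the address in front
of it comes from $CLEANFROM, which has to be set elsewhere.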

_ALL_ my mail goes through it, so I never get messages that show up
with 2008 or 1969 dates.  Now, it does not reconstruct a missing From_
line, because I had no need to do that, but it does correct the
date/time in an existing From_ line.
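
The tail end of the recipe (the MONTHSTRING lookup and the "mark it read"
test) can be sketched in Python like this. This is only an illustration
under assumptions: the recipe never defines CYEAR and CMONTH, so I am
assuming a four-digit year and a two-digit month, and the names
month_number and is_old are invented for the sketch.

```python
from datetime import date

# Same lookup string as in the recipe: each month name is followed by its number.
MONTHSTRING = "Jan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12"


def month_number(name):
    """Return the two characters that follow the month name in MONTHSTRING."""
    i = MONTHSTRING.find(name)
    if i < 0:
        return None
    start = i + len(name)
    return MONTHSTRING[start:start + 2]


def is_old(year, month_name, today=None):
    """True when the message is not from the current year and month.

    Assumption: CYEAR looks like '2004' and CMONTH like '07'.
    """
    today = today or date.today()
    cyear = f"{today.year:04d}"
    cmonth = f"{today.month:02d}"
    # Either mismatch is enough, like the two weighted conditions in the recipe.
    return year != cyear or month_number(month_name) != cmonth


if __name__ == "__main__":
    print(month_number("Mar"))
    print(is_old("2003", "Mar", date(2004, 7, 24)))
```

In the recipe a True result is what pipes the message through
formail -I"Status: RO".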

> In all of the above cases, Formail constructs a new From_ line with
> the current date.  It would be nice if I could tell Formail to trust
> the Date field,
No, the Date: field is wholly untrustworthy.
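
The Date: header is written by the sender, while the Received: line from
your own server is written by a machine you control, which is why the
recipe only takes the date from the WHICHRECVD host. A rough Python
sketch of that selection, for testing against saved messages, might look
like the following. The function name trusted_received_stamp is made up,
the host pattern is my WHICHRECVD (swap in your own), and unlike procmail
the match here is case-sensitive.

```python
import re
from email import message_from_string

# WHICHRECVD from the recipe; replace with the host you trust.
TRUSTED_HOST = r"by [^ \t]*(?:mail.)?covisp.net"

# The lenient RCVD_STAMP from the recipe, written out in one piece.
STAMP = (
    r"(?:S(?:un|at)|Mon|T(?:ue|hu)|Wed|Fri), +[0-9]+ "
    r"(?:J(?:an|u[ln])|Feb|Ma[ry]|A(?:pr|ug)|Sep|Oct|Nov|Dec) "
    r"(?:199[0-9]|20[0-9][0-9]) "
    r"(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]"
)

PATTERN = re.compile(TRUSTED_HOST + r".*?(" + STAMP + r")")


def trusted_received_stamp(raw_message):
    """Return the date stamp from the first Received header added by the
    trusted host, or None if no such header exists. Date: is never read."""
    msg = message_from_string(raw_message)
    for value in msg.get_all("Received", []):
        # Unfold continuation lines so the header is matched as one line.
        unfolded = " ".join(value.split())
        m = PATTERN.search(unfolded)
        if m:
            return m.group(1)
    return None


if __name__ == "__main__":
    sample = (
        "Received: from relay.example.org\n"
        "\tby mail.covisp.net (Postfix) with ESMTP id ABC123;\n"
        "\tSun, 02 Mar 2003 08:29:12 -0700\n"
        "Date: Thu, 01 Jan 1970 00:00:00 +0000\n"
        "Subject: test\n"
        "\n"
        "body\n"
    )
    print(trusted_received_stamp(sample))
```

Note that the bogus 1970 Date: header in the sample is simply ignored.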

-- 
gkreme at gmail or kreme at kreme or syth at mac

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail