procmail

Re: Forcing valid date format in headers?

2001-03-21 14:26:45
Alan Glover wrote:
> I get some email with unusual date formats, eg:
>
> Date: Wed Mar 21 07:23:08 AST 2001
>
> I'd like to set up a procmail recipe that spots this, and rewrites
> the Date field.

It's probably easiest to use sed for that: (untested)

    :0 D fhw
    * ^Date: ... ... .. ..:..:.. [A-Z]+ ....$
    | sed '/^Date:/s/\(...\) \(...\) \(..\) \(..:..:.. [A-Z]*\) \(....\)$/\1, \3 \2 \5 \4/'

(That's one long line at the end, in case it doesn't survive
transmission. I know, the \1 business for the weekday isn't really
necessary because it's not moved anyway, but this way it looks a bit
clearer to me.) If this doesn't seem safe enough, feel free to replace
the various dots with tighter regexps.
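
To try the substitution outside procmail, something like the following
might do. It is only a sketch: fix_date_header is a made-up helper name,
and the character classes are one guess at what "tighter" could mean
(capitalised three-letter day and month names, two-digit day, four-digit
year). LC_ALL=C is there so that [A-Z] really means upper case only.

```shell
# Hypothetical helper: the sed rewrite from the recipe, with the dots
# replaced by stricter character classes. Reads header lines on stdin.
fix_date_header() {
    LC_ALL=C sed 's/^Date: \([A-Z][a-z][a-z]\) \([A-Z][a-z][a-z]\) \([0-9][0-9]\) \([0-9][0-9]:[0-9][0-9]:[0-9][0-9] [A-Z][A-Z]*\) \([0-9][0-9][0-9][0-9]\)$/Date: \1, \3 \2 \5 \4/'
}

# The odd format is reordered ...
printf 'Date: Wed Mar 21 07:23:08 AST 2001\n' | fix_date_header
# prints: Date: Wed, 21 Mar 2001 07:23:08 AST

# ... while a header that is already in the usual layout passes through.
printf 'Date: Wed, 21 Mar 2001 11:36:48 -0400\n' | fix_date_header
# prints: Date: Wed, 21 Mar 2001 11:36:48 -0400
```

The same sed expression can then go into the action line of the recipe.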

> Assuming formail's adding of a Date: header does need the date/time
> as a parameter, is there an easy way to read the date from the
> system in the right format?

Do you really want the current system date or rather the one from the
existing date line? The latter seems more appropriate to me, and
procmail can be used to extract it bit by bit:

    WEEKDAY_RE="(Mon|T(ue|hu)|Wed|Fri|S(at|un))"
    DAY_RE="(0[1-9]|[12][0-9]|3[01])"
    MONTH_RE="(J(an|u[nl])|Feb|Ma[ry]|A(pr|ug)|Sep|Oct|Nov|Dec)"
    YEAR_RE="(20[0-9][0-9])"    # Y2.1K problem there ;-)
    TIME_RE="(([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9])"
    TIMEZONE_RE="[A-Z]+"

    :0 D
    * $ ^Date: ${WEEKDAY_RE} ${MONTH_RE} ${DAY_RE} ${TIME_RE} ${TIMEZONE_RE} ${YEAR_RE}
    {
        :0
        * $ ^Date: \/${WEEKDAY_RE}
        { WEEKDAY=$MATCH }

        :0
        * $ ^Date: .*\/${MONTH_RE}
        { MONTH=$MATCH }

        :0
        * $ ^Date: .*\/${DAY_RE}
        { DAY=$MATCH }

        :0
        * $ ^Date: .*\/${TIME_RE}
        { TIME=$MATCH }

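        # Anchor on the time first: left of \/ procmail matches as little
        # as possible, so a bare .* would stop at the capital letter of
        # the weekday instead of reaching the timezone.
        :0 D
        * $ ^Date: .*${TIME_RE} \/${TIMEZONE_RE}
        { TIMEZONE=$MATCH }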

        :0
        * $ ^Date: .*\/${YEAR_RE}
        { YEAR=$MATCH }

        :0 fhw
        | formail -I "Date: $WEEKDAY, $DAY $MONTH $YEAR $TIME $TIMEZONE"
    }

Again, untested.        
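
If the current system date is wanted after all, date(1) can print it in
the usual mail layout. A sketch only: rfc_date_now is a made-up name,
and I am assuming the local date supports %z for the numeric zone (most
current ones do, but check yours).

```shell
# Hypothetical helper: current system time in the usual mail header layout,
# e.g. "Wed, 21 Mar 2001 11:36:48 -0400".
# LC_ALL=C keeps the day and month names in English.
# Assumption: the local date(1) understands %z (numeric timezone offset).
rfc_date_now() {
    LC_ALL=C date '+%a, %d %b %Y %H:%M:%S %z'
}

rfc_date_now
```

The output could then be handed to formail -I "Date: ..." in the same
way as the variables in the recipe above.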

> I'm not even sure this is possible with procmail's regexps, but I
> was thinking of matching on the AST, which I know to be -0400, and
> then re-ordering the time/date parameters into the standard layout
> of Wed, 21 Mar 2001 11:36:48 -0400

Modifying both suggestions to match only "AST" and to use "-0400" when
replacing is left to the reader as an exercise. :-)
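
For the impatient, one possible shape of the sed variant, again only a
sketch. fix_ast_date is a made-up name, and the -0400 offset is taken
from the question above, not something the expression works out. It
only relabels the zone; the time itself is left as it is.

```shell
# Hypothetical helper: like the sed rewrite above, but it only fires on
# the literal zone "AST" and writes the numeric offset -0400 instead.
fix_ast_date() {
    LC_ALL=C sed 's/^Date: \([A-Z][a-z][a-z]\) \([A-Z][a-z][a-z]\) \([0-9][0-9]\) \([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\) AST \([0-9][0-9][0-9][0-9]\)$/Date: \1, \3 \2 \5 \4 -0400/'
}

printf 'Date: Wed Mar 21 07:23:08 AST 2001\n' | fix_ast_date
# prints: Date: Wed, 21 Mar 2001 07:23:08 -0400

# Other zones are left alone.
printf 'Date: Wed Mar 21 07:23:08 EST 2001\n' | fix_ast_date
# prints: Date: Wed Mar 21 07:23:08 EST 2001
```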

/HW
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
