With dman's help, I've written the following code to extract the year
from a message. Here's what I have so far:
ARCHIVING=true
# Extract the year from the date that the author
# *claims* to have composed the message.
#
# Note: this recipe will result in a null value
# for dates formed as XX-XX-XX.
#
:0
* ^Date:.*\/(19|20)?[0-9][0-9][^a-z]+:
* MATCH ?? ^^\/[^ ]+
* 19^0 MATCH ?? ^^19..^^
* 20^0 MATCH ?? ^^20..^^
* 19^0 MATCH ?? ^^[^0].^^
* 20^0 MATCH ?? ^^0.^^
* MATCH ?? ^^.*(19|20)?\/[0-9][0-9]^^
{ STATED_YEAR = $=$MATCH }
# Alternative approach to extracting year from Date: field
#
#STATED_YEAR=`formail -x "Date: " | sed -e 's/.* \([12]\{0,1\}[90]\{0,1\}[0-9][0-9]\) .*/\1/' \
#  -e 's/^[^0][0-9]$/19&/g' \
#  -e 's/^[0][0-9]$/20&/g'`
# Extract the year from the date that the
# message was delivered to the last server.
#
:0
* ^^From .*\/(19|20)[0-9][0-9]$
{ RECEIVED_YEAR = $MATCH }
# If processing mail that is missing time stamps,
# trust the author's given date more.
#
# (This is to circumvent formail's inclusion of the date
# of processing in the absence of a From_ line.)
#
:0
* ARCHIVING ?? true
{ SELECTED_YEAR = ${STATED_YEAR:-$RECEIVED_YEAR} }
# If processing new inbound non-digest mail,
# trust the server's date more.
#
:0 E
{ SELECTED_YEAR = ${RECEIVED_YEAR:-$STATED_YEAR} }
# The $SELECTED_YEAR variable can be used to
# organize archives into more reasonably
# sized files or folders.
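
For anyone who wants to try the same logic outside procmail, here is a rough
shell sketch of the sed-based alternative and the fallback selection. The
function names (stated_year, received_year, select_year) are made up for
illustration and are not part of procmail or formail. It assumes the year
sits directly before the time in the Date: header, so treat it as an
approximation of the recipes above rather than a drop-in replacement.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not procmail features.

# Print the year claimed in a Date: header line. Prints nothing when no
# 2-4 digit year followed by a time (HH:MM) is found, e.g. XX-XX-XX dates.
stated_year() {
    printf '%s\n' "$1" |
        sed -n 's/^.*[^0-9]\([0-9]\{2,4\}\) \{1,\}[0-9]\{1,2\}:[0-9][0-9].*$/\1/p' |
        sed -e 's/^[1-9][0-9]$/19&/' -e 's/^0[0-9]$/20&/'
}

# Print the 19xx or 20xx year that ends an mbox "From " line, if any.
received_year() {
    printf '%s\n' "$1" |
        sed -n -e 's/^From .*\(19[0-9][0-9]\)$/\1/p' \
               -e 's/^From .*\(20[0-9][0-9]\)$/\1/p'
}

# Pick a year the way the last two recipes do.
# $1 = ARCHIVING value, $2 = stated year, $3 = received year
select_year() {
    case $1 in
        *true*) printf '%s\n' "${2:-$3}" ;;  # archiving: trust the author
        *)      printf '%s\n' "${3:-$2}" ;;  # new mail: trust the server
    esac
}
```

For example, stated_year 'Date: Tue, 03 Feb 98 08:01:02 -0500' prints 1998,
while a date such as 01-02-03 yields an empty string, just as the recipe
does. Two-digit years starting with 0 are taken as 20xx and all other
two-digit years as 19xx, which is the same pivot the scoring conditions use.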
____________________________________________________________
procmail mailing list Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail