procmail

Re: Procmail experiments -- good methods

1999-05-16 02:21:22
On 15 May 1999 05:25:22 -0700, Harry Putnam <reader(_at_)newsguy(_dot_)com>
wrote:
Original headers start immediately below the ^BEGIN line but are not
in unix format.  The "From " line and "Return-Path" lines are missing.

Are you sure you need them? Both? (Usually the contents of Return-Path
are derived at delivery from the From_ line.)

Actually I was under the impression that MH didn't use From_ lines at
all, but since I don't use MH myself, this should definitely be taken
with a grain of salt.

##Remove stuff added by archive server
:0fhbwc
* Subject: archive retrieval: latest/[0-9]+
| sed  -e '1,/BEGIN------------cut here-------------/d' \
       -e '/END--------------cut here-------------/,$d' \
##Replace unix message format ('^From ' and '^Return-Path: ') lines 
|awk 'BEGIN {"date"|getline;print "From apollo-list-request(_at_)redhat(_dot_)com", $0 "\nReturn-Path: <apollo-list-request(_at_)redhat(_dot_)com"};{print}'

The f and c flags are usually not meaningful at the same time. My
advice is to take out the c flag.

(The comments in the action part are of course not part of the real
live recipe you're using, correct?)

Since the significance of the date on the From_ line is pretty much
zero here (if it isn't, how about you derive it from the Date: header
of the digested message [a very imprecise science] instead of the time
when you happened to process it?), you might as well fill it in with a
bogus date instead. That means you can use the same sed script for all
your processing. (Just put in something like Thu Jan 01 00:00:00 1970.
Or if you need a real date, how about producing all the headers you need
with date(1) -- something like

    date -d "whatever" +'From apollo-list-request(_at_)redhat(_dot_)com  %c
Return-Path: <apollo-list-request(_at_)redhat(_dot_)com>'

I think there are usually two spaces before the date stamp. On my
system, date +%c returns a date string with a time zone indicator,
whereas the From_ lines lack the time zone part. You can of course
construct a better date format using date(1)'s primitives; see man date.
... I also happened to notice you're missing a close bracket on the
Return-Path.)
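
For what it's worth, here is a rough sketch of how the suggestion above
might look as a single shell filter. It is untested against a real
archive server. The address list-request@example.com, the function
names from_stamp and unwrap, and the epoch default are all made up for
illustration, and -d is a GNU date extension, so this assumes GNU date.
It uses printf rather than sed to emit the two headers, simply because
that is easier to read.

```shell
#!/bin/sh
# Sketch only. SENDER, from_stamp and unwrap are hypothetical names;
# the -d option assumes GNU date.

SENDER='list-request@example.com'   # placeholder envelope sender

# Print a time stamp in the layout From_ lines use (no time zone).
# With no argument, fall back to the bogus epoch date.
from_stamp() {
    LC_ALL=C date -u -d "${1:-1970-01-01 00:00:00 UTC}" \
        +'%a %b %e %H:%M:%S %Y'
}

# Read an archive-server reply on stdin. Emit a From_ line and a
# Return-Path: header, then the wrapped message with everything up to
# the BEGIN marker and from the END marker onwards removed.
unwrap() {
    stamp=$(from_stamp "$1") || return 1
    # Two spaces between the address and the date stamp.
    printf 'From %s  %s\nReturn-Path: <%s>\n' "$SENDER" "$stamp" "$SENDER"
    sed -e '1,/BEGIN------------cut here-------------/d' \
        -e '/END--------------cut here-------------/,$d'
}
```

Something like "unwrap < reply.txt" would use the bogus date, and
"unwrap '1999-05-15 05:25:22 -0700' < reply.txt" a real one (printed in
UTC). In a recipe this would take the place of both the sed and the awk
stage.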

This is working like I wanted but seems slow.  I suspect I'm
making procmail do more work than is really needed.  The incoming
messages total about 7MB and take a couple of minutes to process.

Procmail isn't exactly blazingly fast, in my experience. Getting rid
of the extra awk script should help some, but probably won't make a
world of difference. How long does it take if you just deliver those
to /dev/null with no processing whatsoever?

/* era */

-- 
.obBotBait: It shouldn't even matter whether     <http://www.iki.fi/era/>
I am a resident of the state of Washington. <http://members.xoom.com/procmail/>
 * Sign the European spam petition! <http://www.politik-digital.de/spam/en/> *