
Re: Editing text of messages via procmail

2001-11-19 17:38:20
At 16:09 2001-11-19 -0600, Tim Roberts wrote:
> I'm rather new to procmail and to Unix in general.
> I subscribe to several mailing lists. Most of which, deliver in a digest format.
> They always have a lot of administrative info before "Today's Topics".

Find something which positively delimits the administrative info -- if it runs from the top of the body to the "Today's Topics" identifier, then so be it.

I cobbled the following together and tested it against the text at the top of an available message. You might try taking this and tweaking it (be sure to see the material about test configuration at the URL in my .sig).

This is more or less a bastardized version of something David wrote (see below), but I've quickly hacked this to cut stuff from the top of a message, rather than the bottom.

:0
* ^From:.*somelist_or_whatever_condition
{
        STRING="-----BEGIN PGP"         # do not include opening left anchor
        DIVIDER=1                       # to remove additional lines appearing
                                        # BELOW $STRING as well.

        # copy everything from $STRING through the end of the body into $MATCH
        :0B
        * $ ^()\/$STRING(.*$)*^^
        {
        }

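        # score = number of lines in $MATCH (less 2 when DIVIDER is set);
        # tail then keeps only that many lines from the end of the body
        :0Bbfwi # do not use `r' flag here if you will be saving in mbox format!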
        * 1^1 MATCH ?? ^.*$
        * $ -${DIVIDER:+2}^0
        | tail -$=
}

If you plan to use this, I would suggest thoroughly testing it first -- I haven't tested it beyond throwing a few quick messages at it to see that it does chop text from the tops. THIS FILTER IS NOT SOMETHING I USE, so don't rely on it being good just because someone posted it for you...
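
As a rough idea of what such a test setup can look like, here is a minimal sandbox rcfile. It is only a sketch: the directory and file names (procmail-test, striptop.rc, sample.msg) are hypothetical, and it assumes the recipe above has been saved into a file of its own. It is not the configuration described at the URL in the .sig.

```procmail
# testrc: throwaway rcfile for trying a recipe on saved messages.
# Run it by hand as:  procmail ./testrc < sample.msg
# All paths below are hypothetical; the directory must already exist.
SHELL=/bin/sh
MAILDIR=$HOME/procmail-test

# Log every condition and action to a file kept inside the sandbox.
LOGFILE=$MAILDIR/test.log
VERBOSE=yes

# The recipe under test only filters, so the (possibly rewritten)
# message falls through and is stored here for inspection.
DEFAULT=$MAILDIR/result.mbox

# Pull in the recipe under test from its own file.
INCLUDERC=$MAILDIR/striptop.rc
```

Feeding it a saved copy of a digest should leave the result in result.mbox and a trace of each condition in test.log, so the recipe can be checked without involving your regular .procmailrc.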

> i'd be grateful for any suggestions

The list archives are accessible through a link at <http://www.procmail.org>. You should search them for terms related to your inquiry.

David Tamkin posted a rather nice freemail (yahoo|hotmail|etc) adbanner stripper some years back. That's what the above is loosely based on, except it's in reverse (the above trims the TOP of the message out, not the bottom). You'll need to pipe to a sed script to eliminate adbanners in the midst of messages. I haven't done a lot of that because I found several lists changed formats just infrequently enough to cause grief.
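
For banners in the midst of a message, the sed approach might look something like the following. It is a sketch only: BANNER-START and BANNER-END are hypothetical marker lines standing in for whatever text a given list really uses to bracket its banner, and the condition is the same placeholder used in the recipe above.

```procmail
# f = treat the pipe as a filter, b = feed only the body to sed,
# w = wait for sed's exit code before accepting the filtered result
:0 fbw
* ^From:.*somelist_or_whatever_condition
| sed -e '/^BANNER-START$/,/^BANNER-END$/d'
```

Note that if the closing marker never shows up, sed deletes everything from the opening marker through the end of the body, which is one more reason to try this on saved messages first.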

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
