[Don't read this, Eli; it will anger you.]
Michael Helm asked,
| I'm trying to deal with a mailing list that has some subscribers
| with problem user mail agents. These prefix the subject with
| various junk strings that trip up threading & sorting in conventional
| mailers, eg Re[8]: my subject or Sv: My subject or FW: stuff
| What's worse is that they combine like a genetics experiment:
| Often see Re: Re[2]: Sv: FW: nothing
| & similarly ugly combinations.
| I can get rid of these junk strings separately, but what I want to
| do is process an incoming message until they are *all* gone. ...
| What I'd like to do is process the messages recursively until
| they stop matching these rules, & then move on, but I've never been
| able to figure out how to get procmail to do that. I'd prefer not
| to overload my brain with even more complex regular expression stuff,
| gets very difficult to understand or change. What can I do?
|
| Any suggestions appreciated.
Michael said the magic word: recursion. As Era Eriksson explained to Martin
Ramsch, when procmail assigns MATCH it is stingy to the left of \/ and greedy
to the right of it, so we can't do the simple thing. [Well, we can when there
are no colons in the significant part of the subject:
* ^Subject:.*\/[^:]+
but it isn't easy to guarantee that.] So, here goes (untested) -- put this
into your main rcfile:
# If one of the prefixes is Re: or an equivalent, we want to end with one Re:.
:0 # caret, asterisk, and second left bracket are literal
* ^Subject:(.*\>)?\/Re[[*^:].*
{ SUBJECT=$MATCH FOUND_A_RE=yes INCLUDERC=/path/to/.stripsubjectrc }
# A relative path is also acceptable; it will be assumed to start from
# $MAILDIR.
# Otherwise, we want to end with no prefix at all, so FOUND_A_RE is named
# without an equals sign, which unsets it.
:0E
* ^Subject:(.*\>)?\/(FW|Sv):.*
{ SUBJECT=$MATCH FOUND_A_RE INCLUDERC=/path/to/.stripsubjectrc }
Now, .stripsubjectrc should be a separate file, looking something like this:
:0fwh # Did the last recursion finish the job? Then do the fix and return.
* ! SUBJECT ?? ^^(FW|Sv|Re(\[[0-9]*]|\^[0-9]*)?):[ ]*\/[^ ].*$
| formail -I"Subject: ${FOUND_A_RE:+Re: }$SUBJECT"
:0E # $MATCH is now one prefix shorter, so try again.
{ SUBJECT=$MATCH INCLUDERC=$_ }
Recursion depth is limited by the number of file descriptors your kernel
will allow, but that shouldn't be a problem if the prefixes are not allowed
to build up in the first place.
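
Since the recipes above are untested, it may help to check the intended
behavior outside procmail first. What follows is a rough Python sketch of the
same logic, not a translation procmail would accept: the function names
clean() and strip_subject() are made up for illustration, the regexes only
approximate procmail's matching (in particular \> and the stingy/greedy
split around \/), and the character class [ \t] assumes the brackets in
.stripsubjectrc were meant to hold a space and a tab.

```python
import re

# Rough stand-ins for the two conditions in the main rcfile.  procmail
# matches case-insensitively by default, hence IGNORECASE.
FIRST_RE = re.compile(r"(?<!\w)Re[\[*^:].*", re.IGNORECASE)
FIRST_FW = re.compile(r"(?<!\w)(?:FW|Sv):.*", re.IGNORECASE)

# Stand-in for the condition in .stripsubjectrc: one junk prefix at the very
# start.  The capture group plays the role of \/ and $MATCH.
PREFIX = re.compile(
    r"(?:FW|Sv|Re(?:\[[0-9]*\]|\^[0-9]*)?):[ \t]*([^ \t].*)$",
    re.IGNORECASE,
)


def strip_subject(subject, found_a_re):
    """Mimic .stripsubjectrc: peel one prefix per call, then recurse."""
    m = PREFIX.match(subject)
    if m is None:
        # Nothing left to strip: rebuild the subject, with a single "Re: "
        # in front only if a Re-style prefix was seen at the start.
        return ("Re: " if found_a_re else "") + subject
    return strip_subject(m.group(1), found_a_re)


def clean(subject):
    """Mimic the main rcfile: pick the Re branch or the FW/Sv branch."""
    m = FIRST_RE.search(subject)
    if m:
        return strip_subject(m.group(0), True)
    m = FIRST_FW.search(subject)
    if m:
        return strip_subject(m.group(0), False)
    return subject


if __name__ == "__main__":
    for s in ("Re: Re[2]: Sv: FW: nothing", "FW: stuff", "Re[8]: my subject"):
        print(repr(s), "->", repr(clean(s)))
```

If the sketch and the recipes disagree on a real subject line from the list,
trust neither until the message has been run through procmail with
VERBOSE=yes and the log compared.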