
Sorting several lists into a common folder -- with a nice pretty name (was Re: Recipe problem)

2004-08-26 23:44:32
Chuck Campbell wrote:
[...] I'm certainly learning more than I thought :-)

I just wish I'd resist the urge to post when tired, but yes it is always
educational!

[...]
:0:
* 9876543210^0 ^TOlinux-xfs
* 9876543210^0 ^TO_linux-xfs
* 9876543210^0 ^Sender:.*linux-xfs
* 9876543210^0 ^X-list:.*linux-xfs
* 9876543210^0 ^FROM_DAEMONlinux-xfs
* 9876543210^0 ^From:.*linux-xfs
[...]

Am I correct in assuming that the first "hit" on any of the conditions would set a score larger than procmail's infinity, and skip the rest?

I'll meekly venture forth a "yes" to that, which provides a nice way to
match on the "best" header to use, rather than simply the last match as my example did. I wrote a rule that may match more than one header in typical (for me) messages (fragment):

:0
* 9876543210^0 ^Delivered-To: mailing list \/.*@yahoogroups\.com
* 9876543210^0 ^List-Id: +\/.*
* 9876543210^0 ^List-Unsubscribe: \/.*@

By using the "jumbo" value per your suggestion, a message that has
headers matching multiple patterns will set $MATCH using the
"preferred" header. For example, a yahoogroups message yields:

listname@yahoogroups.com             - for the 1st
<mailto:listname-unsubscribe@        - for the 3rd

I much prefer the cleaner result of the 1st, so your suggestion is
ideal. Stop at the 1st (best) match.
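
To try the preference order outside procmail, here is a rough shell analog of the three conditions above. It is not procmail and does not use scoring; it simply tries the patterns in order and stops at the first hit. The function name `list_ident` is made up for illustration, the yahoogroups pattern assumes the archive's (_at_) and (_dot_) stand for @ and ., and unlike procmail the sed patterns are case-sensitive.

```shell
#!/bin/sh
# Rough analog of "stop at the first (best) match": print the list
# identifier taken from the most preferred header present in a message file.
# list_ident is a hypothetical helper name, not part of procmail.
list_ident() {
    msg=$1
    # Patterns in order of preference, most specific first.
    for pat in \
        's/^Delivered-To: mailing list \(.*@yahoogroups\.com\).*/\1/p' \
        's/^List-Id: *\(.*\)/\1/p' \
        's/^List-Unsubscribe: \(.*@\).*/\1/p'
    do
        # Scan only the header block: quit at the first empty line.
        hit=$(sed -n -e '/^$/q' -e "$pat" "$msg" | head -n 1)
        if [ -n "$hit" ]; then
            printf '%s\n' "$hit"
            return 0
        fi
    done
    # No list header found.
    return 1
}
```

Running `list_ident saved-message.txt` on a saved message prints the identifier that would end up in $MATCH under these assumptions, which makes it easy to check which header "wins" for a given list.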

Note though that I'm restricting this to using only headers that ONLY appear for lists. Clever matching of combinations of Precedence: Bulk plus To:/From: etc. might help with lists that don't use these.

So now I've got $MATCH set to a meaningful, unique value per list.

I am working out how to "map" from a matched LISTNAME to an appropriate folder name.

Like you, I like to send many lists to a common folder, so I've wanted
to map each list identifier to a folder independent of conventions used
in the matching header. The following (very ugly) approach seems to work:

1. I created a file (~/listmap) containing lines like:

foldername:list1checksum:list2checksum

2. I use the aforementioned unique identifier for each list to create a
checksum using md5sum (kludgy, but this is just for testing):

        LISTNAME=$MATCH
        LISTHASH=`echo $MATCH | md5sum | cut -f 1 -d " "`

So now I've got a checksum value that is readily matched, independent of word spacing or punctuation used in the list identifier matched previously. I'm sure there's a faster, better way to derive a unique key for each list identifier string, but this does work.

3. I search for a match for that checksum in the listmap file to
determine a destination folder:

        LISTFOLDER=`grep $LISTHASH ~/listmap | cut -f 1 -d ":"`

4. And I set a sane default if none is found:

        LISTFINAL=`echo ${LISTFOLDER:-default}`

5. By inserting some debugging headers, I can embed the needed checksum value into each list message easily enough, so I just need to cut & paste the appropriate values into ~/listmap:

        :0 fw
        | formail -A "$PROCMAILHEADER List hash is $LISTHASH" \
          -A "$PROCMAILHEADER List final destination is $LISTFINAL"

        :0
        $LISTFINAL
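
Steps 2 through 4 can also be exercised from a shell prompt before wiring them into a procmailrc. The following is a minimal sketch under some assumptions: `list_hash` and `list_folder` are hypothetical helper names, `md5sum` is the GNU coreutils tool, and the map file (same folder:checksum:checksum format as in step 1) is passed as an argument rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helpers mirroring steps 2-4; not part of procmail.

# Step 2: checksum of a list identifier.
# Note: the identifier is quoted here, so whitespace is hashed as-is
# (an unquoted echo would collapse runs of spaces first).
list_hash() {
    printf '%s\n' "$1" | md5sum | cut -f 1 -d " "
}

# Steps 3 and 4: look the checksum up in the map file and print the
# folder name, falling back to "default" when there is no entry.
list_folder() {
    ident=$1
    map=$2
    hash=$(list_hash "$ident")
    # Checksums always follow a colon in the map, so anchor on ":".
    # -F treats the pattern as a fixed string rather than a regex.
    folder=$(grep -F -- ":$hash" "$map" 2>/dev/null | head -n 1 | cut -f 1 -d ":")
    printf '%s\n' "${folder:-default}"
}
```

For example, `list_hash "listname@yahoogroups.com"` prints the value to paste into the map file, and `list_folder "listname@yahoogroups.com" ~/listmap` shows where such a message would be filed.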

Note, I'm not asking for this, I'm working on it. When I get into trouble, I'll ask for help :-)

Same here, but I like to compare approaches and get feedback. I'm still experimenting, and haven't pushed this into use yet.

The only problem is I am on a few lists that require insertion of specific headers to work well with my mail client of choice, so this may not be satisfactory over the long term. Still, it does seem to provide a way to easily write one procmail rule that can sort a variety of lists into folders with great flexibility.

- Bob

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
