mhonarc-users

Re: msgid instead of seq. number for output files

1998-07-30 08:22:14
So before I put it on top of my TODO list ...

Has anybody tried to patch mhonarc to use the msgid for the name of
the file instead of the sequential numbering?  So files

Earl, does it make sense to make this bigger extension for mhonarc v2?
Or would it be better to wait and hack on an alpha of mhonarc v3 (whenever
that will be)?

Ah, before I get asked: If no message-id is given mhonarc should/will
create its own id on the fly as it does now.
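
To make the idea concrete, here is a rough sketch (in Python rather than
mhonarc's Perl, and not taken from mhonarc itself) of turning a Message-Id
into a file name.  The function name, the ".html" suffix and the numbered
fallback are assumptions for illustration only; mhonarc's real behaviour
for a missing id may differ.

```python
from urllib.parse import quote


def output_filename(message_id, fallback_seq):
    """Build an output file name from a Message-Id, or fall back to a number."""
    if message_id:
        # Drop surrounding whitespace and the angle brackets, then
        # percent-encode everything that is unsafe in a file name ("/", "@", ...).
        core = message_id.strip().strip("<>")
        if core:
            return quote(core, safe="") + ".html"
    # Hypothetical fallback: a zero-padded sequence number stands in for
    # whatever id the archiver would generate on its own.
    return "msg%05d.html" % fallback_seq
```

Percent-encoding keeps the mapping reversible, so the original id can be
recovered from the file name if that ever matters.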

If it does this (and it should do it now for the duplicate message checking),
checks should be made for RFC-compliant Message-Id: headers.  A lot of
messages that I get from misconfigured relays don't carry unique Message-Ids,
which breaks the duplicate message checking.
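
A minimal sketch of such a check, in Python for illustration.  The pattern
is only a loose approximation of the "<local@domain>" shape and not the
full RFC grammar, and the function name is made up.  Note that a syntax
check like this catches malformed ids only; it cannot detect a well-formed
id that a relay reuses for different messages.

```python
import re

# Loose approximation of a msg-id: <local@domain>, with no whitespace,
# angle brackets or extra "@" inside.  Not the complete RFC grammar.
MSGID_RE = re.compile(r"^<[^<>@\s]+@[^<>@\s]+>$")


def usable_message_id(value):
    """Return the trimmed Message-Id if it looks well formed, else None."""
    if value is None:
        return None
    value = value.strip()
    return value if MSGID_RE.match(value) else None
```

A caller could treat a None result the same way as a missing header and
fall back to a generated id.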

It may be safer to try matching on the message body instead -- you'll
notice that the messages to this list currently have 3 extra md5-related
headers in them.  All but one will go away (I've basically been doing
experiments), but they essentially provide an md5sum of the message
body.  I use it to filter duplicate messages and incoming spam.

The code is done at the SMTP level in sendmail, but it's equally easy
to write a procmail recipe to insert this header.  I assume that it
would be trivial in perl as well using the md5 module.
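
For illustration, a small Python sketch of the body-digest idea (the
sendmail and procmail versions mentioned above are not shown, and this is
not the code used on this list).  It assumes the body is everything after
the first blank line and normalizes CRLF line endings before hashing; the
function names are made up.

```python
import hashlib


def body_md5(raw_message: bytes) -> str:
    """Return the hex md5 digest of the body of a raw RFC 822 message."""
    # Normalize line endings so the same body hashes identically
    # regardless of how a relay rewrote CRLF.
    normalized = raw_message.replace(b"\r\n", b"\n")
    # The body starts after the first blank line; headers are ignored.
    _headers, _sep, body = normalized.partition(b"\n\n")
    return hashlib.md5(body).hexdigest()


def drop_duplicate_bodies(raw_messages):
    """Keep only the first message seen for each distinct body digest."""
    seen = set()
    kept = []
    for raw in raw_messages:
        digest = body_md5(raw)
        if digest not in seen:
            seen.add(digest)
            kept.append(raw)
    return kept
```

Two copies of a message that differ only in their headers (for example in
the Received: lines) end up with the same digest, which is the point.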

Of course, not everyone would like to use md5sums for this.  It could
really slow things down if you were adding 10000 messages to the archive
and needed to calculate a sum for each one.  So what about the possibility
of choosing which header you want to use for duplicate checking?  Is
that easy to create a resource for?  Is it extensible to Achim's 
suggestion?
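
As a rough sketch of what a selectable duplicate-check header could look
like (Python for illustration; the function and parameter names are
hypothetical and do not correspond to any existing mhonarc resource):

```python
from email import message_from_string


def filter_duplicates(messages, header="Message-Id"):
    """Drop messages whose value for `header` has been seen before.

    Messages that lack the header are always kept, since there is
    nothing to compare them on.
    """
    seen = set()
    kept = []
    for msg in messages:
        value = msg.get(header)  # header lookup is case-insensitive
        key = value.strip() if value else None
        if key is not None and key in seen:
            continue
        if key is not None:
            seen.add(key)
        kept.append(msg)
    return kept


# Usage: the same list filtered on two different headers.
texts = [
    "Message-Id: <1@x>\nSubject: one\n\nbody\n",
    "Message-Id: <1@x>\nSubject: two\n\nbody\n",
    "Subject: one\n\nbody\n",
]
msgs = [message_from_string(t) for t in texts]
by_id = filter_duplicates(msgs)                  # keeps messages 1 and 3
by_subject = filter_duplicates(msgs, "Subject")  # keeps messages 1 and 2
```

With the header name as a parameter, switching between Message-Id and a
precomputed digest header would be a configuration change and not a code
change.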

Chris