# The disadvantage with this method is its complexity: an additional
# "pending" file is needed, and there is a small performance hit for
# each mail, in order to maintain the pending file.
#
# The advantage of this method is its completeness: all duplicates
# will still be detected and dropped.
#
# Both methods require the use of the TRAP command; they set additional
# commands into the TRAP variable. If the TRAP command has already been
# set, the new commands are added to the list of commands.
As a non-expert, I'm asking rather than recommending--what would be wrong with:
1. Store in $DUP_ID whether formail -D #### ID.cache reports a duplicate.
2. Move Message-Id: <xxxxx@yyyyy> to Orig-Message-Id: <xxxxx@yyyyy>.
3. Do the white space/signature/etc. filtering and compute the checksum
"asdfghjk". (This is where I'm skeptical of claims of "completeness": I
don't think it's possible to automate away every kind of perverse
variation people might come up with, e.g., quoting methods, signature
formats, conversion of Macintosh fonts into ISO Latin-1, etc.)
4. Add Message-Id: <asdfghjk@MD5>.
5. Store in $DUP_MD5 whether formail -D #### MD5.cache reports a duplicate.
6. Move Message-Id: <asdfghjk@MD5> to X-Checksum: <asdfghjk@MD5>.
7. Put Message-Id: <xxxxx@yyyyy> back the way it was.
8. Do whatever you want with the message based on the values of $DUP_ID and
$DUP_MD5.
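
To make the question concrete, here is a rough Python sketch of the same flow, not a procmail recipe and not anyone's actual implementation. It makes several assumptions: two in-memory sets stand in for ID.cache and MD5.cache, the function names (`normalized_checksum`, `seen_before`, `classify`) are made up for illustration, messages are single-part plain text, and the "filtering" is only signature removal plus whitespace stripping. Because the sketch hands each key to its cache directly, the header shuffling in steps 2, 4, 6 and 7 (needed only because formail -D keys on Message-Id) collapses into recording the checksum as X-Checksum.

```python
import hashlib
import re
from email import message_from_string


def normalized_checksum(body: str) -> str:
    """MD5 of the body with the signature and all whitespace removed."""
    kept = []
    for line in body.splitlines():
        # Assumption: a line consisting of "--" (optionally followed by
        # whitespace) starts the signature; everything after it is dropped.
        if line.rstrip() == "--":
            break
        kept.append(line)
    squeezed = re.sub(r"\s+", "", "\n".join(kept))
    return hashlib.md5(squeezed.encode("utf-8")).hexdigest()


def seen_before(key: str, cache: set) -> bool:
    """Stand-in for `formail -D`: report whether key is cached, then store it."""
    duplicate = key in cache
    cache.add(key)
    return duplicate


def classify(raw_message: str, id_cache: set, md5_cache: set):
    """Return (dup_id, dup_md5, message) for one raw message."""
    msg = message_from_string(raw_message)
    if msg.is_multipart():
        raise ValueError("sketch only handles single-part messages")

    # Step 1: duplicate by the original Message-Id?
    message_id = (msg.get("Message-Id") or "").strip()
    dup_id = seen_before(message_id, id_cache)

    # Steps 3-5: duplicate by checksum of the filtered body?
    checksum = "<%s@MD5>" % normalized_checksum(msg.get_payload())
    dup_md5 = seen_before(checksum, md5_cache)

    # Step 6: keep the checksum on the message; Message-Id is untouched.
    msg["X-Checksum"] = checksum
    return dup_id, dup_md5, msg
```

Step 8 is then whatever the caller does with the two flags. Two copies of a post that differ only in Message-Id, spacing, and signature come back with dup_id false and dup_md5 true for the second copy, while anything the filter does not normalize (quoting style, character-set conversion) still slips through, which is exactly the doubt raised in step 3.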