procmail

formail -D generates empty msg. when duplicate found?

1999-12-11 00:33:40

Hello, I'm trying to run a script that takes a mailbox and
filters out any duplicate messages that appear in it.

formail -D 100000 id.cache -s procmail -m script.rc < mbox > filtered_mbox

where id.cache is the cache used to detect duplicate Message-IDs, and
script.rc is a procmail script that does some further filtering on the
output.

What I see happening (v3.14, also 3.10) is that formail does detect
duplicate messages.  However, when it finds a duplicate, it ends
up treating it as an empty message, prepends a 'From foo@bar' line
to it, and then passes this message on to procmail and the
script.rc script.  I think it should just ignore the duplicate
e-mail and move on.  Is this a bug?
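
In case it helps anyone reproduce this, here is a minimal sketch of a test case. The function names, the addresses, and the 8192-byte cache size are all made up for illustration; only the -D and -s options come from the command above. It builds a two-message mbox in which both messages share one Message-ID, then counts how many times formail starts the child program.

```shell
#!/bin/sh
# Hypothetical helper: print a tiny mbox whose two messages
# share a single Message-ID (addresses and IDs are invented).
make_dup_mbox() {
    printf 'From alice@example.com Sat Dec 11 00:00:00 1999\n'
    printf 'Message-ID: <dup-1@example.com>\n'
    printf 'Subject: first copy\n\nbody one\n\n'
    printf 'From alice@example.com Sat Dec 11 00:00:01 1999\n'
    printf 'Message-ID: <dup-1@example.com>\n'
    printf 'Subject: second copy\n\nbody two\n\n'
}

# Hypothetical helper: split the test mbox with duplicate detection
# and print how many times formail invoked the child program.
# The child swallows its input and prints one line per invocation.
count_delivered() {
    cache=$(mktemp) || return 1
    make_dup_mbox |
        formail -D 8192 "$cache" -s sh -c 'cat >/dev/null; echo x' |
        wc -l | tr -d ' '
    rm -f "$cache"
}
```

Calling count_delivered should show whether the second copy is dropped or still handed to the child. Replacing the child with plain "cat" would show exactly what, if anything, is passed along for the duplicate.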

The workaround is simple: in script.rc, check for and discard
the bogus message:

#
# If the Message-ID is duplicated, formail passes in a message
# with an empty body and a one-line 'From foo@bar' header.  This
# is a bug, I think - the workaround is to check for it and
# just dump the message.  However, this won't work well
# for malformed mailboxes.
#
:0 D
* ^From foo@bar
/dev/null

but this way of handling things runs into problems if, for example,
formail adds the dummy 'From foo@bar' line for other reasons.

PS: Are there any guidelines on sizing the Message-ID cache?