procmail

Fwd: How to delete "duplicate mails" (in Outlook)

2002-07-10 00:50:20
1. Be advised that the VERSION of formail has a (big) effect on 
   duplicate-message-id email filtering (as shown below).

# Remove duplicate mails (older formail)
:0 Wh: msgid.lock 
| (/bin/sed -e 's/^Message-ID:  /Message-Id: /') | \
/usr/local/bin/formail \
-D 8192 $PMDIR/msgid.cache
# traps duplicate message ids after stabilizing
# case and leading white spaces. :(

vs
# Remove duplicate mails (newer formail)
:0 Wh: duplicates.lock
* ?formail -D 65536 msgid.cache
duplicates
# traps duplicate message ids in a 64kb cache regardless 
# of case or leading white spaces. :-)

2. AFAIK, these duplicate-message-id's are caused in an environment
   of Exchange servers and listserv/majordomo lists.

3. BTW, it drives the Outlook/Exchange users batty, but not
   (thankfully) the sendmail users (who employ procmail).

jjg

...
The question in this thread is, can the duplicate problem 
also be solved by outlook/exchange?
...
Date: Tue, 09 Jul 2002 19:18:04 +0100
To: "Mihai T. Lazaarescu", Cadence_procmail_users
From: Andrew Beeckett 
Subject: Re: FW: How to delete "duplicate mails" (in Outlook)

Hi Mihai,

Ah that would explain it. My formail seems to be ancient (probably
something like 1994!). The recipe that I use was one that was
thrown together in a hurry and handles the current variants that
appear, rather than the Rolls-Royce solution which would
handle all cases. As I've found, it's the leading spaces
which are the problem, not the case anyway, so sed would be
fine for that (although the pattern should be smarter to allow
a variable number of spaces).
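
A minimal sketch of the "smarter" pattern mentioned above, assuming a POSIX sed. The function name normalize_msgid_spacing is hypothetical and does not come from the thread. It keeps the recipe's Message-ID to Message-Id rewrite but accepts any run of spaces or tabs after the colon:

```shell
# Hypothetical helper (not from the thread): like the sed in the recipe,
# but tolerant of any amount of whitespace after "Message-ID:".
normalize_msgid_spacing() {
    # [[:space:]]* matches zero or more spaces or tabs after the colon,
    # so one, two or twenty leading blanks all collapse to a single space.
    sed -e 's/^Message-ID:[[:space:]]*/Message-Id: /'
}

printf 'Message-ID:    <1234@example.com>\n' | normalize_msgid_spacing
# prints: Message-Id: <1234@example.com>
```

In the recipe this would stand in for the /bin/sed stage ahead of formail. It still matches only the exact spelling "Message-ID:", as the original pattern does.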

Thanks for the input!

Regards,

Andrew.

At 07:58 PM 7/9/2002 +0200, Mihai T. Lazaarescu wrote:
Andrew,

I guess it's probably a different formail version you are using.
Mine (formail v3.22 2001/09/10) is ignoring both the case and
the leading white spaces:

formail -D 65536 cucu.cache <<EOF
Message-ID: 1234
EOF
echo $?
1

formail -D 65536 cucu.cache <<EOF
MeSSaGe-Id:                    1234
EOF
echo $?
0

Of course this is of no use for Outlook users, but it can lighten
up the procmail rules and processing for sendmail users.

Moreover, AFAIK, sed cannot match patterns case-insensitively.
I'd suggest replacing it with gawk: gawk --assign IGNORECASE=1
'/^message-id: / {print $1, $2}' | formail -D... for those
whose version of formail is not able to zap the leading spaces.
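
A rough sketch of the same idea using only POSIX awk features (tolower, substr, sub), for systems without gawk. The function name normalize_msgid_header is hypothetical. Unlike the one-liner above, it passes every other header line through unchanged, which is an assumption about what the formail stage wants to see:

```shell
# Hypothetical helper (not from the thread): rewrite any capitalisation of
# the Message-ID header to a canonical form with exactly one space.
normalize_msgid_header() {
    awk '
        # Compare the line in lower case so any capitalisation matches.
        tolower($0) ~ /^message-id:/ {
            value = substr($0, length("message-id:") + 1)
            sub(/^[ \t]+/, "", value)    # drop leading spaces and tabs
            print "Message-ID: " value
            next
        }
        { print }                        # all other lines pass through
    '
}

printf 'MeSSaGe-Id:                    1234\n' | normalize_msgid_header
# prints: Message-ID: 1234
```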

Regards,

Mihai

On Tue, 9 Jul 2002, Andrew Beeckett wrote:

Mihai,

First of all, your recipe is not in essence any different from mine
(more below).

Secondly, it does not solve the problem for outlook users, which
is really what the issue is; we _know_ how to solve it for
sendmail-based email users.

Anyway, more on the recipe. I was surprised about what you said
about case, so I did some experiments. I found that actually
the problem was not to do with the change of case that listserv
introduces, but the insertion of an additional space before the
message id itself.

For example:

formail -D 65536 /tmp/stuff.cache <<EOF
Message-ID: 1234
EOF
echo $status
1

formail -D 65536 /tmp/stuff.cache <<EOF
Message-Id: 1234
EOF
echo $status
0

(indicates that the message should be deleted, or whatever).

Now:

formail -D 65536 /tmp/stuff.cache <<EOF
Message-ID:  1234
EOF
echo $status
1

So it is the additional space before the number that causes
the problem. My rule with the sed in it removes this extra space,
and hence works. Your recipe doesn't and hence would not cope
with the listserv induced problems (it will work fine provided
that all the duplicates were delivered by listserv, or
if all were delivered by some combination of traditional
aliases and majordomo, but not if listserv + aliases/majordomo).
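
To make the point concrete, here is a toy stand-in for a duplicate cache. It is not how formail is implemented; the function name seen_before and the one-id-per-line cache file are assumptions made for illustration. It normalises the header name and leading blanks before comparing, and mirrors the exit codes in the transcripts above (1 for a first sighting, 0 for a duplicate):

```shell
# Hypothetical stand-in for a Message-ID cache; NOT the formail algorithm.
# Reads headers on stdin. Returns 0 for a duplicate, 1 for a new id.
seen_before() {
    cache=$1
    # Extract the id, ignoring header capitalisation and leading blanks.
    id=$(awk 'tolower($0) ~ /^message-id:/ {
                  v = substr($0, 12)        # text after "message-id:"
                  sub(/^[ \t]+/, "", v)     # strip leading spaces and tabs
                  print v
                  exit
              }')
    [ -n "$id" ] || return 1                # no id found: treat as new
    if [ -f "$cache" ] && grep -Fxq -- "$id" "$cache"; then
        return 0                            # id already recorded: duplicate
    fi
    printf '%s\n' "$id" >> "$cache"         # remember the id for next time
    return 1
}
```

With this normalisation, "Message-ID: 1234" followed by "Message-Id:  1234" counts as a duplicate. Comparing the raw text after the colon would treat them as different, which matches the behaviour seen with the older formail. The sketch never trims the cache, whereas formail -D takes a maximum cache length.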

Still, none of this solves the problem for outlook users,
which is what the issue being raised here was.

Regards,

Andrew.

At 12:53 PM 7/9/2002 +0200, Mihai T. Lazaarescu wrote:
This rule:

    :0 Wh: duplicates.lock
    * ?formail -D 65536 msgid.cache
    duplicates

traps duplicate message ids in a 64kb cache regardless of case
or leading white spaces. :-)

Cheers,

Mihai

On Mon, 8 Jul 2002, Andrew Beeckett wrote:

Joe,

We know the reason for it; this has been a problem for
a long time (both with listserv and majordomo). It's quite
understandable (to me) why it happens (others are less
tolerant ;-> ).

However, with sendmail-based email, we had a solution; you
can use procmail to delete the duplicates before they even
end up in your inbox, using a procmail rule such as:

# Remove duplicate mails
#
:0 Wh: msgid.lock
| (/bin/sed -e 's/^Message-ID:  /Message-Id: /') | \
/usr/local/bin/formail \
         -D 8192 $PMDIR/msgid.cache

(Note, the case conversion stuff is to handle Listserv, which changes
the case of the message-id bit).

The question in this thread is, can the duplicate problem also be
solved by outlook/exchange?

Regards,

Andrew.

At 02:13 PM 7/8/2002 -0700, Joe Dexon wrote:
The question asked about duplicate mails is answered by Kaijun Zhaan
below. I apologize in advance for the broad distribution of names
within the listed aliases. I will not mass email you on this issue in
the future.

Joe

-----Original Message-----
From: Kaijun Zhaan
Sent: Monday, July 08, 2002 1:26 PM
To: Joe Dexon
Subject: RE: How to delete "duplicate mails" (in Outlook)

Joe - it is because the user is in aliases which are on Listserv. If
one email is sent to 3 aliases, one on Exchange and two on Listserv,
then the original mail arrives at Exchange, which first delivers the
mail to the user (since one alias is on Exchange), then sends it to
Listserv. Listserv sends each mail individually, so the user gets
multiple copies. thx, Kaijun

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail

  • Fwd: How to delete "duplicate mails" (in Outlook), John Gianni <=