
Re: HELP: What headers to watch for spam (was Re: anybody see this spam?)

1997-05-08 10:35:00
(feel free to forward into the SPAM-L list - I have yet to receive my
subscription confirmation).

Spam is a big issue with me, as I'm sure it is with many others.
Additional RELIABLE tools for reducing spam traffic into my mailbox are
important.  If I'm beating a dead issue here, feel free to let me know.

At 10:40 AM 5/8/97 +0300, era eriksson wrote thusly:
> On Thu, 8 May 1997 10:08:20 +0300 (EET DST), I wrote:
> <... much snipped ...>
> > X-Uidl: 9876543fgt9821376nb0988mm632xx321
> The observation that spam often contains X-Uidl headers was
> interesting. I haven't been paying attention to this.

Clarification - it isn't so much that spam often contains the header (a lot
of spam doesn't), but that when the header is present, it is often spam.

> (This was in reference to an earlier message on Procmail-L about how
> incoming spam seems to have this header because it will crash some
> mail clients if you try to forward the message or reply to it. If you

The general reactions I've seen are:  Eudora fails to mark a folder as
having new, unread messages in it if the only message(s) in it contain
X-UIDL (although the message itself is marked unread), and Pegasus, when
downloading headers only, fails to delete the message from the server when
you mark it for deletion - you actually have to connect to the POP server
with Telnet and manually issue the DELE commands.  This latter case is what
I suspect is the primary reason for having the X-UIDL header - the spam has
the potential to stay in the person's mailbox for a long time this way...

> use POP yourself, your own POP program will probably be adding this
> header and in that case you can't, obviously, use that as a criterion
> for spam matching.)

"Probably" is a bit harsh.  I'd say that before deciding to use this, one
should determine whether, at the point where they filter, the header is
always present in their environment.  This doesn't appear to always be the case
(and for POP mail clients, Eudora is arguably one of the more commonly used
-- and it doesn't add such a header, even though I *DO* use UIDL for
reading between home/office).  Some folk have mentioned that Netscape mail
has an X-UIDL header, which appears to be added AFTER the message has been
received (I don't appreciate them doing it this way - the mailer is adding
headers that weren't part of the original SENT message - they should be
adding the extra data as a non-visible field).  I believe Pegasus adds it
to the mailbox file, but (and I'm not positive here) has it preceding the
main header - separated by a blank line or two.  This would imply (to me at
least), that Pegasus doesn't treat it as part of the header - just as
message-associated data (the data also includes message size).

Either way, since the filtering we are discussing here is being done at the
PROCMAIL level, we are presumably ahead of the POP client software (is there a
Procmail-compatible POP3 MUA?).  My suggestion is, take a look at all your
saved email and examine it for characteristics in the spam (I manually file
all spam that gets by my defences just so at a later date, I can examine it
and improve the filtering approach).  Mileage may vary -- people should be
familiar with their operating environments - if your mail service suddenly
starts adding X-UIDL headers in the mailbox, you could be in a world of hurt.

As a consequence of this discussion, I scanned ALL messages in 61MB of
archived mail (about three months, attachments removed) for X-UIDL.  The
results?  I have three messages INSIDE a digested list which contain the
header (as well as an "X-Mozilla-Status" header - which doesn't appear in
the spam messages, so it _could_ then be employed to unmark it as spam in a
recipe).  There is one non-spam discussion message in procmail employing it
(and *THAT* is a quoted message - not the original, so I suspect someone's
mail client added it, then they forwarded THAT into the list).
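
A scan along these lines could be sketched in Python with the standard
mailbox module.  This is only an illustration, assuming the archive is kept
in mbox format; the function name and the two sample messages are made up
for the example and are not taken from my real archive.

```python
import mailbox
import os
import tempfile


def scan_mbox_for_header(path, header="X-UIDL"):
    """Return (total, hits): the number of messages in the mbox at `path`
    and the subjects of those whose HEADER block contains `header`."""
    total = 0
    hits = []
    box = mailbox.mbox(path)
    try:
        for msg in box:
            total += 1
            # Header lookup is case-insensitive and never looks at the body,
            # so an X-UIDL line quoted inside a digest is not counted.
            if msg[header] is not None:
                hits.append(msg.get("Subject", "(no subject)"))
    finally:
        box.close()
    return total, hits


# Hypothetical two-message archive, for illustration only.
SAMPLE = (
    "From spammer@example.com Thu May  8 10:00:00 1997\n"
    "From: spammer@example.com\n"
    "Subject: MAKE MONEY FAST\n"
    "X-UIDL: 9876543fgt9821376nb0988mm632xx321\n"
    "\n"
    "Buy now.\n"
    "\n"
    "From friend@example.org Thu May  8 10:05:00 1997\n"
    "From: friend@example.org\n"
    "Subject: digest\n"
    "\n"
    "Quoted header follows:\n"
    "X-UIDL: abc123\n"
    "\n"
)

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "archive.mbox")
    with open(sample_path, "w") as fh:
        fh.write(SAMPLE)
    total, hits = scan_mbox_for_header(sample_path)
    print(total, "messages;", len(hits), "with", "X-UIDL", "in the header:", hits)
```

Only the first sample message is reported, because the second carries the
X-UIDL line in its body rather than its header.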

Note that since I file the messages as spam based on X-UIDL appearing in
the HEADER, I don't get caught up with messages discussing something in the
BODY - I haven't been bitten by doing this, since as I've said - EVERY
message I've received with an X-UIDL header has been spam (the three
messages in the Digest are effectively part of the BODY - and there is the
x-mozilla workaround there anyway).
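
The header-only test, with the X-Mozilla-Status exception, might look
something like the sketch below.  It is Python rather than a procmail
recipe, purely to show the logic (procmail conditions examine the header by
default); the function name and the sample messages are hypothetical.

```python
from email import message_from_string


def looks_like_spam(raw_message):
    """Hypothetical rule: X-UIDL in the HEADER means spam, unless
    X-Mozilla-Status is also present (assumed to be added by a mail
    client after delivery, not by the spammer)."""
    msg = message_from_string(raw_message)
    has_uidl = msg["X-UIDL"] is not None
    has_mozilla = msg["X-Mozilla-Status"] is not None
    return has_uidl and not has_mozilla


# Made-up messages covering the three cases discussed above.
spam = "Subject: offer\nX-UIDL: 123abc\n\nBuy now.\n"
digest = "Subject: digest\n\nQuoted message:\nX-UIDL: 123abc\n"
client_copy = "Subject: hi\nX-UIDL: 123abc\nX-Mozilla-Status: 0001\n\nHello.\n"

for name, raw in (("spam", spam), ("digest", digest), ("client copy", client_copy)):
    print(name, "->", looks_like_spam(raw))
```

Whether X-Mozilla-Status is a safe exception depends on your own mail, so
check it against your archive before relying on it.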

Fighting spam is an empirical process.  You need to examine things for
apparently unique characteristics, then check that those characteristics
don't interfere with other, valid messages.  In the case of
using Procmail to filter - check what PROCMAIL sees, not what your mail
client software sees after procmail has handled it.

I encourage anyone planning on using any header to get more information and
to examine their own mail archives to see whether such patterns are unique
to the mail they wish to filter.  This process should be a given before
taking other people's recommendations and applying them to one's own mail -
there are too many variables in play to assume that what works for one
person will work for another.

I'm still interested in seeing any documented references to X-UIDL, should
anyone come across them.

  • Re: HELP: What headers to watch for spam (was Re: anybody see this spam?), Professional Software Engineering - Lists account <=