procmail

A (possibly) novel way of dealing with spam/UBE

1998-02-25 06:45:39
I'm sure spam filtering is not a new topic to this group, but the method
I'm currently using might be.  I'll briefly state the idea below along
with some issues I have yet to solve.  Your comments would be appreciated.


So far all the spam filters I've seen work on the principle of allowing
mail through by default, but excluding mail which meets certain criteria
(from specific address or pattern, containing certain content, etc).

The problem there is that it is difficult if not impossible to teach a
computer to accurately decide if a mail is spam or not.  Generally you
just catch a small percentage of them with some rules, and then constantly
chase after the overspill with a "kill" list, always one step behind the
spammer.  Also, you might exclude mail which is really not spam.


I'm in the process of implementing a scheme which:

  - Accepts mail if from a recognized source

  - Bounces all other mail back, with a rejection letter attached
    explaining how to become a "recognized" source (basically resend
    the message with a special keyword in the subject).  Optionally
    logs the action for future reference.

This filters out spam because the spammer is either not going to
bother registering, or will not get the rejection letter in the first
place due to forged headers in the spam message (rejection letters
which bounce back to you are discarded by the filter).
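
As a rough illustration of the decision described above (not the
actual procmail recipes), here is a minimal Python sketch.  The
function name classify, the shape of the "recognized" set, and the
rule that a returned rejection letter arrives from a MAILER-DAEMON or
postmaster address are all assumptions made for the example.

```python
from email import message_from_string
from email.utils import parseaddr

# Assumption: bounced rejection letters come back from one of these
# local parts. A real setup may need a different test.
BOUNCE_LOCAL_PARTS = ("mailer-daemon", "postmaster")


def classify(raw_message, recognized):
    """Return 'accept', 'discard', or 'reject' for one incoming message.

    recognized is a set of lowercase addresses (hypothetical registry).
    """
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()

    # Mail from a recognized source is delivered untouched.
    if sender in recognized:
        return "accept"

    # A rejection letter that bounced back (forged spam headers) is
    # dropped silently, so we never answer a bounce with a bounce.
    if not sender or sender.split("@")[0] in BOUNCE_LOCAL_PARTS:
        return "discard"

    # Everything else gets the rejection letter explaining how to register.
    return "reject"
```

The "discard" branch matters: without it, a rejection letter sent to a
forged address could come back and trigger another rejection letter.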

The beauty of this method is that the filter becomes very accurate
in identifying spam because it is a human making the decision -- the
sender themselves!  Any mail which makes it past the filter is
guaranteed to be from a real person who is giving you individual
attention -- the opposite of spam by definition.

The new sender will have to take this extra step only once, since
upon doing so their address gets stored in your "recognized" list,
allowing subsequent mail from them to be accepted as is.  The
registration process is painless (little more than hitting the "reply"
button), and most new senders won't mind because, after all, they
are initiating the contact in the first place.
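
The one-time registration step could be sketched roughly like this in
Python.  The keyword value "sift-register" and the function name are
hypothetical; the real keyword would be whatever the rejection letter
tells the sender to put in the subject.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical keyword announced in the rejection letter.
KEYWORD = "sift-register"


def try_register(raw_message, recognized):
    """Add the sender to the registry if the subject carries the keyword.

    Returns True when the message registered its sender (and should be
    delivered), False when it should still get the rejection letter.
    """
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    subject = msg.get("Subject", "").lower()

    if sender and KEYWORD in subject:
        recognized.add(sender)  # later mail from this address is accepted as is
        return True
    return False
```

Matching the keyword anywhere in the subject means a plain reply to
the rejection letter ("Re: ... sift-register") is enough, provided the
letter's own subject already contains the keyword.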

And when the new sender is replying to a mail from you, they may not
have to register at all, assuming you have pre-registered them (or
their entire domain) ahead of time, anticipating a reply.  You can
do this manually, or even use an "outgoing filter" to do this
automatically for any mail you send out.
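
An "outgoing filter" of this kind might look something like the
following Python sketch.  It assumes the registry is a set of
lowercase strings in which an entry beginning with "@" stands for a
whole domain; that convention and both function names are invented
for the example.

```python
from email import message_from_string
from email.utils import getaddresses


def preregister(raw_outgoing, recognized):
    """Register every To:/Cc: recipient of a message we are sending."""
    msg = message_from_string(raw_outgoing)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    for _name, addr in getaddresses(fields):
        if addr:
            recognized.add(addr.lower())
    return recognized


def is_recognized(address, recognized):
    """True if the address, or its whole domain ('@domain'), is registered."""
    address = address.lower()
    domain = "@" + address.rpartition("@")[2]
    return address in recognized or domain in recognized
```

With this in place, a reply to your own mail passes is_recognized()
on arrival and the correspondent never sees a rejection letter.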

So far I've got this working well using about 3 recipes and a few extra
files (the address registry list and a few form letters).  The only thing
I haven't found a good solution for is how to deal with mail you receive
from mailing lists.  That is basically an exception to the rule because
with most lists the mail will appear to come from the original authors,
not the list itself, yet you don't want to keep sending rejection letters
to everyone who posts to the list you are subscribed to (not good, for
many reasons).

It comes down to a problem of detecting a message as being from a list,
and bypassing the spam filter.  The trick is how to do this consistently
and in such a way that you don't have to have a separate special-case
recipe for each list you are subscribed to (or join in the future).  I'm
currently thinking about using the "From " line (no colon) or possibly
even the Received: headers (ugh) to make this determination.
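
One way the "From " line idea might work, sketched in Python rather
than as a recipe: treat a message as list traffic when its envelope
sender differs from the From: author and looks like a list
management address.  The address patterns below (owner-LIST,
LIST-request, and so on) are an assumption about how lists commonly
name their envelope senders, not something every list follows.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Assumed naming patterns for list envelope senders.
LIST_SENDER = re.compile(
    r"^(owner-[^@]+|[^@]+-(request|owner|admin|bounces?))@", re.IGNORECASE
)


def looks_like_list_mail(raw_message):
    """Guess whether a message (with its mbox 'From ' line) came via a list."""
    msg = message_from_string(raw_message)

    # The "From " line (no colon) holds the envelope sender as its
    # second whitespace-separated field.
    parts = (msg.get_unixfrom() or "").split()
    envelope = parts[1].lower() if len(parts) > 1 else ""

    author = parseaddr(msg.get("From", ""))[1].lower()

    return (
        bool(envelope)
        and envelope != author
        and LIST_SENDER.match(envelope) is not None
    )
```

This avoids a recipe per list, but a spammer who forges a
list-style envelope sender would also get through, so it may be
worth additionally checking the envelope domain against the lists
you actually subscribe to.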


Any suggestions would be appreciated, either about the mailing list
identification problem or about the general concept of blocking spam
through this kind of "self-registering inclusion filter" (SIFT).

Thanks,
Paul Smith
