Looking for beta testers of experimental anti-spam setup + description

Hello,

        I know my setup is not compatible with the proposed Sieve language
and related measures, but I am looking for people who are interested in
testing the setup I have been using lately, which has stopped nearly all
of my spam. Even if the anti-spam measures eventually adopted on the
internet differ from my setup, this system works for me and the
experiment is interesting enough.

        The setup basically consists of a deny-unless strategy based on
the sender's address, with regex filtering as an option (e.g. to prevent
mailing list messages from ending up in the deny-unless part). I am planning
to distribute this software as freeware.
        
        The software is written with performance as an important aspect. It
must be usable on systems with many users and a heavy mail load. As a
side effect, it can be used as a distributed local mailer.
        
        I am looking for a *few* volunteers to test this system. I have
been using it for the last three months with great success. The only part
that remains untested is the 'many-users part' (I am currently the only
user).
        
        I include the README file:


WHAT TO DO WITH THIS FILE

It is not called README for nothing. Read it at least once.


WARNING

GORD is experimental; there are no pretty interfaces for users. But any
smart ISP could develop one for its clients and use GORD as a basis to
offer this anti-spam service. For me, it just works. I don't see any
spam anymore.


WHAT DO I NEED TO RUN GORD?

Complete perl 5.00404 (preferably with System V IPC) and the following
non-standard modules:
        Data::Dumper
        MLDBM
        GDBM
        libinet
        Mailtools
System administrator privileges to install.

If you can compile C-code, you can install a fast client, instead of the
perl client which has to be compiled every time it is run.


HOW GORD WORKS

GORD is a local server program that acts as a local mailer. Anyone
mailing you for the first time will be denied. GORD sends that sender a
message. If that person answers within a time limit set by you, he or
she will be added to your accept list. As spammers normally do not have
valid e-mail addresses (as they would be found out), this system works
(but: see below for some attacks and possible future changes).

The (Resent-)From: header generated by the sender is essential. There
are some exceptions on this, and they can be handled with the help of
recipes (see below).

The basic system works like this:

1. Someone sends you a message
2. GORD detects this person is unknown to you
3. GORD sends the original sender a message
4. The original sender replies
5. GORD receives the reply and delivers the original message

6. That same sender sends you another message
7. GORD detects that this sender is acceptable and delivers the message
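
The seven steps above can be sketched as a small state machine. This is
only an illustration in Python, not GORD's Perl code: the class name, the
request id layout and the time limit are assumptions, and for simplicity
the sketch sends one request per held message.

```python
import time

class DenyUnless:
    """Toy model of the accept/drop cycle for a single recipient."""

    def __init__(self, time_limit=7 * 24 * 3600):
        self.time_limit = time_limit   # seconds a sender has to reply (assumed)
        self.accepted = set()          # senders already on the accept list
        self.pending = {}              # request id -> (sender, message, time)
        self.mailbox = []              # delivered messages
        self.requests_sent = []        # (sender, request id) pairs
        self._counter = 0

    def receive(self, sender, message, now=None):
        now = time.time() if now is None else now
        if sender in self.accepted:                # steps 6-7
            self.mailbox.append(message)
            return "delivered"
        self._counter += 1                         # steps 1-3
        request_id = "%d.%d" % (int(now), self._counter)
        self.pending[request_id] = (sender, message, now)
        self.requests_sent.append((sender, request_id))
        return "request sent"

    def reply(self, request_id, now=None):
        now = time.time() if now is None else now
        entry = self.pending.pop(request_id, None)
        if entry is None:
            return "dropped"                       # unknown or faked id
        sender, message, sent_at = entry
        if now - sent_at > self.time_limit:
            return "expired"                       # reply came too late
        self.accepted.add(sender)                  # steps 4-5
        self.mailbox.append(message)
        return "delivered"
```

A reply that carries an id the system never issued is simply discarded,
which mirrors the silent drop described further on.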

A spammer cannot misuse this basic system. First: if the spammer uses a
valid from or reply-to address, he will be swamped by complaints (which
is why spammers use stealth mailing). Second, the spammer cannot use a
From address that is acceptable to all people he sends to, and it is
impossible to find out who is acceptable for any single user. And the
unique id's cannot be forged.  GORD silently drops anything that tries
to fake id's. And by not passing bodies in messages it generates itself,
a spammer cannot use the GORD of one system to pass messages to someone
else.

Users may tell it what filtering recipes to use (if any) and its core is
a default behaviour that is based on a 'deny-unless' strategy based on
the sender's address.  For mailing lists etc there are the recipes.

You can be 'not strict' by using a recipe that accepts your proper email
address in the To: header as most spammers do not generate individual
addresses, but send one message with one generic header to millions of
addresses. See below.

GORD uses some special headers and a special header format, which has to
be tuned for every system running GORD. These headers contain unique
id's generated by GORD and a system identification (best is to use the
FQDN). See below.
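
As an illustration of such a header, here is a minimal Python sketch. The
layout "X-Gord-Request: (id) (system)" is inferred from the recipes shown
later in this README; the function names and the example.org system id are
assumptions, and how GORD actually generates its ids is not shown here.

```python
import re

SYSTEM_ID = "example.org"   # assumption: the FQDN of the system running GORD

def format_request_header(request_id, system_id=SYSTEM_ID):
    """Build the header line that is added to a request message."""
    return "X-Gord-Request: (%s) (%s)" % (request_id, system_id)

def parse_request_header(line, system_id=SYSTEM_ID):
    """Return the id if the line is a request header for this system, else None."""
    pattern = r"^X-Gord-Request:\s*\((\d+\.\d+)\)\s*\(%s\)" % re.escape(system_id)
    match = re.match(pattern, line, re.IGNORECASE)
    return match.group(1) if match else None
```

Tuning the header per system then comes down to changing the system id
that is both written and matched.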


HOW FAST IS GORD?

Pretty fast, I think. Once the perl source is compiled, you are working
with a very fast system (fast regex for instance). This is why GORD is a
server: you don't have to start it for every message. The only thing
started for every message is a very small client (which will probably be
an optimized C program) and which could even be built into the MTA. The
server itself only forks, and that is a pretty fast operation.

The client and the server use the IANA-assigned port 312. They talk a
very simple protocol which for the time being has been named VSLMP (Very
Simple Local Mailer Protocol).


HOW SAFE IS GORD?

Pretty safe. First, the perl client will act as local mailer itself if
it cannot find a GORD server (which might have crashed or been killed)
and you have started it with the --safe flag (perl client only).
Besides, the exit value is almost always EX_TEMPFAIL, telling sendmail
to keep the message and not generate a delivery failure.
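
The fallback logic described above could look roughly like this. It is a
Python sketch under assumptions, not the actual client: the two callables
stand in for "talk to the GORD server" and "run the fallback local mailer",
and the only fixed fact is that EX_TEMPFAIL is 75 in sysexits.h.

```python
EX_OK = 0
EX_TEMPFAIL = 75   # from sysexits.h: sendmail keeps the message and retries

def deliver(message, send_to_server, fallback=None):
    """Try the server first, then the optional fallback mailer.

    send_to_server and fallback are hypothetical callables that raise
    OSError when they cannot do their job.
    """
    try:
        send_to_server(message)
        return EX_OK
    except OSError:
        pass                      # server down or unreachable
    if fallback is not None:      # corresponds to running with --safe
        try:
            fallback(message)
            return EX_OK
        except OSError:
            pass
    return EX_TEMPFAIL            # never bounce, let the MTA queue it
```

Returning a temporary failure in every unclear case means a crashed server
costs delay, not mail.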

Secondly, the recipes and such are executed as the recipient user. So,
whatever that user can do (but nothing more) can be done by GORD on
his/her behalf.

Thirdly, GORD is written in Perl with taint checking on.

Fourthly: you can limit access to GORD with a hosts file in the library
directory. This file contains one entry per line. A file with the entry
'localhost' or '127.0.0.1' (without the quotes) will limit access to
clients started locally.
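
A hosts file check of this kind can be sketched in a few lines of Python.
The function names are made up for the example, and what GORD does with an
empty or missing hosts file is not covered here.

```python
import socket

def load_allowed(lines):
    """Resolve each non-empty hosts-file line to a set of IPv4 addresses."""
    allowed = set()
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        try:
            # gethostbyname_ex returns (name, aliases, address list)
            allowed.update(socket.gethostbyname_ex(entry)[2])
        except OSError:
            pass   # assumption: unresolvable entries are skipped
    return allowed

def is_allowed(peer_ip, allowed):
    """True if the connecting client's address is in the allowed set."""
    return peer_ip in allowed
```

A server would call is_allowed() with the peer address of each accepted
connection and close the connection when it returns False.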


HOW GORD SHOULD BE INSTALLED

The GORD client should be your local mailer. As I am writing this, my
sendmail.cf contains the following (GORD is still part of my development
setup and is thus located in /usr/local/src):

Mlocal, P=/usr/local/src/gord/gordclient.pl, F=lsSDFMhPfn, S=10, R=20,
        A=gordclient --verbose=2 $u --safe=/bin/mail_-d_USER
 or

Mlocal, P=/usr/local/src/gord/cclient/gordmail, F=lsSDFMhPfn, S=10, R=20,
        A=gordclient -v -S $u

There needs to be a directory for the system-wide info. Mine is
/usr/local/lib/gord. This directory is set in the file gord.ph.

Then, take the drop.txt from the examples subdirectory and copy it to
the library directory. Edit it. GORD replaces the strings DROPID, USER,
SUBJECT and FROM with appropriate replacements.
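
The placeholder replacement can be pictured as follows. This Python sketch
is an assumption about the mechanism, not GORD's code; it replaces all four
strings in a single pass so that, for instance, a subject containing the
word FROM is not substituted a second time.

```python
import re

PLACEHOLDERS = re.compile(r"DROPID|USER|SUBJECT|FROM")

def fill_drop_template(template, values):
    """Replace each placeholder once; substituted text is never rescanned."""
    return PLACEHOLDERS.sub(lambda m: values[m.group(0)], template)
```

Because the match is on the bare strings, the words DROPID, USER, SUBJECT
and FROM should not appear in drop.txt for any other purpose.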

Make sure the right files are setuid root (run 'make setuid'). If you do
not know what this means, you should not be installing GORD. The setuid
files handle global settings for the daemon, like, who wants to use GORD
and what the global recipes are. The only files that are setuid, are the
files that may write and read the recipe database and the active
database. This database is owned by root and users may start programs
that update their recipes and change their status wrt their use of the
system.


RUNNING THE GORD SERVER

This is what I added to my /etc/rc file, just before sendmail is
started:

(grep '^Mlocal.*gord' /etc/sendmail/sendmail.cf >/dev/null) && \
        (echo -n 'gord'; /usr/local/src/gord/gord.pl --verbose=2 \
                >/dev/console 2>&1 &)
sleep 10;

This starts GORD if gord is the local mailer in my sendmail config.


WHAT YOU HAVE TO DO TO ACTIVATE GORD FOR YOU AS A USER

Create a directory '.gord' (without the quotes) in your home directory.
Here GORD will keep the databases pertaining to you and here it will
read recipes (if any).

Run gord_use.pl with the argument 'yes' (without the quotes). The system
administrator can run gord_use.pl for other users with the --user flag.
From the moment GORD is activated, the drop/accept/dropsilent system
will be active for you. You don't need any recipes; most users will
probably be able to make do with the bare-bones system. (Still to be
checked: permissions and ownership.)


FILTERING

The basic GORD system works almost without filtering. But it can do
heavy filtering if needed. There are two types of filters:

1. System-wide filters (mainly used to catch bounces from GORD
requests).  These filters are in the file
/usr/local/lib/gord/recipes.ascii (human readable version) and
/usr/local/lib/gord/recipes.db. The filters in the ascii file are loaded
into the db with the command
        load_recipes.pl --system

2.  Per-user filters (mainly used to prevent mailing list stuff from
reaching the accept/drop phase). These filters are found in every user's
.gord directory in the file ~/.gord/recipes.ascii (human readable
version). They are loaded in the recipe database by running
        load_recipes.pl

A filter recipe consists of three parts:
        Regex matches on the header
        Regex matches on the body
        Actions

All regular expressions of a recipe must match for its actions to be
performed. All actions must be successful for the recipe to succeed. As
soon as a recipe succeeds, the message is considered delivered. If there
is no action line, the default is to deliver to the user's mailbox.
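
The matching rule can be sketched like this in Python. The data layout
(dicts with "header", "body" and "actions") is invented for the example,
::NOCASE is modelled with a leading (?i), and two points are assumptions:
that actions stop at the first failure and that a failed recipe lets the
next recipe have a try.

```python
import re

def recipe_matches(recipe, header, body):
    """True only if every header regex and every body regex matches."""
    return (all(re.search(p, header, re.MULTILINE) for p in recipe["header"]) and
            all(re.search(p, body, re.MULTILINE) for p in recipe["body"]))

def run_recipes(recipes, header, body, default_action):
    """Return True as soon as one recipe matches and all its actions succeed."""
    for recipe in recipes:
        if not recipe_matches(recipe, header, body):
            continue
        # no action line: fall back to delivery in the user's mailbox
        actions = recipe["actions"] or [default_action]
        if all(action(header, body) for action in actions):
            return True           # message counts as delivered
    return False
```

A recipe with an empty body list, like the mailing list examples below,
matches on its header expressions alone.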


THE ACTIONS

The action lines may have the following formats:
        There are three per-action modifiers :
                ::MBOX
                        BSD mailbox format
                        Add From header if not available, escape From at
                        the start of a line in the body, add an empty
                        line at the end. This is the default behaviour.
                ::RAW
                        Just dump the message raw.
                ::IGNORE
                        Ignore the success or failure of this action (it
                        always succeeds).

        There are the following types of action
        >string         write to file called string
        >>string        append to file called string
        |string         pipe to program called string
        string          write to mailbox called string
        ::DROP          Start the drop/request cycle, or deliver if
                        known sender
        ::KILL          Drop silently.
        ::ACCEPT        Accept this sender
        ::FAIL          Force failure of this action and thus the recipe
                        as a whole
A recipe has succeeded if all the actions were successful.
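
To make the ::MBOX behaviour concrete, here is a small Python sketch of the
three steps it names. It assumes that "From header" means the mbox "From "
separator line, and the default sender and the date format are choices made
for the example only.

```python
import time

def to_mbox(message, sender="MAILER-DAEMON", now=None):
    """Format one message for appending to a BSD mbox file."""
    lines = message.splitlines()
    # 1. add the "From " separator line if the message lacks one
    if not (lines and lines[0].startswith("From ")):
        stamp = time.asctime(time.gmtime(now))
        lines.insert(0, "From %s %s" % (sender, stamp))
    out = [lines[0]]
    in_body = False
    for line in lines[1:]:
        if not in_body and line == "":
            in_body = True                  # first empty line ends the header
        elif in_body and line.startswith("From "):
            line = ">" + line               # 2. escape From at start of a body line
        out.append(line)
    # 3. terminate the last line and add an empty line after the message
    return "\n".join(out) + "\n\n"
```

::RAW would skip all three steps and write the message exactly as received.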


BOUNCES THAT DO NOT INCLUDE THE HEADER

Some mailer daemons do not return the headers of the message that
failed. As a result, the normal way to catch bounces that are the result
of GORD-dropped messages:

::BEGIN
  ::HEADER
    ::NOCASE ^From:.*(DAEMON|MAILER)@
  ::BODY
    ::NOCASE ^X-Gord-Request:\s*\(\d+\.\d+\)\s*\(RnA\.nl\)
  ::ACTION
    ::KILL
::END

does not work. A good example is the daemon from CompuServe. As a
result, a message from such a daemon will not be recognized and will end
up in the ::DROP process. This then generates a GORD drop request
message for that mailer.  There are several obvious ways to handle this.
GORD has system-wide recipes that can handle this, like:

::BEGIN
  ::HEADER
    ^Subject: Undeliverable Message
    ^Sender: CompuServe Postmaster <auto\.reply@compuserve\.com>
  ::BODY
    ::NOCASE ^X-Gord-Request:\s*\(\d+\.\d+\)\s*\(RnA\.nl\)
  ::ACTION
    ::KILL
::END


USING A RECIPE THAT ACCEPTS ALL MAIL REALLY ADDRESSED TO YOU

Since spammers normally do not generate a valid To: field containing
your real email address, you could use a recipe to accept all mail that
is really addressed to you:

::BEGIN
  ::HEADER
    ::NOCASE ^(To|Cc):.*your_address@your_site
  ::BODY
  ::ACTION
::END

This has the advantage of not immediately having to annoy all the people
that send you mail with administrative actions. On the other hand,
spammers will soon discover this potential leak and start sending
individual messages with individual To:  headers. So, you might have to
skip this method in the end.


WHAT TO DO WHEN YOU WANT TO SUBSCRIBE TO A MAILING LIST

Each mailing list you are subscribed to needs a special recipe. That is
because you do not know who will be posting to it and you want to accept
everything posted on the list. What you have to do is the following
series of actions:

1. Temporarily turn off GORD for you. Run
        gord_use.pl no
2. Subscribe to the mailing list
3. When the first posting arrives, create a new entry in your
recipes.ascii file. In that recipe, recognize the special entry that
defines a message from the list, as in:

::BEGIN
  ::HEADER
    ^Sender: owner-liste@nexttoyou\.de
  ::BODY
  ::ACTION
    |/usr/local/bin/appnmail MailingLists/NEXTTOYOU
::END

In this example, I pipe the message to a program that writes it to
NeXTmail style message boxes, but you could of course just enter a file
there which will then be the mailbox file for that mailing list. Note
that if you are a client of an ISP and read your mail using POP3, using  
multiple mailboxes is somewhat of a no go.


HOW SPAMMERS COULD ATTACK

Spammers can get around this system by starting to mail to mailing
lists. If this system is successful, this will put pressure on
mailing lists to become moderated, which is a trend anyway.

Spammers could also fake these message headers. This is especially nasty
if the spammers could request the list of subscribers of a mailing list.
Requesting the list of subscribers is a breach of privacy anyway, so it
should not be possible. Well, internet has to mature anyway ;-)

Spammers could add multiple headers for every known mailing list. For
those, GORD could be changed to stop messages with too many headers of
the same type.
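
Such a check does not exist in GORD yet. As a rough idea of what it could
look like, here is a Python sketch; which headers to watch and the limit
are pure assumptions (headers such as Received repeat legitimately and are
therefore not watched).

```python
from collections import Counter

def has_too_many_repeats(header_text, watched=("sender",), limit=3):
    """True if any watched header name occurs more than `limit` times."""
    names = []
    for line in header_text.splitlines():
        if line[:1] in (" ", "\t") or ":" not in line:
            continue                         # continuation line or not a header
        name = line.split(":", 1)[0].strip().lower()
        if name in watched:
            names.append(name)
    return any(count > limit for count in Counter(names).values())
```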

You can decide to install a non-strict version by creating a recipe for
mail that is really directed to you, as described above. Most spammers do
not send individual messages.  If they start to do this, this trick to
be friendly will not work anymore.

Spammers could also decide to use a From: address that *is* valid but is
just not their own. The point is: it would not help them as the request
would end up somewhere else. But it could produce a lot of invalid
acceptance requests.  Luckily, when that poor person also has GORD, GORD
will normally just drop the invalid request.


EXAMPLE FILTER SETUPS WITH COMMENTS:

For the system wide filters, I currently use:

====================================================
::USER __GORD__

# Kill bounces from stealth mailings
::BEGIN
  ::HEADER
    ::NOCASE ^From:.*(DAEMON|MAILER)@
  ::BODY
    ::NOCASE ^X-Gord-Request:\s*\(\d+\.\d+\)\s*\(RnA\.nl\)
  ::ACTION
    ::KILL
::END

# Several ISP's generate non-standard bounces. Kill these too.
::BEGIN
  ::HEADER
    ^From: Mail Administrator <Postmaster@gte\.net>
    ^Subject: Mail System Error - Returned Mail
  ::BODY
    ::NOCASE ^X-Gord-Request:\s*\(\d+\.\d+\)\s*\(RnA\.nl\)
  ::ACTION
    ::KILL
::END

::BEGIN
  ::HEADER
    ^Subject: Undeliverable Message
    ^Sender: CompuServe Postmaster <auto\.reply@compuserve\.com>
  ::BODY
    ::NOCASE ^X-Gord-Request:\s*\(\d+\.\d+\)\s*\(RnA\.nl\)
  ::ACTION
    ::KILL
::END
====================================================

For my personal filters I use (example):

====================================================
#Recipes for user gerben:
#Generated by dump_recipes.pl at Thu Jan  8 21:14:49 1998

::USER gerben

::BEGIN
  ::HEADER
    ^(To:\s*(isoc-moderator@AWT\.nl|isoc-nl-mod@isoc\.nl)|From:\s*\"L-Soft list server at SURFnet)
  ::BODY
  ::ACTION
    |/usr/local/bin/appnmail MailingLists/ISOC-Moderation
::END

::BEGIN
  ::HEADER
    ^Sender:\s*owner-ietf-mta-filters@imc\.org
  ::BODY
  ::ACTION
    |/usr/local/bin/appnmail MailingLists/IETF-MTA-Filters
::END

::BEGIN
  ::HEADER
    ^Sender:\s*drawbridge(-owner)?@net\.tamu\.edu
  ::BODY
  ::ACTION
    |/usr/local/bin/appnmail MailingLists/Drawbridge
::END

# some stuff removed to make me a bit less vulnerable, for instance
# I recognize local messages by Message-ID and let them pass.
# Using the message ID to accept local messages and replies is vulnerable
# for use by spammers. In the long run, such recipes must go.

# This last filter I put in for testing. I save the message and then
# do the action which would have been performed anyway.
# This recipe matches any message. This setup drops all spam and all
# denied messages into a special mailbox.
# Note again that I have to use appnmail; if your mail program understands
# normal BSD-style mailboxes, you could just give the name of the mailbox.

::BEGIN
  ::HEADER
    .
  ::BODY
  ::ACTION
    |/usr/local/bin/appnmail /Users/gerben/Mailboxes/GORD/Dropped
    ::DROP
::END

====================================================


CLOSING REMARKS (KNOWN BUGS ETC)


SYSV IPC

Under a very heavy load, my NEXTSTEP 3.3 system fails in the networking
code.  This crashes my system. Also, under heavy load, my perl server
program will hang and can only be killed with SIGKILL. These errors
originate in the OS.  I have added System V IPC handling to limit access
to the server program. Very stable systems may do without the SysV IPC.
Anyway, it works like this:
        The server creates a semaphore with value 4 (configurable)
        The clients use this semaphore to limit concurrent access to the
        server
As a result I can start 100 background jobs trying to deliver a
message and the server will not die and the system will not crash. Perl
users on systems without SysV IPC can still use this system (after a
little adaptation); the semaphore is only there to get around bugs on my
system.
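
The effect of the semaphore can be demonstrated with a small Python sketch.
It uses threads and threading.BoundedSemaphore as an in-process stand-in
for the separate client processes and the System V semaphore, so it shows
only the idea of capping concurrent access at 4, not the real IPC calls.

```python
import threading
import time

def run_jobs(job_count, max_clients=4):
    """Start job_count clients; return the peak number talking at once."""
    slots = threading.BoundedSemaphore(max_clients)   # semaphore with value 4
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def client():
        with slots:                    # blocks while max_clients are busy
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.001)          # stand-in for talking to the server
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=client) for _ in range(job_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

Calling run_jobs(100) starts all 100 clients at once, yet the returned peak
never exceeds max_clients.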


KNOWN BUGS

Currently, if someone unknown sends you more than one message before
replying to the request, this person only gets one request and only the
first message (the one that triggered the request) is delivered. The
other messages are saved, but never delivered. I'll repair this shortly.


MY ADDRESS

(c) 1997, 1998
Gerben_Wierda(_at_)RnA(_dot_)nl

--
Gerben_Wierda(_at_)RnA(_dot_)nl (Gerben Wierda)
"If you don't know where you're going, any road will take you there"
Paraphrased in Alice in Wonderland, originally from the Talmud.

Dass man fuer die Philosophie ein Interesse zeigt, bezeugt noch keine
Bereitschaft zum Denken -- Martin Heidegger