We don't have any way today to be sure that nobody sends mail
directly to the archiver address, so the archive can contain mail that
was not distributed by the list. Any solution to this?
I created a user account on my web server for the archives. Each
list that I archive includes this user account (well, an alias
pointing to this user account). The user account has a .procmailrc
file that
a) archives all incoming mail
b) performs filtering (e.g. removes free Hotmail sigs, checks
for X-no-archive: Yes headers, etc.)
c) pipes the message into a Perl script
The Perl script then loads all of the headers into an associative
array and checks the headers against a file that looks like this:
finder-develop  internal  From  owner-finder-develop
perennials      external  From  owner-perennials
woodyplants     external  From  owner-woodyplants
Column 1: name of the directory to put the message into
Column 2: a key for the directory prefix, looked up in another file (so
external = /mnt/WWW/mallorn/htdocs/lists,
internal = /mnt/WWW/internal/htdocs/lists, etc.)
Column 3: header to match
Column 4: the string to search for in the header named in column 3
This way, all I have to do to archive a new list is add an entry
to this file and subscribe the address to the mailing list. Messages
that don't match any of the headers in this file get passed on
directly to me (which has only happened twice -- once by programming
error, and once because someone was trying to be clever).
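For illustration, here is a rough Python sketch of that lookup. It is not the actual Perl script; the function names and the in-memory tables are made up for the example. It also assumes column 3 names an ordinary header such as From:. If the match is really against the mbox "From " envelope line, msg.get_unixfrom() would be the thing to check instead.

```python
from email import message_from_string

# Rule table in the four-column format shown above.
RULES_TEXT = """\
finder-develop internal From owner-finder-develop
perennials external From owner-perennials
woodyplants external From owner-woodyplants
"""

# Column 2 keys mapped to real directory prefixes (the "other file").
PREFIXES = {
    "external": "/mnt/WWW/mallorn/htdocs/lists",
    "internal": "/mnt/WWW/internal/htdocs/lists",
}


def parse_rules(text):
    """Split each non-empty line into (directory, prefix, header, needle)."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        directory, prefix, header, needle = line.split(None, 3)
        rules.append((directory, prefix, header, needle))
    return rules


def archive_dir(raw_message, rules, prefixes):
    """Return the archive directory for a message, or None if no rule matches."""
    msg = message_from_string(raw_message)
    for directory, prefix, header, needle in rules:
        # get_all() returns every occurrence of the header (case-insensitive).
        values = msg.get_all(header, [])
        if any(needle in value for value in values):
            return prefixes[prefix] + "/" + directory
    # No match: the caller should forward the message to a human instead.
    return None
```

A None result corresponds to the "passed on directly to me" case, so an unexpected message never lands in an archive silently.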
So, the gist of it is that I would check all messages that come
into a given alias and make sure that the headers show they really
came from the list they claim to be from. More checks could be introduced, such
as checking the Received: headers to make sure that they pass through
the mailing list server, etc. Of course, a tenaciously malicious
individual could always push their messages onto your archives
until email authentication becomes a standard, but these simple
checks should be adequate for most cases.
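A Received: check along those lines might look something like the Python sketch below. The function name and the host lists.example.org are placeholders, not anything from my setup. Keep in mind that Received: lines added before the message reaches your own server can be forged, so this only raises the bar a little.

```python
from email import message_from_string


def passed_through(raw_message, list_host):
    """True if any Received: header mentions list_host (hypothetical helper)."""
    msg = message_from_string(raw_message)
    needle = list_host.lower()
    for received in msg.get_all("Received", []):
        # Headers may be folded across lines; collapse whitespace first.
        flat = " ".join(received.split()).lower()
        if needle in flat:
            return True
    return False
```

For example, passed_through(raw, "lists.example.org") would be called before the rule lookup, and a False result would send the message to a human rather than to the archive.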
Chris