At 16:44 2004-05-17 -0400, fleet(_at_)teachout(_dot_)org wrote:
> I'm seeing spam messages that appear to be from one individual (or
> perhaps one piece of software) and that share a specific header order:
>
>   Received:
>   Received:
>   Received:
>   Message-Id:
>   Received:
>
> Is it possible to construct a scoring recipe that would identify messages
> with this header format? Is a message that seems to have a Received: out
> of place indicative of spam, or badly formed, or not abnormal at all?

You could use scoring if you wanted, but scoring is only a means to an end:
matching the header order above as-is doesn't require it.

I've used the following recipe for a long time:

# Received headers AFTER other headers...
:0
* ADVISORIES ?? on
* ^(From|Date|Subject|Reply-To):(.*$)+Received:
{
    # some lists though...
    SPAMVAL="+25"
    SPAMMISHNESS="${SPAMMISHNESS}${SPAMVAL}"
    SPAMNOTES="${SPAMNOTES}SPAM: ${SPAMVAL} Advisory - Interspersed Received: headers${NL}"
}

I should warn you that while a match used to be pretty consistently spam, some
mailing lists that insert their own headers have a tendency to trip it. In my
case, it adds only about 10% of my spam threshold, so it doesn't result in
false positives unless a number of more significant factors are already
present.
Looking back about three weeks, I see a couple dozen list trips, and only
about three actual spams (all of which were scored as seriously spammy even
without this).
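
For anyone who wants to try the condition outside procmail, the test it
performs can be sketched in Python. This is an illustration only, not part of
my filters: the function name has_interspersed_received is made up, and the
sketch assumes its input is just the header block of a single message.

```python
import re

# Headers that, once seen, make any later Received: line suspicious.
# Same set as the recipe condition; Message-Id is deliberately absent.
TRIGGER = re.compile(r"^(From|Date|Subject|Reply-To):", re.IGNORECASE)
RECEIVED = re.compile(r"^Received:", re.IGNORECASE)


def has_interspersed_received(raw_headers: str) -> bool:
    """Return True if a Received: header follows one of the trigger headers."""
    seen_trigger = False
    for line in raw_headers.splitlines():
        if not line:
            break  # blank line: end of the header block
        if line[0] in " \t":
            continue  # folded continuation of the previous header
        if seen_trigger and RECEIVED.match(line):
            return True
        if TRIGGER.match(line):
            seen_trigger = True
    return False
```

Fed the header order from the original question (three Received:, then
Message-Id:, then Received:), this returns False, because Message-Id is not
one of the trigger headers.
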
Note that my recipe above (which morphed from a discussion on this list two
or three years ago, or more) does not use Message-Id. In fact, of all the
headers to include, that would be a bad one: too many systems (or client
hosts) emit messages without a Message-ID of their own, requiring the
receiving (or an intermediate) host to insert one to make the message
RFC-compliant. While I do check for non-users sending messages to MY
mail host for which my mailhost must insert a Message-ID, I know better
than to flag a message submitted by a user to their own ISP's mailhost as
spammy just because that user's host didn't include a Message-ID and their
ISP had to insert one.
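
The distinction can be sketched in Python. Everything here is an assumption
made for illustration: I'm supposing a locally inserted Message-ID can be
recognised by its right-hand side matching the local host name, and the
function names, host names, and user set are all hypothetical.

```python
def message_id_host(message_id: str) -> str:
    """Return the lowercased right-hand side of a Message-ID, or ''."""
    value = message_id.strip().strip("<>")
    _, sep, host = value.rpartition("@")
    return host.lower() if sep else ""


def stamped_here_for_stranger(message_id, sender, local_host, local_users):
    """True if OUR host supplied the Message-ID and the sender is not ours.

    A Message-ID supplied by some other host (for example the sender's own
    ISP) is not flagged, since that is ordinary behaviour.
    """
    stamped_here = message_id_host(message_id) == local_host.lower()
    return stamped_here and sender.lower() not in local_users
```

A message stamped by mail.example.org on behalf of an outside sender would be
flagged; the same message stamped by the sender's own ISP would not.
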
There's no RFC which declares that Received headers must appear before others.
Some mailing lists strip the Received: headers accumulated before submission
to the list, either for privacy or to reduce retransmission overhead (1KB of
extra cruft times 10,000 subscribers adds up; just look at the headers on
the procmail list messages!). This causes issues with some filters. I've
got filters that check whether hosts associated with a sender's freemail
domain (hotmail, yahoo, etc.) appear in the Received: chain, but when
someone uses one of these services to mail a list which strips headers, the
condition gets tripped. To combat this, and some other recipes which are
more prone to tripping on lists, I grant lists an "allowance" for spam
scoring.
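
A rough Python sketch of that check and of the list allowance follows. The
relay table, the threshold of 250 (inferred only from +25 being about 10% of
it), and the allowance of 50 are all placeholder assumptions, as are the
function names; my actual filters are procmail recipes and are not shown here.

```python
# Hypothetical table: freemail domain -> substrings expected somewhere in
# the Received: chain of mail genuinely sent through that service.
FREEMAIL_RELAYS = {
    "hotmail.com": ("hotmail.com",),
    "yahoo.com": ("yahoo.com",),
}


def freemail_mismatch(sender, received_lines):
    """True if a freemail sender's service never shows up in Received:."""
    domain = sender.rpartition("@")[2].lower()
    relays = FREEMAIL_RELAYS.get(domain)
    if relays is None:
        return False  # not a freemail sender; nothing to check
    chain = " ".join(received_lines).lower()
    return not any(relay in chain for relay in relays)


def over_threshold(score, is_list, threshold=250, list_allowance=50):
    """Compare a total score to the threshold, raised for list traffic."""
    limit = threshold + (list_allowance if is_list else 0)
    return score >= limit
```

A list that strips the pre-submission Received: headers makes
freemail_mismatch return True for a legitimate hotmail sender, which is why
the threshold is raised for list mail rather than the check being dropped.
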
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail