Re: Use scoring to determine header format?

At 16:44 2004-05-17 -0400, fleet(_at_)teachout(_dot_)org wrote:

I'm seeing spam messages that appear to be from one individual (or
perhaps one piece of software) that have a specific header format:

Received:
Received:
Received:
Message-Id:
Received:

Is it possible to construct a scoring recipe that would identify messages
with this header format?  Is a message that seems to have a Received: out
of place indicative of spam, or badly formed, or not abnormal at all?

You could use scoring if you wanted. Scoring is a means to an end; matching the above as-is certainly doesn't require scoring.
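
For the exact sequence in the question, a plain condition is enough. What follows is a minimal, untested sketch, assuming procmail's usual header matching, where $ matches the line break. The variable name ODDORDER is made up for illustration. Note that it also matches when other headers sit between the ones named:

# Sketch: Received: ... Message-Id: ... Received:, in that order
:0
* ^Received:(.*$)+Message-Id:(.*$)+Received:
{
        # ODDORDER is a hypothetical flag for later recipes to test
        ODDORDER=yes
}

The scoring form of the same test would prefix the condition with a weight such as 1^0, which gains nothing when there is only one condition.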


I've used the following recipe for a long time:

# Received headers AFTER other headers...
:0
* ADVISORIES ?? on
* ^(From|Date|Subject|Reply-To):(.*$)+Received:
{
        # some lists trip this too, so the score is kept modest
        SPAMVAL="+25"
        SPAMMISHNESS="${SPAMMISHNESS}${SPAMVAL}"

        SPAMNOTES="${SPAMNOTES}SPAM: ${SPAMVAL} Advisory - Interspersed Received: headers${NL}"
}

I should warn you that while this used to be pretty consistently spam, some mailing lists, which insert headers, have a tendency to trip it. In my case, I'm only adding about 10% of my threshold for it, so it doesn't result in false positives unless there are a number of more significant factors already.

Looking back about three weeks, I see a couple dozen list trips, and only about three actual spams (all of which were scored as seriously spammy even without this).

Note that my recipe above (which morphed from a discussion on this list two or three years ago or more) does not use Message-Id. In fact, of all the headers to include, that would be a bad one: too many systems (or client hosts) emit messages without their own Message-ID, requiring the receiving (or intermediate) host to insert its own to make the message RFC-compliant. Now, while I check for non-users sending messages to MY mail host for which my mailhost must insert a Message-ID, I know better than to flag a message submitted by a user to their own ISP's mailhost as spammy just because that user's host didn't include a Message-ID and their ISP had to insert one.


There's no RFC which declares that Received headers must appear before others.

Some mail lists strip the Received: headers from before submission to the list (either for privacy or to reduce retransmission overhead: 1KB of extra cruft times 10,000 subscribers adds up; just look at the headers on the procmail list messages!), which causes issues with some filters. I've got filters that check to see if hosts associated with a sender's freemail domain (hotmail, yahoo, etc.) appear in the Received: chain, but when someone uses one of these services to mail a list which strips headers, the condition gets tripped. To combat this, and some other recipes which are more prone to tripping on lists, I grant lists an "allowance" for spam scoring.
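
As a rough illustration of such an allowance, here is an untested sketch that reuses the variable names from the recipe above. The header test and the -20 value are assumptions chosen for the example, not actual settings:

# Hypothetical list allowance: credit messages that carry list headers
:0
* ^(List-Id|List-Post|Mailing-List):
{
        # illustrative value only; tune it against your own threshold
        SPAMVAL="-20"
        SPAMMISHNESS="${SPAMMISHNESS}${SPAMVAL}"
        SPAMNOTES="${SPAMNOTES}SPAM: ${SPAMVAL} Allowance - mailing list${NL}"
}

Since those headers are trivially forged, an allowance like this is safer when it is small relative to the threshold, or keyed to lists you actually subscribe to.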


---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail