At 13:17 2006-11-28 -1000, Michael J Wise wrote:
> Here's a hint:
You know, you can always talk specifics. If the writers of these things
really followed this forum in the first place, then their crap wouldn't be
so trivial to identify.
> What does a typical Message-Id: header as generated by Microsoft
> Outlook actually look like?
Generally speaking, I don't like to try to track what some given MUA's
formatting is, least of all M$ crap. There's a general "is it formatted
within norms" test here already.
> Compare and contrast that with the one that is typically used by these
> bozos.
Well, across _many_ (but certainly not _all_) of the messages I see hitting
one account, the last sequence of digits in the messageid is constant:
$6c822ecf
However, this may have more to do with how the spew software in question
generates messageid strings - every one of those messages has "debora" or
"deborah" as the first username token. There are other submissions with
"thebat.net" as the host token of the messageid, and the first portion of
the messageid is markedly different from the ones with the common token -
not just the content, but the formatting. For example:
Message-ID: <186111616.05945262530648@thebat.net>
Message-ID: <01c71181$1e5d8fd0$6c822ecf@deborahmomslilangel1>
In running my filter against mail directed to another host, I get similar
results, though I note that the token associated with deborah messages does
appear on messages where the hostname token doesn't contain deborah:
From: "Michel Mccoy" <ydumwvnlaqqf@bouncycastlesales.com>
Message-ID: <01c70de7$8d9c0f90$6c822ecf@ydumwvnlaqqf>
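For anyone who wants to eyeball their own sample set offline, here is a rough Python sketch (not procmail, and not part of my filter) that tallies the trailing token. The three-token pattern is only an assumption drawn from the samples above, with the archive's (_at_)/(_dot_) obfuscation undone; the names THREE_TOKEN and trailing_token are made up for illustration.

```python
import re
from collections import Counter

# Shape seen in the samples above: <hex$hex$hex@host>.
# Assumption: eight hex digits per token, as in those samples; this is
# not taken from any specification of what a given MUA emits.
THREE_TOKEN = re.compile(
    r"^<([0-9a-f]{8})\$([0-9a-f]{8})\$([0-9a-f]{8})@([^>\s]+)>$",
    re.IGNORECASE,
)

def trailing_token(message_id):
    """Return the third $-separated hex token, or None for any other shape."""
    m = THREE_TOKEN.match(message_id.strip())
    return m.group(3).lower() if m else None

samples = [
    "<186111616.05945262530648@thebat.net>",
    "<01c71181$1e5d8fd0$6c822ecf@deborahmomslilangel1>",
    "<01c70de7$8d9c0f90$6c822ecf@ydumwvnlaqqf>",
]

# Count how often each trailing token shows up across the sample set.
counts = Counter(t for t in map(trailing_token, samples) if t)
print(counts.most_common())  # [('6c822ecf', 2)]
```

The thebat.net ID simply falls through as None, which matches the observation that its formatting is different in kind, not just in content.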
In any event, the recipe below is catching the variants on this. My rule
of thumb is to avoid looking for hard-coded strings and to avoid body
searches, and this recipe does neither.
Where the username token is duplicated, the messageid string is completely
identical (though so is the target email account across the duplicate
sequences I've checked, even if it isn't the same across the entire sample
set). My guess is that the messageid is a hash of the From: address (and
possibly the To:).
Matching for a specific value isn't my game, and I don't much feel like
maintaining a cache to compare with either.
Note that as posted, my quickly drawn up recipe lacked anchor carets at the
start of Subject: and From:, and should be more like so:
# 20061127 username_wrote
# subject consists of "username wrote:" where the username token is the
# first word of the quoted name text in the From: field.
:0
* ^Subject:[ ]*\/[^ ][a-z]* wrote:$
* MATCH ?? ()\/[^ ][a-z]*
* $ ^From:[ ]\"$MATCH
{
    SPAMVAL="+75"
    SPAMMISHNESS="${SPAMMISHNESS}${SPAMVAL}"
    SPAMNOTES="${SPAMNOTES}SPAM: ${SPAMVAL} username wrote${NL}"
}
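For those who don't read procmail regexes, the same test can be approximated in Python. This is a loose translation for illustration only: the function name is made up, whitespace is assumed to be space or tab, and the captured word is escaped before reuse (procmail's $ substitution does not do that). The Subject: line in the usage example is invented; the From: line is the one quoted earlier, with the archive obfuscation undone.

```python
import re

def username_wrote(subject_line, from_line):
    """True when Subject: is '<word> wrote:' and From: opens with "<word>."""
    # procmail matches case-insensitively by default, hence IGNORECASE.
    m = re.fullmatch(r"Subject:[ \t]*([^ ][a-z]*) wrote:", subject_line,
                     re.IGNORECASE)
    if not m:
        return False
    word = m.group(1)
    # The quoted display name in From: must start with the same word.
    pattern = r'From:[ \t]"' + re.escape(word)
    return re.match(pattern, from_line, re.IGNORECASE) is not None

from_line = 'From: "Michel Mccoy" <ydumwvnlaqqf@bouncycastlesales.com>'
print(username_wrote("Subject: Michel wrote:", from_line))   # True
print(username_wrote("Subject: Deborah wrote:", from_line))  # False
```

Nothing here depends on a hard-coded name or on the body; the only thing compared is one header against another.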
Indicators on almost every message:
SPAM: +50 message-id domain does not match sender domain
SPAM: +50 message-id domain not in received chain
SPAM: +150 No rDNS for host passing message to our MX
SPAM: +175 IP 88.241.203.7 listed in dialup DNSBL
SPAM: +125 relay hostname appears to be consumer dialup/broadband
Some of the less common indicators, but still appearing on many:
SPAM: +50 Cleartext recipient is common target here
SPAM: +50 Envelope sender is a two letter TLD
SPAM: +35 from_domain not found in received chain
SPAM: +100 Date is suspicious at 91290 seconds {001 01:21:30} AFTER reception
SPAM: +5 spam type statements (5)
SPAM: +45 MIME - multipart/related
SPAM: +300 Foreign character set encoding (iso-8859-2) in body.
SPAM: +(249/2) Match against spam domain list [futurequest.net]
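To give a feel for how quickly these add up, here is a small Python sketch that totals the scores from notes lines like the ones above. It assumes the values are simply summed and that an entry such as +(249/2) is plain division; how the SPAMMISHNESS string is actually evaluated is not shown in this post, so treat both as assumptions. The names SCORE and score_of are made up.

```python
import re
from fractions import Fraction

# Matches "SPAM: +50 ..." or "SPAM: +(249/2) ..." at the start of a note.
SCORE = re.compile(r"^SPAM: \+(?:(\d+)|\((\d+)/(\d+)\))\s")

def score_of(note):
    """Return the score on a notes line as a Fraction, or None if absent."""
    m = SCORE.match(note)
    if not m:
        return None
    if m.group(1):
        return Fraction(int(m.group(1)))
    # Assumption: "(a/b)" means a divided by b.
    return Fraction(int(m.group(2)), int(m.group(3)))

# The five indicators that appear on almost every message.
notes = [
    "SPAM: +50 message-id domain does not match sender domain",
    "SPAM: +50 message-id domain not in received chain",
    "SPAM: +150 No rDNS for host passing message to our MX",
    "SPAM: +175 IP 88.241.203.7 listed in dialup DNSBL",
    "SPAM: +125 relay hostname appears to be consumer dialup/broadband",
]

total = sum(score_of(n) for n in notes)
print(total)  # 550
```

That is 550 before the +75 from the username_wrote recipe is even counted.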
I certainly appreciate that the bogus Received: header uses nonstandard
formatting for the SMTP id tokens. I appreciate that the FQDN of my
mailhost differs from the mail domains which it processes mail for.
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.