[Asrg] Textbook example

Here's a pretty good textbook example of relatively common spammer tricks.
Let's consider it:

[quote]

Received: from c000.snv.cp.net 
(h026.c000.snv.cp.net[209.228.32.61](misconfigured sender)) by 
sccrmxc14.attbi.com (sccrmxc14) with SMTP id <20030624033452s1400or14be>; Tue, 
24 Jun 2003 03:34:52 +0000
Received: (cpmta 23882 invoked from network); 23 Jun 2003 20:34:51 -0700
Delivered-To: terabites(_dot_)com%gep2(_at_)terabites(_dot_)com
Received: (cpmta 23867 invoked from network); 23 Jun 2003 20:34:50 -0700
Received: from 200.78.44.3 (HELO louisiana.co.jp) by smtp.c000.snv.cp.net 
(209.228.32.61) with SMTP; 23 Jun 2003 20:34:50 -0700
X-Received: 24 Jun 2003 03:34:50 GMT
Reply-To: <seeds45(_at_)louisiana(_dot_)co(_dot_)jp>
Message-ID: 
<157378001402$505f787c(_dot_)seeds45(_at_)louisiana(_dot_)co(_dot_)jp>
From: Sabastian Taylor <seeds45(_at_)louisiana(_dot_)co(_dot_)jp>
To: <gep2(_at_)terabites(_dot_)com>
Subject: nvvlxo lucrative increase Ebay 5\xE9min\xE3r - Fre\xB7e _to Y(_at_)ou
Date: Mon, 23 Jun 2003 20:35:08 +1000
MIME-Version: 1.0
Content-Type: text/html; "charset=iso-8859-1"
Content-Transfer-Encoding: 7BIT
X-Priority: 3
X-MSMail-Priority: Normal

chanty MycXY42jUmagernJQbgbHxg%3D%3D unselfishly o-eVm

{a href="http://hgabriela(_at_)210(_dot_)15(_dot_)187(_dot_)49/dtomi/">{img 
src="http://hjones4(_at_)210(_dot_)15(_dot_)187(_dot_)49/dtomi/pr4.gif">{/a>

Y-6i testify E7Cn4ZEzJHbbHXGlKoTWUUzID plunder

[end quote]

Let's presume for the moment that seeds45(_at_)louisiana(_dot_)co(_dot_)jp is
in fact a legitimate (probably disposable) E-mail address.  It doesn't appear,
though, that the spammer obviously abused an open relay or anything of the
sort.

[Open SMTP mail server?  That's a possibility, I suppose.]

Clearly the spammer is trying to employ a variety of strategies to get past
filters (although why they'd seriously expect a GOOD result from bludgeoning
their way into the Inbox of someone who has set up mechanisms to try to keep
them out is a curiosity...).

They've included a lot of random text in the message.  It's hard to say from
a single copy whether the text differs from copy to copy in order to bypass
hash-based blockers, but I'd suspect so.
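To illustrate why per-copy random filler would defeat an exact-match hash
blocker, here is a minimal Python sketch.  The function name, the filler
strings, and the example.invalid URLs are all made up for illustration, and
the choice of SHA-256 over the whole body is an assumption; real hash-based
blockers may well use fuzzier digests.

```python
import hashlib

def body_digest(body: str) -> str:
    """Exact-match fingerprint of a message body (hypothetical blocker)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

# The same payload wrapped in two different sets of random filler words.
payload = ('<a href="http://example.invalid/x/">'
           '<img src="http://example.invalid/x/p.gif"></a>')
copy_a = "chanty unselfishly o-eVm\n" + payload + "\nY-6i testify plunder\n"
copy_b = "marble quietly k-Zq2\n" + payload + "\nR-3t harvest lantern\n"

# Identical payload, different filler: the digests do not match.
print(body_digest(copy_a) == body_digest(copy_b))
```

Since any change to the body changes the digest, a blocker keyed on the digest
of one copy would not recognize the other.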

One strategy I could use in my mail filtering system would be to trigger on
words containing tricked-up letters or odd embedded delimiters (using 5
instead of S, for example, or using odd accented letters in English words
that don't use them, or weird embedded delimiters whose sole purpose is to
confuse keyword scanners).  It probably isn't too hard to develop SPITBOL
patterns which will match such tricks, and probably with "tolerable"
efficiency (given today's fast processors...).
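As a rough illustration of that kind of matching, here is a sketch in Python
(not SPITBOL), using simple per-token tests rather than SPITBOL patterns.  The
function names and the set of "suspicious" delimiter characters are
assumptions of mine, and the Subject line is the one from the sample above
decoded as ISO-8859-1, with the archive's "(_at_)" restored to "@".

```python
import re

# Assumed set of delimiters that rarely occur between two letters of a word.
INNER_DELIM = re.compile("[A-Za-z][\u00b7@_*|^~][A-Za-z]")

def has_digit_letter_mix(token: str) -> bool:
    """True for tokens such as '5eminar' that mix digits and letters."""
    return (any(c.isdigit() for c in token)
            and any(c.isalpha() for c in token))

def has_accent_in_ascii_word(token: str) -> bool:
    """True when a token mixes ASCII letters with non-ASCII letters."""
    letters = [c for c in token if c.isalpha()]
    return (any(ord(c) > 127 for c in letters)
            and any(ord(c) < 128 for c in letters))

def has_inner_delimiter(token: str) -> bool:
    """True when a suspicious delimiter sits between two letters."""
    return bool(INNER_DELIM.search(token))

def flag_obfuscated_tokens(text: str) -> list:
    """Return the whitespace-separated tokens that trip any heuristic."""
    return [t for t in text.split()
            if has_digit_letter_mix(t)
            or has_accent_in_ascii_word(t)
            or has_inner_delimiter(t)]

SUBJECT = ("nvvlxo lucrative increase Ebay "
           "5\xe9min\xe3r - Fre\xb7e _to Y@ou")
print(flag_obfuscated_tokens(SUBJECT))
```

On this Subject line the sketch flags the "5eminar", "Fre.e" and "Y@ou"
look-alikes and leaves the ordinary words alone.  It is only a heuristic:
it would also flag things like "MP3", ordinary E-mail addresses, or
legitimately accented words in non-English mail, so it is better suited to
scoring than to outright rejection.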

Another filtering method might be based on the message having a high ratio of
non-word content (some other spams I've seen, though, seem to use sections of
old term papers or something, just to feed filters meaningless "academic
content" and get a false "okay" decision).
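A minimal Python sketch of such a ratio follows.  The function name is
hypothetical, and the tiny vocabulary set is just a stand-in for a real word
list, which a filter would have to supply; the sample text is the visible
filler from the message above.

```python
def non_word_ratio(text: str, vocabulary: set) -> float:
    """Fraction of whitespace-separated tokens not found in `vocabulary`."""
    tokens = text.split()
    if not tokens:
        return 0.0
    unknown = sum(
        1 for t in tokens
        if t.strip(".,;:!?\"'()").lower() not in vocabulary
    )
    return unknown / len(tokens)

# Stand-in vocabulary (assumption); a real filter would load a dictionary.
SAMPLE_VOCAB = {"chanty", "unselfishly", "testify", "plunder"}

BODY = ("chanty MycXY42jUmagernJQbgbHxg%3D%3D unselfishly o-eVm "
        "Y-6i testify E7Cn4ZEzJHbbHXGlKoTWUUzID plunder")

print(non_word_ratio(BODY, SAMPLE_VOCAB))
```

Half of the eight tokens here are not words at all.  As noted above, filler
lifted from real prose would score close to zero, so this test can only be
one signal among several.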

Another indicator is the use of non-domain-based URLs (a bare IP address in
place of a host name)... legitimate E-mails rarely need to use those.  Yet
another indicator is the use of the @ in the URL (the "user@host" form) in
hopes of tricking some filters... again, this isn't usually found in
legitimate links.
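Both URL indicators are easy to test mechanically.  Here is a small Python
sketch using only the standard library; the function name and flag labels
are my own, and the test URL is the one from the sample with the archive's
"(_at_)" and "(_dot_)" obfuscation undone.

```python
import ipaddress
from urllib.parse import urlsplit

def url_red_flags(url: str) -> list:
    """Return labels for the suspicious traits found in one URL."""
    parts = urlsplit(url)
    flags = []
    # "user@host" form: anything before an @ in the network location.
    if parts.username is not None:
        flags.append("userinfo")
    # Host given as a literal IP address instead of a domain name.
    host = parts.hostname or ""
    try:
        ipaddress.ip_address(host)
        flags.append("ip-literal-host")
    except ValueError:
        pass
    return flags

print(url_red_flags("http://hgabriela@210.15.187.49/dtomi/"))
print(url_red_flags("https://www1.ietf.org/mailman/listinfo/asrg"))
```

The first URL trips both tests; the second (the list's own link) trips
neither.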

I'd presume that SpamAssassin (which I presently don't use here) would
identify many of these "odd" constructs and end up tossing a message like
this one.  Perhaps someone else here has more actual direct experience with
it and can comment on that.

Meanwhile, I still think that the most notable (and common) trick being
employed to try to get around content scanners is putting the main 'content'
either in the GIF file (which so far I've not successfully managed to look
at) or else in the Web page linked to it.  While I suppose it would be
possible for the filter to actually go and retrieve the linked page(s) and
content-scan those (particularly if the message itself doesn't seem to have
much in the way of meaningful or revealing content of its own), this seems
like a lot of effort, and once you get into that whole area there are all
manner of other tricks possible, so things can escalate pretty quickly.
Probably well past the point of diminishing returns.

Anyhow, I think this message demonstrates pretty clearly why I favor the
simple approach of filtering HTML-burdened E-mail that comes from someone you
never knew and authorized before.  SPF-like approaches probably would NOT
catch this spam (chances are at least fair that the E-mail address is
accurate and not forged), and traditional content scanners aren't being given
a whole heck of a lot to work with here.  The permissions-based filter is a
pretty reasonable strategy, and it would drop out this E-mail and most
imaginable (present and future) variants of it.
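For concreteness, here is a minimal Python sketch of that permissions test,
using the standard email package.  The function names, the permissions set,
and the example.com/example.org addresses are all hypothetical, and the
sketch simply trusts the From header, which is an assumption a real filter
would need to think harder about.

```python
from email import message_from_string
from email.utils import parseaddr

def has_html_or_attachment(msg) -> bool:
    """True if any MIME part is HTML or is marked as an attachment."""
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            return True
        if part.get_content_disposition() == "attachment":
            return True
    return False

def should_filter(raw_message: str, html_permitted: set) -> bool:
    """Filter HTML/attachment mail unless the sender has permission."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    return has_html_or_attachment(msg) and sender not in html_permitted

def make_message(sender: str, content_type: str, body: str) -> str:
    """Build a tiny single-part test message (illustration only)."""
    return ("From: " + sender + "\n"
            "To: <me@example.org>\n"
            "MIME-Version: 1.0\n"
            "Content-Type: " + content_type + "\n"
            "\n" + body + "\n")

PERMITTED = {"friend@example.com"}
html_stranger = make_message("Stranger <stranger@example.com>",
                             "text/html; charset=iso-8859-1",
                             "<a href='http://example.invalid/'>x</a>")
plain_stranger = make_message("Stranger <stranger@example.com>",
                              "text/plain; charset=us-ascii", "hello")
html_friend = make_message("Friend <friend@example.com>",
                           "text/html; charset=iso-8859-1", "<b>hi</b>")

print(should_filter(html_stranger, PERMITTED))
print(should_filter(plain_stranger, PERMITTED))
print(should_filter(html_friend, PERMITTED))
```

Only the HTML message from the unknown sender is filtered.  Plain text from a
stranger still gets through to whatever content scanning follows, and HTML
from a permitted sender is left alone.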

Clueless folks who send HTML-burdened mails [because it's simply the default
for them] might be somewhat inconvenienced by this, should the
permissions-based approach become widespread.  But they're after all being
the thoughtless ones, not caring about wasting the recipient's inbox space,
so a little inconvenience in exchange for reducing spam volume might not be
an unreasonable tradeoff.

Now, it IS true I suppose that the spammer could have used a non-hyperlinked
URL and asked the recipient to copy-and-paste it, but they're still probably
going to need a lure (which then would have to be plain text to evade the
HTML-based permissions test).  That lure text would then be available to the
content scanner, whereas whatever's in the referenced GIF file (for all
intents and purposes) really is not.

Anyhow, I'd be interested in the thoughts of the others here regarding this 
particular example.

[And also, if any of the rest of y'all have received this spam on your own
systems, you might be interested to compare the copy I got with the copy you
received, to see what changes (if any) in my copy have resulted from the
incoming mail processing that I use here....]

And another perhaps interesting question regarding stuff like this.  When an
E-mail IS suspected of being spam (and strongly enough that it is picked out
of the delivery queue)... which is better?  To just t-can it?  Or should it
bounce back to the sender (with suitable ricochet protection)?  Should it
bounce with a human-sensible explanation about why it was filtered etc etc?

Is it better to tell spammers (effectively) that the E-mail address they mailed 
to is good, and what they need to do to get through your filters?  Is it better 
to just black-hole the message?  Is it better (as some spam filters do) to 
return a bogus "no such user" message in hopes that the spammers will purge the 
"no longer good" address from their databases?

Is this one of those things that the recipient should be able to specify?
Perhaps using the same permissions-list mechanism?  (Example:  if the mail is
from someone on your permissions list, but they weren't given permission to
send you mail in that format or with attachments, maybe then you should give
them the benefit of the doubt and tell them that you haven't (yet) given them
permission to send you stuff like that... whereas if it's from a complete
stranger, maybe instead one should just t-can it...?)
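One way to picture the example in that parenthesis is as a small decision
function.  The Python sketch below is only one possible policy; the names,
the shape of the permissions table (known sender mapped to whether rich mail
is allowed), and the choice to deliver all plain-text mail are assumptions
for illustration.

```python
from enum import Enum

class Disposition(Enum):
    DELIVER = "deliver"
    BOUNCE_WITH_EXPLANATION = "bounce-with-explanation"
    DISCARD = "discard"

def choose_disposition(sender: str, is_rich: bool,
                       permissions: dict) -> Disposition:
    """Pick what to do with a message.

    `is_rich` means HTML or attachments; `permissions` maps each known
    sender to True if they may send rich mail.
    """
    if not is_rich:
        # Plain text is left for the ordinary content scanner.
        return Disposition.DELIVER
    if sender not in permissions:
        # Complete stranger sending rich mail: just t-can it.
        return Disposition.DISCARD
    if permissions[sender]:
        return Disposition.DELIVER
    # Known sender without rich-mail permission: tell them why.
    return Disposition.BOUNCE_WITH_EXPLANATION

PERMISSIONS = {"friend@example.com": True, "colleague@example.com": False}

for who in ("friend@example.com", "colleague@example.com",
            "stranger@example.com"):
    print(who, choose_disposition(who, True, PERMISSIONS).value)
```

A recipient-selectable policy would amount to letting each user swap in a
different version of this function, or a different table.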

Is there a more clever solution called for, perhaps, based on "how legitimate" 
the message arguably looks like it might be?  And if so, what criteria might be 
employed?

There are a lot of possible design decisions here, and a vanishingly small
likelihood I suppose that everyone will ever concur on any one of them (which
is why, I suspect, there's little point in setting up a shooting gallery here
so individuals can take potshots at any specific individual implementation).
More likely, such differences will show up as service differentiation between
ISPs incorporating such permissions-based filtering schemes.

Lots of questions.  At least it shows why I *really* like the simple, direct
approach resulting from simply filtering out (unexpected) HTML-burdened
messages (or attachments) if they're not from someone you've granted
permission to send you stuff like that.  It's a SIMPLE approach, would take
out a LOT of spam REALLY fast, and would make it a lot easier to deal with
the remainder.

And it gives different ISPs (or perhaps their customers, if the ISPs pass it
on) the choice of what to do about the messages not getting through the
filter.

Gordon Peterson                  http://personal.terabites.com/
1977-2002  Twenty-fifth anniversary year of Local Area Networking!
Support the Anti-SPAM Amendment!  Join at http://www.cauce.org/
12/19/98: Partisan Republicans scornfully ignore the voters they "represent".
12/09/00: the date the Republican Party took down democracy in America.


