Re: [Asrg] Re: Asrg Digest, DNSBL BCP v.2.0

I think you can do FAR BETTER from a content standpoint (content
analysis, such as SpamAssassin, following "a priori" blocking of mail
from unknown/untrusted senders containing HTML or attachments) than
you can using any kind of IP-based blacklisting or other "reputation"
scheme.

SpamAssassin (and other content filters) don't actually work the way you think 
they do, on many levels.

The major measurable component of spam is whether or not the sender
has permission to contact the recipient.

I disagree. I have no objection at all to being contacted by someone I've never met before. I hand out business cards at trade shows and elsewhere. I have my E-mail address on my (well-indexed) personal Web site. Many companies put their E-mail address on Yellow Pages ads and other public places. The fact that I've not previously authorized contact isn't the problem. The problem is the delivery of unwanted, highly repetitive and annoying, scams and garbage.

Content filters in particular have no view into consent; no way to
measure consent.

In the absence of a fine-grained whitelist (on a per-sender basis), I agree that they don't have any way to measure consent. But adding that component changes that situation quite dramatically. Not only does the system then know WHO has established "consent", but also WHAT has been "consented" to.

The trick then is deciding which of the NEW, first-time contacts is likely to be unwanted. Certainly, there are various clues... including the presence of content commonly used to evade filtering (decryption scripting, obscured URLs, URL redirection, etc.).

There is no HTML code or X-Header that reliably provides proof of
opt-in.

If there were, and if it could be relatively easily spoofed, it would be less than terribly useful.

I think the solution involves (among other components) a tacit understanding at both ends of the communication between what the sender is sending, and what the recipient expects them to send. The bar should be higher for previously unknown, therefore unestablished or untrusted, senders.
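
As a rough illustration of a "higher bar" for first-time senders, here is a minimal sketch. Everything in it is an assumption made for the example: the names `Message` and `RecipientPolicy`, the score field, and the threshold values are hypothetical and do not come from SpamAssassin or any other real filter.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    has_html: bool = False
    has_attachments: bool = False
    # Hypothetical spam score produced by some content filter (higher = spammier).
    content_score: float = 0.0


@dataclass
class RecipientPolicy:
    # Per-recipient set of senders already accepted (hypothetical whitelist store).
    known_senders: set = field(default_factory=set)
    # Illustrative thresholds only; a real deployment would tune these.
    known_threshold: float = 8.0    # lenient bar for familiar senders
    unknown_threshold: float = 3.0  # stricter bar for first-time senders

    def accept(self, msg: Message) -> bool:
        """Return True if the message should reach the inbox."""
        if msg.sender.lower() in self.known_senders:
            return msg.content_score < self.known_threshold
        # First-time sender: refuse HTML or attachments outright,
        # then apply the stricter score threshold.
        if msg.has_html or msg.has_attachments:
            return False
        return msg.content_score < self.unknown_threshold
```

With `known_senders={"friend@example.com"}`, an HTML message from that address scoring 5.0 is accepted, while the same message from an unknown address is refused before its score is even considered.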

They do some good things based on modeling of what looks like spam;
but it's also true that things that look like spam are not always
spam.

That's true. And that's where the recognition of a familiar (to the recipient) sender enters into things.

False positive issues you rant about occur just as often with content filters. 
Some would claim, even more so!

I am likely to be far less upset about getting questionable mail if there is at least SOME arguable reason why the filter ought to have delivered it. Users ought to be able to tweak their filters so that they can change the rules whenever they desire, especially for particular cases that occur with some frequency for them.

With a blacklisting, I get a bounce back and can find somebody to
argue with. With the common method of implementing a content filter,
my mail is quietly eaten and I get no information back regarding the
failure to deliver the mail to end recipient. This is worse than IP
blacklisting; less transparent; less obvious; less opportunity for
feedback and investigative recourse.

The big problem with blacklisting bouncebacks is that in the general case, you cannot be sure WHO to send the bounceback TO. Once spam has gone through one or more levels of forwarding, the only way to go further back is via the Received: headers, but those are commonly counterfeit. Sending bouncebacks multiplies the wasted bandwidth due to spam.

Worse, "intentional bounceback" can be used by spammers as one way to get their spam delivered to a third party... they send mail in a way that they are confident will be bounced back, but arrange things so that the bounceback will go to the actually intended recipient... but this time, the (bounceback) message is originating from a not-blacklisted MTA.

Ultimately, I believe that the best way to deal with such spam is to at least OFFER recipients a chance to review blocked messages (and hopefully via rules that they can use to eliminate the necessity of their reviewing repetitively familiar spam), or the choice of accepting the system's determination and just junking it.
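
The choice described above (review blocked mail, auto-discard the repetitively familiar kind, or simply accept the filter's verdict) could be sketched roughly as follows. This is only an assumption-laden illustration: `ReviewPreferences`, `route_blocked`, and the subject-substring rules are hypothetical names and mechanisms, not features of any particular mail system.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewPreferences:
    # If False, the recipient accepts the filter's verdict and blocked mail is junked.
    review_blocked: bool = True
    # Hypothetical user rules: subject substrings the recipient never wants to review.
    auto_discard_subjects: list = field(default_factory=list)


def route_blocked(subject: str, prefs: ReviewPreferences) -> str:
    """Decide what happens to a message the filter has already blocked."""
    if not prefs.review_blocked:
        return "discard"
    lowered = subject.lower()
    # Familiar, repetitive spam matches a user rule and skips the review folder.
    if any(pattern.lower() in lowered for pattern in prefs.auto_discard_subjects):
        return "discard"
    return "review"
```

A recipient who lists "cheap meds" as an auto-discard rule would never see those messages again, while other blocked mail would still land in the review folder.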

But again, I'm far more willing to accept mail from someone if (1) I recognize the name of the sender, and (2) the mail "looks like" the sort of mail I would expect to receive from that sender.

The fact that you think they're better is likely based on an incomplete view on your part.


I doubt it, but I'm certainly willing to learn.

You actually probably have no idea how much of your mail has ever been
redirected to a bulk or trash folder by a content filter.

Actually, I tend to monitor that rather closely, in part because I use that knowledge to refine my ruleset.

And of course, not to mention that SpamAssassin, which you hold up as
the better model,

I consider it a 'respectable' example of the genre. I generally make it a point to include "like" or "-type" in references to that product.

has lovingly crafted hooks into it to allow direct support of IP-based
blacklist and other IP-based reputation mechanisms.

Hopefully they use that as an INPUT into the rating process; I don't have a problem with that, as long as mail coming from such "blacklisted" IP addresses is not BLINDLY trashed regardless of any other considerations.
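
To make the distinction concrete, here is a minimal sketch of a blacklist hit treated as one weighted input rather than an automatic verdict. The function names, the weight of 2.5, and the threshold of 5.0 are assumptions chosen for illustration; they are not SpamAssassin's actual rules or scores.

```python
def spam_score(content_score: float, dnsbl_listed: bool,
               dnsbl_weight: float = 2.5) -> float:
    """Combine a content-based score with a blacklist hit.

    The blacklist listing only adds a (hypothetical) weight to the total;
    it never decides the outcome on its own.
    """
    return content_score + (dnsbl_weight if dnsbl_listed else 0.0)


def classify(content_score: float, dnsbl_listed: bool,
             threshold: float = 5.0) -> str:
    """Label a message 'spam' only when the combined score reaches the threshold."""
    if spam_score(content_score, dnsbl_listed) >= threshold:
        return "spam"
    return "ham"
```

Under these assumed numbers, an otherwise clean message (content score 1.0) from a listed IP address is still delivered, while a borderline one (content score 3.0) is tipped over the threshold by the listing.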

Note to rest of world: I'm not anti-SpamAssassin. I've run it myself
before and likely will again. I'm just pointing out that like just
about every other kind of spam filtering or blocking mechanism, a
content filter is imperfect.

Certainly they have limitations, including some which are so severe as to be essentially crippling. HTML, embedded images, attachments, and the like make it nearly impossible for content-based spam filters to do a good and effective job. Even if (IF!), for example, a content filter had OCR abilities to try to analyze text-as-image... a remotely linked image could be swapped (say) an hour after the E-mail was sent, so that the image the recipient actually sees is not the one the analysis had passed.

It's a bit mind-blowing to see content filtering held up as this
panacea to address the ills of IP-based blocking, since they're both
approximate models of what somebody thinks is spam,

The only opinion that MATTERS is that of the recipient... which is why they should be able to control the ruleset and the sender-by-sender whitelist, as well as what to do with spam (e.g. putting it into a spam folder that they can examine as they wish to confirm the accuracy of the filtering).

...and have flaws inherent to both technology and policy limitations.

Again, what the USERS want is the ability to have the mail that makes it into their inboxes bear some approximation to the mail they expect and want to see. And only those users are able to make that judgement call, in the end analysis. What we need is an effective, practical tool (or toolset) to allow them to express that set of criteria.

It's clearly not enough to look just at the e-mail headers; but within that limitation (for example) I was getting relatively useful filtering using the web-based ruleset offered by my domain provider, until I ran into their limit of 200 rules...!

Regards,

Al Iverson

Gordon Peterson
http://personal.terabites.com

1977-2007 Thirty year anniversary of local area networking

