Re: [Asrg] Re: bounces, and anti-spam principles

Before trying to reply to this whole thing, you might want to scroll forward to Comment #6. You seem to be assuming various things about how
filters have to work that simply aren't the case.


We'll get there.... :-)

You _can_ handle false positives 100% without the recipient controlling the rules or even necessarily knowing that a false positive has occurred. In effect, reducing your false positive level to zero, without any
recipient involvement.
Our overall system is designed towards _zero_ false positives. In at least one way, we're theoretically (and near effectively) there. With
DNSBLs and other techniques.

Obviously, I'd have to reserve judgement until I've experienced your "zero false positives" idea. Let's just say that I've got a healthy degree of skepticism.

In fact, in most ways we're far more aggressive than other filtering schemes. We can afford to have the filters misfire on legitimate email.
Simply because we've paid as much attention to finding out about misfires and remediating them as we've done with filtering itself.

Certainly making the consequences of questioning a given E-mail "acceptable" from a user standpoint gets you a long way.

What happens after being blocked is just as much part of a well designed
filtering strategy as the filters are.


Agreed.

[comment #1]
In any case, I still contend that simplistic blocking by IP address or domain name is a very poor approach, and for a whole variety of reasons.
I will contend that there cannot be a content filter that can reliably separate spam from non spam.
It doesn't NEED to be 100.000% accurate.
Nor does any other form of filtering then.

Right. If the USER considers it okay (and that's to some degree a moving target!) then it's "okay".

The bulk of mail most people receive comes from people they are familiar with, and which fits certain patterns. A given sender (mailing list etc) will typically have a signature file, for instance. I know that Aunt Matilda is NOT going to send me an E-mail containing a JavaScript decryption routine, or an ActiveX enclosure. She also is not going to send me an executable attachment. If stuff like that arrives here, it is safe to presume it is NOT from her, no matter what the From: address
says (and even if it WAS sent from her computer).
This does not work in the large scale beyond a limited subset of users.


Why not?

Not everyone has that small a set of correspondents to cope with, and
the "new correspondent" issue remains a big problem.

The total set of correspondents doesn't have to be small. It just needs to have a relatively small number of NEW correspondents that NEED to use "advanced"/riskier features and therefore require whitelisting. Ideally, each such correspondent only requires ONE click (one time) for the user to agree to allowing them to use the more advanced features. That might be done following an initial negotiation E-mail where the sender introduces themselves and requests the ability to send more elaborate mails.

Introductory E-mails should NOT automatically presume the desire or willingness of the recipient to receive HTML-burdened E-mails.

I guess that depends on what you call "bulk", and how you propose to detect it. Again, whatever rule you put into effect (on a global-type basis) is going to be discovered by spammers and they will engineer their sending patterns to avoid violating it. That's why you need a really narrow and twisty 'gauntlet' they must negotiate, with DIFFERENT RULES for different recipients, where they don't know and basically cannot figure out what rules they would have to comply with to get a
message through to a particular person.
Having had almost 4 years of intimate operational experience with what is probably the most effective single anti-spam filtering method there's been (one specific DNSBL), I can assure you that it is both more reliable in terms of FPs than any content filter I've ever dealt with (we've always run hybrid content+source+other techniques), I can also assure you that its effectiveness is _not_ declining - in fact it's
getting better.

The main problem with content filtering is caused by ruses based on HTML and attachments. These techniques serve to obscure the content of the E-mail, and ALL BY THEMSELVES the presence of such content in E-mails (at least in E-mails from unfamiliar senders) can be a priori evidence of hostile intent, or spamming.

Is it perfect? No. Does adding other filtering techniques help? Yes. But to claim that it's useless/trivially defeatable/too many FPs/trivial
to do better turns out to simply not be the case.

Certainly there are always special cases/special situations where just about anything can be made to look good. That's like SPF, which is claimed to work well (but ONLY for the subset of the situations that it works for).

The trick is to stop accepting mail from that IP address only until it has cleaned up.
Again, when you have a LOT of users (and possibly MANY servers) behind a NAT router, denying mail from that IP address results in simply too much collateral damage. More to the point, it's a very blunt instrument for the job, and it's relatively simple to do very much better.
So far, there's no indication of the latter being true.


I do note you don't refute my objection on principle.  :-)

Show me a
methodology that (a) doesn't abuse innocent third parties and/or (b) requires personal twiddling simply not achievable in the large scale that's better than the CBL (or for that matter Zen), I want to hear
about it.

I don't consider that ANYBODY has the right to blindly haul off and presume that I'm willing to accept HTML-burdened E-mails from them. So if YOU believe that such a restriction is "abusing innocent third parties" then we already disagree.

I also don't agree that my approach requires an unacceptable level of "personal twiddling" although it certainly DOES involve some degree of recipient involvement, IF its capabilities are to be optimized. If the software is well-designed, I don't believe that it will put off most users, especially if they have fought spam for any length of time before.

Once the spam is gone there is no need to block the address unless it has proven to be a repeat offender without an effective process for
shutting the spammers down.
What about when the flow of spam is interleaved with all sorts of
good/important traffic as well?
You make a cost-benefit analysis and/or apply other techniques. Even
content.
Why the insistence on choosing only one? You don't have to.

Agreed. I never proposed my fine-grained whitelist all by itself. It works best in conjunction with a good content filter (and in fact greatly improves the efficacy of the content filter).

Effective spam filtering is best done with a hybrid solution. No one
technique is complete on its own.


Agreed.

There is a lot of spam which is obvious. That includes messages which contain links to known-spam-promoted Web sites (at least in the absence of contradicting factors, say being from a list discussing spam senders!)
SURBLs are blacklists too, and can be equally as blunt as a source IP. Or are you contending that the user should be explicitly entering them?

I don't really claim either way, but I WILL say that there are a lot fewer valid links to spam-promoted Web sites than there are randomly-generated counterfeit "from" domains and E-mail addresses.

With _individual_ spammers using 1000 or more domain names to advertise
the same thing, how effective do you think that can be?

Again, it's only one of a variety of criteria that can identify spam. It's clear to me, though, that just using From: and Subject: lines is by itself NOWHERE NEAR adequate, particularly if you want to receive E-mails from previously unfamiliar senders.

It also includes, for example, messages which are identical to messages that some number (dozens? hundreds?) of other recipients at the same ISP
have already reported as being "spam".
Have you done any work with checksumming/hashing spam, ala DCC or Razor?

Not in that context, although I've used them in other contexts...

Yes, they're useful. But these days most high volume spammers randomize content such that even highly developed de-hashbusting techniques don't work very reliably. Doesn't even work on graphical spam anymore.

That's a good reason to simply make graphical spam no longer viable, by defaulting to not allowing images (whether attached or embedded) from unfamiliar senders.

One would think that ISPs could
locate and perhaps recategorize identical messages (again, perhaps tempered by a specific recipient rule) which are still queued and have
not yet been delivered to their remaining customers.
Yes, it will yield some results, but the overall results are highly
disappointing.

A lot depends on how big the ISP in question is. Within a company of 15 users, that won't help much. For a big ISP like Google or Yahoo or Hotmail, flushing all remaining similar/identical unread E-mails which have been discovered to be spam (by user reporting, as Yahoo does) can quickly purge them from the Inboxes of other users who haven't yet downloaded or read them.
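
To make the "flush the still-queued copies" idea concrete, here is a minimal Python sketch. It assumes an exact-match fingerprint (SHA-256 over a whitespace- and case-normalized body) and a hypothetical report threshold; neither is named anywhere in this thread, and as noted above, randomized content defeats exact matching, so treat it as an illustration of the bookkeeping only.

```python
import hashlib
from collections import Counter

REPORT_THRESHOLD = 3  # hypothetical: user reports needed before purging


def fingerprint(body: str) -> str:
    """Hash a message body after collapsing case and whitespace."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def purge_reported(queue, reported_bodies, threshold=REPORT_THRESHOLD):
    """Drop queued (recipient, body) pairs whose body matches a message
    that at least `threshold` users have already reported as spam."""
    counts = Counter(fingerprint(body) for body in reported_bodies)
    spam = {fp for fp, n in counts.items() if n >= threshold}
    return [(rcpt, body) for rcpt, body in queue if fingerprint(body) not in spam]
```

A per-recipient rule (as suggested earlier in the thread) could be consulted before dropping a copy, so that a recipient who actually wants that message still gets it.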

But let me state again (and this is part of what made me respond, starting this sub-thread) that it is virtually NEVER a good idea to send a bounce message after-SMTP-time, because you can't be sure where to send it, and most likely you are just harassing another innocent
victim.
That's something I'll certainly echo.

Thanks... it is VERY frustrating to me to have Yahoo start blocking all my outgoing Groups mail every month or so just because SOME virus-infected mail with a bogus return address gets bounced back to them as a "hard bounce". :-((

Hard bounces should NEVER be sent based on content, especially when you can NOT be sure who to send the hard bounce to. And they should NEVER be sent back based on detection of a virus/worm which is KNOWN to counterfeit return addresses!!!

Being able to "slam the phone down" on miscreant IP blocks at the accept() or helo is much, much, less processing than going thru the entire SMTP interaction and whatever it takes to pass processing off
to an end-user.
It's true that it costs less, but it's also true that it blocks a lot of
innocent and legitimate mail that might be
originating from the same IP address (NAT router?).
This generally doesn't turn out to be a significant issue in practice.

Tell that to the big company that will (eventually) find ALL their outgoing E-mails blocked (possibly for weeks) because of one or more infected systems behind their NAT router. This kind of nonsense could literally put some kinds of companies out of business.

If you spend the time you _should_ be spending to research the DNSBLs
(say) you plan on using.

I would not use that, because I consider IP-based blocking to be ultimately a fatally flawed approach.


I don't think it much matters HOW it is done.

There could be
dozens, hundreds, or even thousands of innocent users affected.
There could be, if the DNSBL is built in such a way that it's susceptible to that. But it doesn't have to be. And the good ones
aren't in any meaningful way.

How can it NOT be, if a given IP address is sending 50% spam and 50% legitimate traffic, all interleaved?


Sorry, I don't agree.

Even 2% legitimate traffic is too much to block, if that 2% is critically important stuff.


[incentives to clean up infected machines]

...the problem is that after (first!) SMTP time, the (intermediary, or final) recipient doesn't really know who they ought to notify...! Notifying the wrong person, or someone who has no control over the situation, probably does more harm than good.
You don't need to notify out-of-band. Presuming that recipient systems use DNSBLs in an appropriate fashion (inline rejects with pointers to more information), legitimate senders find out that it's blocked and why. That's their notification.
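
For readers unfamiliar with the mechanics, here is a rough Python sketch of the two pieces just described: forming the DNS name that a DNSBL lookup queries for an IPv4 address (the octets reversed, followed by the list's zone), and composing an inline 5xx reply that points the sender at more information. The zone `dnsbl.example.org`, the URL, and the exact reply wording are placeholders, not the configuration of any system discussed here. A real check would resolve the name and treat a successful answer as "listed"; that network step is deliberately left out.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the name a DNSBL lookup would query for an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def inline_reject(ip: str, zone: str, info_url: str) -> str:
    """Compose a reply given during the SMTP session itself, so the sending
    server reports the failure and no separate bounce has to be mailed."""
    return f"550 5.7.1 Mail from {ip} refused, listed in {zone}; see {info_url}"


# Example with placeholder values:
# dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
# inline_reject("192.0.2.99", "dnsbl.example.org", "https://example.org/lookup")
```

Because the refusal happens before the message is accepted, the party that learns about it is the server that actually connected, which sidesteps the forged-return-address problem raised above.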

Again, I don't consider it acceptable to block good mail because someone ELSE using the same server or gateway is misbehaving.

With decent DNSBLs, that's sufficient to initiate resolution of the problem. It sure is a lot easier than tweaking Bayes or SpamAssassin (we specifically reject whitelisting
email addresses because of the forgery problem)

I consider a blanket "whitelist" as nowhere near adequate, because too many addresses are opened up. That's why whitelisting must be done on a far more fine-grained basis.

Also, there's a remark somewhere on the CBL web site that impressed me with its simplicity - something to the effect that "we fully expect the vast majority of infected/blocked users _never_ notice that they're listed, because they're using their provider's smart-hosts as they should". If an ISP wants to proactively scan for infected IPs to see
about getting them fixed, they can do that too.

You presume that spammers can't send out their spam using the legitimate user's "certifications" and "smart hosts" (and their "micropayment account" if pay-per-email were implemented). I don't believe that's an accurate assessment. (Even if they didn't today, they can change their strategy overnight to do so.)

[comment #4]
I'm skipping this one because it takes too long to comment on ;-) other
than:
 Executive summary...
- blocking email, because it meets some technical criteria, is easier
   on the technical side, but introduces legal problems
It may, perhaps depending on exactly what the technical criteria _is_ and the rationale for blocking it, but taking risks when the business
case/result justifies it, is what people do.

If you NEED to do that, sure. But I consider it a poor bet if there are better approaches which don't leave you exposed like that.

You can forestall a lot, for example, by simply saying "it's our policy to reject email from IPs that appear to be dynamic". That says nothing about the spammyness of an individual sender, and if mistaken in
"appearance" simply needs to be fixed.  Or not.

Obviously you can set your policy however you like, but when it gets too cavalier, expect angry customers to start bailing.

Again, there is NO good reason to do things that way. (IMHO at least).

- blocking email, because the customer said so, may be harder
   technically, but avoids legal problems
The ISP could equally establish an infrastructure where the customer
explicitly delegates filtering decisions to the ISP.

Sure, but don't be surprised when customers decide they don't like the ISP's decisions, and it wouldn't take much for an unhappy sender to team up with an unhappy recipient customer and sue the ISP.

And the protections in law for ISPs to be held harmless for mistakes in good-faith filtering go a long way to shoot down any attempt even where the customers haven't formally delegated. It ain't easy for a sender to
prove bad-faith.  An ISP hasn't been sued in ages.


I wouldn't want to be the first one to break that record.

[snip]

....knowing that all the classical ruses to avoid spam classification (text as image, embedded links, attachments, scripting, disguised HTML links, etc etc) are a priori denied them.... certainly
takes a major bite out of spammers.
It would if you could. How can you tell that an image is text?

It doesn't matter. You simply block (by default) ALL images (whether embedded or attached) coming from unfamiliar/first time senders. If they want to send images, they first negotiate that permission with the intended recipient. (And that permission, of course, can be revoked by the recipient if the privilege is abused).

Blocking spam with embedded links or attachments would probably put us
out of business.  More likely get me fired.

It will certainly put MANY spammers out of business, and put virtually ALL virus/worm authors out of business... at least as far as E-mail delivery goes.

I don't have a problem with a recipient ALLOWING a particular sender to send them that kind of stuff... as long as an introductory E-mail (in plain text) negotiates that permission with the recipient FIRST.

And, a given recipient might configure his setup so that he could look at E-mails that would have been blocked, to decide to allow them in the future... or he can just take a hard line and say "only if approved first!" But in any case they could be marked as suspect/blockable to warn the recipient, before they open them.

There is NEVER the NEED to send HTML/attachments/scripting to someone in an INTRODUCTORY E-mail. Not until you've established that they are willing and able to receive that kind of content from you.

And only allowing executable attachments, HTML, and "big" messages from known/trusted senders basically eliminates E-mail as a vector for
virus/worm propagation,

"known/trusted" senders by what measure? Explicit listing of them?

Sure. Most users will not EVER need to approve ANYBODY to send them executable attachments. ("listing" can be achieved by clicking an "allow this sender to send me content like this in the future" button.)

Well, I can tell you about lots of viral attachments from "known senders".

Absolutely, but MOST users don't need or want to accept EXECUTABLE attachments from ANYBODY (including "known" senders). That's part of why it is a FINE-GRAINED whitelist and not a crude yes-or-no whitelist (which I agree completely is NOT adequate). I might accept JPG attachments from Aunt Gertrude, but it's a safe bet that any EXECUTABLE attachment in an E-mail (claiming to be...) coming from her... well, I wouldn't touch THAT with a ten-foot pole.

I really don't understand why that's such a difficult concept for people to grasp. That (VERY!) simple rule can basically eliminate viruses and worms (again, at least by E-mail) virtually overnight, if adopted net-wide.

Given that, it seems to me to be professionally IRRESPONSIBLE to NOT implement it. That is the biggest, fastest thing that the IETF could do IMHO to cripple the recruitment of spambot zombie armies.

[comment #5]
That's certainly true, and one advantage of fine-grained recipient blocking is that it doesn't require any great worldwide consensus, nor
any re-engineering of Internet infrastructure.
Nor do DNSBLs ;-)

Again, I consider (IP-based, at least) blacklists to be unacceptably crude. You're locking the door after the horse has left the barn. A whitelist is better because a first-time spammer is still left facing a locked door.

What WOULD be helpful, though, would be a recognition by the IETF that:
a) such fine-grained per-sender by-recipient blocking (and hopefully augmented by subsequent content scanning) is an effective and desirable
approach to the problem, and
As I've been saying, that has yet to be established.

I believe in any case that it is (by FAR) the most promising approach. NOT by itself, but in conjunction with content analysis and possibly other techniques. But it is the primary key to make the other techniques more successful.

b) in the general case, blocking of all non-whitelisted E-mails containing HTML, scripting (probably covered under HTML... is it possible to put in scripting without HTML?), or attachments is a "best practice". (It is probably a good idea to suggest including a maximum message size, too, as a way of preventing "denial of service" attacks by sending big E-mails to someone which would be expected to fill their E-mail inbox to overflowing, blocking subsequent legitimate E-mails).
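
As a sketch of what checking for the conditions in (b) could look like, here is a short Python example built on the standard library `email` package. The function name `first_contact_violations`, the reason labels, and the 100,000-byte cap are all assumptions made for illustration; no specific size limit is proposed in this thread.

```python
from email import message_from_string
from email.policy import default

MAX_BYTES = 100_000  # hypothetical size cap for non-whitelisted senders


def first_contact_violations(raw: str, max_bytes: int = MAX_BYTES) -> list:
    """List the reasons a message from a non-whitelisted sender would be held."""
    reasons = []
    if len(raw.encode("utf-8", errors="replace")) > max_bytes:
        reasons.append("too-big")
    msg = message_from_string(raw, policy=default)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_content_type() == "text/html" and "html" not in reasons:
            reasons.append("html")
        is_attachment = (
            part.get_content_disposition() == "attachment"
            or part.get_filename() is not None
        )
        if is_attachment and "attachment" not in reasons:
            reasons.append("attachment")
    return reasons
```

An empty list means the message is plain text within the size limit. Scripting is not checked separately here, on the assumption (raised as a question above) that it arrives inside HTML.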
Obviously, you've not had to deal with the legitimate mail traffic of a large corporation. I mention those measures as comic relief at meetings, because it always produces hysterical laughter. It'd shut us
down.

Again, I don't consider anybody as having a legitimate need to send HTML or attachments in first-contact E-mails. If they want to send that stuff, then they should negotiate that permission with the intended recipients first. And they should be prepared for some recipients to say "No, thanks, I'd rather receive plain text."

If it's a question of offending those who "laugh hysterically" or shutting down spam, I'll opt for shutting down spam.


And I believe that others would agree.

So if you think that such a rule would shut your company down (given that a user COULD grant you the permission to send them such mails, after a preliminary meeting and negotiating that) then I propose that your company's long-term future is not bright...! I suggest you get them thinking about how your company could adapt to survive in such a more-responsible world.


[snip]

Again, the problem is the degree of collateral damage that IP-based
blocking produces.
You haven't demonstrated what that degree _is_. By long exposure, I can assure you that the degree is surprisingly low. If you do your homework
as you're supposed to.

If you block the mail from a big company's server(s) because one user within that company has an infected computer that's been recruited as a spambot, I don't think it's a great consolation that "gee, we hadn't seen that happen before." Especially if the block thus established causes serious financial harm to said company.

We're receiving 1-2 million emails per day. 80-90% of that is spam. We have less than 10 FPs per month against Spamhaus' sbl-xbl (which is doing about 85% of our filtering). We've arranged things so that the sender finds out if they're blocked, and there's a well established procedure by which they can notify us and we can override listings.
If an email is blocked, they contact us, and we forward their email and
fix the listing, is it really a FP?  No.
If every sender correctly interpreted the error message they got and
followed through, then there'd be zero FPs.

Some VERY large ISPs (such as Yahoo) interpret a "hard bounce" message due to an infected E-mail as "this E-mail address is bouncing" and they disable sending all mail to the user involved. Stupid? Yes, of course. But YOU try to convince them to change that policy. They are totally non-responsive.

Worse, they only provide the now-disabled user with the SINGLE LINE saying the 5xx bounce reason message... saying what virus was found... thus NO clue as to WHO bounced the mail back, or where else it might have come from. So, of course, this is totally useless.

I wish our content filters even remotely approached being _that_ good.

If you didn't have to contend with HTML (and the various ruses that allows to obscure content), and attachments, they would be FAR more effective. Not allowing HTML from unapproved senders would also negate nearly all phishing exploits, since you couldn't have misrepresented links to click on, or buttons that take you to (hidden) disreputable Web sites using (invisibly-)obscured URLs.

[comment #6]

[how users configure their whitelist rules]
The problem being that out of the 60,000 seats here, perhaps less than 10 of them are able to competently configure a set of rules like what you have.
That's a software implementation issue, not an inherent problem in the approach. I envision a button to click on that simply says "allow E-mails like this from the same sender in the future" and where the software will open the keyway JUST enough to allow that type of message if seen again from that sender. How that recognition is accomplished, whether by something crude like simple GREP-type scanning, or something brain-damaged like RegEx pattern matching, or something still more sophisticated like the pattern matching SNOBOL/SPITBOL offers, or even a different sort of statistical ranking/rating approach like content scanners use... will vary from one implementation to another. The final
products will probably use a combination of techniques.
I'm sorry, this simply isn't a human interface issue. No amount or technique of per-sender whitelisting comes remotely close to the accuracy of our production filters, entirely aside from the new
correspondent issue.

It's not a simple "per sender" whitelist. It is a fine-grained per-sender PERMISSIONS list, which enables some types of HTML tags and SOME types of attachments and SOME other characteristics, on a sender-by-sender basis.
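
To show how little data such a permissions list needs, here is a minimal Python sketch. It assumes features are labeled with MIME-type strings and that the default for any sender is plain text only; the class name `PermissionsList`, its methods, and the example address are invented for this illustration and do not describe any existing product.

```python
from dataclasses import dataclass, field

# Assumption: what any sender, familiar or not, may use without a grant.
DEFAULT_ALLOWED = frozenset({"text/plain"})


@dataclass
class PermissionsList:
    """Per-sender grants held by ONE recipient (hypothetical structure)."""
    grants: dict = field(default_factory=dict)

    def allow(self, sender: str, feature: str) -> None:
        """Record a one-click 'allow mail like this from this sender'."""
        self.grants.setdefault(sender.lower(), set()).add(feature)

    def revoke(self, sender: str, feature: str) -> None:
        """Withdraw a grant, for example after the privilege is abused."""
        self.grants.get(sender.lower(), set()).discard(feature)

    def blocked_features(self, sender: str, features: set) -> set:
        """Return the features of a message this sender may NOT use."""
        allowed = DEFAULT_ALLOWED | self.grants.get(sender.lower(), set())
        return set(features) - allowed


# Usage: JPEGs from one aunt are fine, executables from her are not.
perms = PermissionsList()
perms.allow("aunt.matilda@example.com", "image/jpeg")
print(perms.blocked_features("aunt.matilda@example.com",
                             {"text/plain", "application/x-msdownload"}))
```

The point of the structure is that a grant for one feature says nothing about any other feature, which is what separates it from a yes-or-no sender whitelist. A forged From: address still inherits only the narrow grants made to that address.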

I don't believe it is possible to come up with as high an accuracy on ANY "one-size-fits-all" rule, "production" or otherwise. There are some mails which I would accept from familiar senders, where the SAME mail (same content, same subject line) would be spam if it came from anybody else. You simply cannot possibly know who I would accept that mail from, and who I wouldn't, unless I tell you.

You could give our users common filtering software (the reader is already pretty much standardized) with every filtering knob known to man, and perhaps three of our users could approach the effectiveness of the production systems.

I'm not suggesting that users have to "roll their own" from ground zero. Agreed that 90% of users will have 90% of their senders operating on the defaults. And your "production systems" can of course be tweaked and upgraded as spammers adopt new techniques.

That doesn't in any way demonstrate the lack of value in what I am proposing.

I'm only including _me_ in that list because it's me who built the production systems... Over a million decision
items are being changed every day in our filters.
Our users simply don't know effective spam filtering techniques. They push the wrong button, twist the wrong knob, and they're blocking something business critical. Or trusting malicious content with forged
credentials.  Or simply trusting...

Agreed that it ought to default to "simple". But the defaults can still be HIGHLY effective.

There is VERY little excuse for ANYBODY needing traditional antivirus scanners for incoming E-mail. That kind of garbage ought to be, and can easily enough be, squashed outright by responsible ISPs.

My favourite incident was the user who repeatedly insisted that he needed to receive the "important information from the FBI" that was sitting in his quarantine, and the quarantine forwarder refused to
forward to his mailbox.
Sorry, I said, but as much as you may want to see it, forwarding the
virus is a really bad idea ;-)

Obviously most phishing exploits work by "social engineering" to convince a user to let them in, or to provide them information they shouldn't provide. And some users will always fall for the "Aggie Virus" approaches (like users who forward hoaxes).

I think we can agree, however, that by default delivering E-mails to clueless recipients that contain executable attachments is a Really Bad Idea too. The trick is to decide how you can implement that restriction, while still allowing them to be sent in the 0.01% of the cases where the recipient expects and trusts and WANTS them from a given sender. I believe that my fine-grained whitelist answers that need, and does it (VERY) well.

Again, I consider IP-based blocking to be inherently flawed, to the
point where I consider it a dead-end.
It's a remarkably vigorous dead end ;-)

If the only tool you have is a hammer, you'll try to make every problem look like a nail.


Popular does NOT mean that it's good.

[snip]

Yeah, I admit that I usually at least cast a cursory eyeballing of the Yahoo mail "spam" folder too, rather than just emptying it. Occasionally I -do- find a non-spam message there. (Although that happens seldom, as I almost never give that E-mail address to anybody... It's almost useful as a "personal honeypot" to see what's being spammed out, before going to my more usual E-mail accounts and possibly wondering if that curious E-mail just MIGHT be legitimate).
"Almost useful" is the key. When you have users whose spam load ranges from one or 2 per month, to 4000+ per day, you can see that junk folders have only limited usefulness, and not to everyone.

Agreed. That's why users need to be able to choose what level of acceptance of the automated filters is most acceptable to them, and what level of time they are willing to devote to "questionable" mail.

Nobody can find
legitimate email in a 4000 spams/day feed, no matter how the filters are implemented.

True enough, but if a person is willing to simply agree that they don't want to see HTML or attachments from unfamiliar senders, that instantly blocks a LOT of the garbage. Simple, and HIGHLY effective.

AND it makes the remaining content-based filtering FAR more effective, by blocking stuff like 'text as image' (whether attached or embedded as a link... or even generated by a decrypting script).

And again, they can relieve that restriction on a sender-by-sender basis, if they need and want to, based on previous negotiation with such senders.

Perhaps part of your problem is that you're not seeing the big picture of how you can _use_ DNSBLs or any other filtering technique.

I've been involved with BBSes and dealing with spam since the pre-Internet days, but please feel free to educate me if you think there's something worthwhile you can teach me. :-) I'm not too old a dog to learn new tricks.

Your remarks seem to imply that DNSBLs necessitate no notification anywhere, the email just disappears. Or that filtering in general is
that way.

There are clearly all kinds of ways to deal with "questionable" mail. That's something that is going to depend on how the software is implemented, and what options the recipient selects. There is a lot of room in this area for programmer creativity.

On the contrary, ours have never done that. Indeed, without anybody looking at quarantines, without anybody personally twiddling filters, that 4000 spams/day user _does_ get the email that was accidentally blocked. Simply because we do inline rejects with instructions on what
to do, and problems get fixed fast and without harm.

Some antispam filtering obviously works better than others.

IMHO, if you don't have a fine-grained permissions list on a per-sender basis, yours COULD work BETTER than it does today.


[snip]

The important thing is that the RECIPIENT
controls that, so they can decide the rule that determines what gets blocked and what gets through. That way they don't have to wonder what
SHOULD have been delivered to them and wasn't.
But what if you arrange things so that the recipient doesn't have to
control the rules, and still doesn't have to wonder?

I think that a per-sender rule can ALWAYS make things better and more accurate. That doesn't negate the value of good pre-sets and defaults.

But again, if by default you let executable attachments from unfamiliar senders get delivered (or if you block them for ALL senders) then your software COULD do better.

Our false positives are handled usually without the recipient even
knowing that something got blocked.


OK, so how do you handle them, specifically?

Spammers have gotten good at throwing enough random junk into E-mails to
confuse Bayesian filters.
And sender whitelisting and...

Again, don't confuse a simple sender-whitelist with a fine-grained per-sender PERMISSIONS list. The difference is absolutely night and day.

Even if I allow Aunt Matilda to send me JPGs of her poodle, I don't care how many worms her infected computer sends to me with HER From: address and via HER mail server. My approach will let me see ALL of her real mail, and rapidly and accurately discard all of the worms with executable attachments that her system sends me.

It is EASY to do discrimination like that. And if I don't have ANYBODY set to allow them to send me executable attachments, or HTML to use links or scripting for that purpose, then it's hard to see how such a rule can be "confused" or bypassed by a spammer. It's like trying to argue with a vending machine... you don't know until you've put your money in that it's not going to deliver your purchase, and it's absolutely pointless to try to argue with it after the fact. :-)


[snip]

....giving the recipient the ability to at least not see the SAME kind of stuff over and over again, if they choose to use those features, demonstrates the ISP's trying to give the user the tools to
reduce the frustration.
If you can _detect_ the "SAME kind of stuff" over and over again.

Sometimes that's very simple. For example, even the very limited filtering the Webmail I'm presently using offers allows me the ability to tell it that I don't want to see anything more from "Fifth Third Bank" (in either the From or Subject fields), where I am not a customer. Anything from them is clearly spam.

It is frustrating that I can't use a SNOBOL/SPITBOL-like pattern to describe spam rules, but I readily admit that most users would rather not do anything that advanced. (That would CLEARLY however be my 'programming language of choice' to write such a filtering tool, though). Even something relatively brain-damaged like RegEx pattern matching would be an improvement, although still frustrating.

Even the best content techniques aren't very good at that anymore.

Not automatically, no. But if there is something the user considers adequate evidence of spam (say, a subject which contains both the words Penis and Enlarge) then it's satisfying to be able to at least block THAT case once and for all. Lots of cases? Sure, but every new rule at least saves you from seeing THAT one again.
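
A user-defined rule of the "Subject contains ALL of these words" kind is about as small as a filter can get. The Python sketch below is one possible reading of it, assuming whole-word, case-insensitive matching; the function name `subject_has_all` is made up for the example.

```python
import re


def subject_has_all(subject: str, words) -> bool:
    """True when every word in `words` appears as a whole word in the
    subject, ignoring case and punctuation."""
    tokens = set(re.findall(r"[a-z0-9]+", subject.lower()))
    return all(word.lower() in tokens for word in words)


# Usage: one rule per annoyance, applied to each incoming Subject line.
rules = [["fifth", "third", "bank"]]
subject = "Fifth Third Bank: account notice"
print(any(subject_has_all(subject, rule) for rule in rules))
```

Whole-word matching is a deliberate simplification: it will not catch inflected or deliberately misspelled variants, which is where the richer pattern languages mentioned above would come in.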

Longer-term, more sophisticated rules would work better. But that's why one needs per-sender rules which include not just From and Subject but also content within the E-mail message body too.

Or for another example, somehow an E-mail address of "errorstogep(_dot_)(_dot_)(_dot_)(_at_)domain" got on a spammer E-mail targets list. Any E-mail which contains that anywhere in the header is spam, even if it also contains my "good" address elsewhere.

Correctly chosen and utilized DNSBLs do a vastly better job.

They're certainly better than nothing. But again, I don't believe that it's possible to do GOOD blocking, and minimize collateral damage, without per-sender fine-grained whitelisting which also takes into account the body content of the individual E-mail.

And as a recipient, it's very frustrating to see dozens or hundreds of similar E-mails continue to slip through an ISP's filters, especially when I can readily define a rule which IMHO adequately identifies them. As a user, I want that ability, and to not continue to subsequently see those E-mails.


Gordon Peterson
http://personal.terabites.com

1977-2007 Thirty year anniversary of local area networking

