ietf-asrg

Re: [Asrg] Re: bounces, and anti-spam principles

2007-01-26 01:35:44
Before trying to reply to this whole thing, you might want to scroll forward to Comment #6. You seem to be assuming various things about how
filters have to work that simply aren't the case.

We'll get there.... :-)

You _can_ handle false positives 100% without the recipient controlling the rules or even necessarily knowing that a false positive has occurred. In effect, reducing your false positive level to zero, without any
recipient involvement.

Our overall system is designed towards _zero_ false positives. In at least one way, we're theoretically (and near effectively) there. With
DNSBLs and other techniques.

Obviously, I'd have to reserve judgement until I've experienced your "zero false positives" idea. Let's just say that I've got a healthy degree of skepticism.

In fact, in most ways we're far more aggressive than other filtering schemes. We can afford to have the filters misfire on legitimate email.

Simply because we've paid as much attention to finding out about misfires and remediating them as we've done with filtering itself.

Certainly making the consequences of questioning a given E-mail "acceptable" from a user standpoint gets you a long way.

What happens after being blocked is just as much part of a well designed
filtering strategy as the filters are.

Agreed.

[comment #1]

In any case, I still contend that simplistic blocking by IP address or domain name is a very poor approach, and for a whole variety of reasons.

I will contend that there cannot be a content filter that can reliably separate spam from non spam.

It doesn't NEED to be 100.000% accurate.

Nor does any other form of filtering then.

Right. If the USER considers it okay (and that's to some degree a moving target!) then it's "okay".

The bulk of mail most people receive comes from people they are familiar with, and which fits certain patterns. A given sender (mailing list etc) will typically have a signature file, for instance. I know that Aunt Matilda is NOT going to send me an E-mail containing a JavaScript decryption routine, or an ActiveX enclosure. She also is not going to send me an executable attachment. If stuff like that arrives here, it is safe to presume it is NOT from her, no matter what the From: address
says (and even if it WAS sent from her computer).

This does not work in the large scale beyond a limited subset of users.

Why not?

Not everyone has that small a set of correspondents to cope with, and
the "new correspondent" issue remains a big problem.

The total set of correspondents doesn't have to be small. It just needs to have a relatively small number of NEW correspondents that NEED to use "advanced"/riskier features and therefore require whitelisting. Ideally, each such correspondent only requires ONE click (one time) for the user to agree to allowing them to use the more advanced features. That might be done following an initial negotiation E-mail where the sender introduces themself and requests the ability to send more elaborate mails.

Introductory E-mails should NOT automatically presume the desire or willingness of the recipient to receive HTML-burdened E-mails.

I guess that depends on what you call "bulk", and how you propose to detect it. Again, whatever rule you put into effect (on a global-type basis) is going to be discovered by spammers and they will engineer their sending patterns to avoid violating it. That's why you need a really narrow and twisty 'gauntlet' they must negotiate, with DIFFERENT RULES for different recipients, where they don't know and basically can not figure out what rules they would have to comply with to get a
message through to a particular person.

Having had almost 4 years of intimate operational experience with what is probably the most effective single anti-spam filtering method there's been (one specific DNSBL), I can assure you that it is more reliable in terms of FPs than any content filter I've ever dealt with (we've always run hybrid content+source+other techniques). I can also assure you that its effectiveness is _not_ declining. In fact it's
getting better.

The main problem with content filtering is caused by ruses based on HTML and attachments. These techniques serve to obscure the content of the E-mail, and ALL BY THEMSELVES the presence of such content in E-mails (at least in E-mails from unfamiliar senders) can be a priori evidence of hostile intent, or spamming.

Is it perfect? No. Does adding other filtering techniques help? Yes. But to claim that it's useless/trivially defeatable/too many FPs/trivial
to do better turns out to simply not be the case.

Certainly there are always special cases/special situations where just about anything can be made to look good. That's like SPF, which is claimed to work well (but ONLY for the subset of the situations that it works for).

The trick is to stop accepting mail from that IP address only until it has cleaned up.

Again, when you have a LOT of users (and possibly MANY servers) behind a NAT router, denying mail from that IP address results in simply too much collateral damage. More to the point, it's a very blunt instrument for the job, and it's relatively simple to do very much better.

So far, there's no indication of the latter being true.

I do note you don't refute my objection on principle.  :-)

If there's a methodology that (a) doesn't abuse innocent third parties, (b) doesn't require personal twiddling simply not achievable in the large scale, and is better than the CBL (or for that matter Zen), I want to hear
about it.

I don't consider that ANYBODY has the right to blindly haul off and presume that I'm willing to accept HTML-burdened E-mails from them. So if YOU believe that such a restriction is "abusing innocent third parties" then we already disagree.

I also don't agree that my approach requires an unacceptable level of "personal twiddling" although it certainly DOES involve some degree of recipient involvement, IF its capabilities are to be optimized. If the software is well-designed, I don't believe that it will put off most users, especially if they have fought spam for any length of time before.

Once the spam is gone there is no need to block the address unless it has proven to be a repeat offender without an effective process for
shutting the spammers down.

What about when the flow of spam is interleaved with all sorts of
good/important traffic as well?

You make a cost-benefit analysis and/or apply other techniques. Even
content.

Why the insistence on choosing only one? You don't have to.

Agreed. I never proposed my fine-grained whitelist all by itself. It works best in conjunction with a good content filter (and in fact greatly improves the efficacy of the content filter).

Effective spam filtering is best done with a hybrid solution. No one
technique is complete on its own.

Agreed.

There is a lot of spam which is obvious. That includes messages which contain links to known-spam-promoted Web sites (at least in the absence of contradicting factors, say being from a list discussing spam senders!)

SURBLs are blacklists too, and can be equally as blunt as a source IP. Or are you contending that the user should be explicitly entering them?

I don't really claim either way, but I WILL say that there are a lot fewer valid links to spam-promoted Web sites than there are randomly-generated counterfeit "from" domains and E-mail addresses.

With _individual_ spammers using 1000 or more domain names to advertise
the same thing, how effective do you think that can be?

Again, it's only one of a variety of criteria that can identify spam. It's clear to me, though, that just using the From: and Subject: lines is by itself NOWHERE NEAR adequate, particularly if you want to receive E-mails from previously unfamiliar senders.

It also includes, for example, messages which are identical to messages that some number (dozens? hundreds?) of other recipients at the same ISP
have already reported as being "spam".

Have you done any work with checksumming/hashing spam, à la DCC or Razor?

Not in that context, although I've used them in other contexts...

Yes, they're useful. But these days most high volume spammers randomize content such that even highly developed de-hashbusting techniques don't work very reliably. Doesn't even work on graphical spam anymore.
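
To make the hashbusting point concrete, here is a minimal sketch of checksum matching. It is NOT the actual DCC or Razor algorithm; the function `normalized_digest` and its normalization rules are assumptions made up purely for illustration.

```python
import hashlib
import re


def normalized_digest(body: str) -> str:
    """Checksum a message body after crude normalization (hypothetical rules)."""
    text = body.lower()
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = "Buy  cheap pills NOW\n"
reformatted = "buy cheap pills now"
hashbusted = "Buy cheap pills NOW xk4q9z"  # random token appended per recipient

# Whitespace and case changes are absorbed by the normalization...
same = normalized_digest(original) == normalized_digest(reformatted)
# ...but a single random token is enough to produce an unrelated checksum.
busted = normalized_digest(original) != normalized_digest(hashbusted)
```

Real systems use fuzzier checksums than this, but the weakness described above is the same in kind: the sender controls the input, so enough randomization moves the message out of any fixed checksum bucket.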

That's a good reason to simply make graphical spam no longer viable, by defaulting to not allowing images (whether attached or embedded) from unfamiliar senders.

One would think that ISPs could
locate and perhaps recategorize identical messages (again, perhaps tempered by a specific recipient rule) which are still queued and have
not yet been delivered to their remaining customers.

Yes, it will yield some results, but the overall results are highly
disappointing.

A lot depends on how big the ISP in question is. Within a company of 15 users, that won't help much. For a big ISP like Google or Yahoo or Hotmail, flushing all remaining similar/identical unread E-mails which have been discovered to be spam (by user reporting, as Yahoo does) can quickly purge them from the Inboxes of other users who haven't yet downloaded or read them.
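
A rough sketch of that purge step, under stated assumptions: the class `PendingQueue`, the `content_key` (some checksum or similarity key computed elsewhere) and the threshold of 3 reports are all hypothetical, and per-recipient exemptions are not modeled.

```python
from collections import defaultdict

REPORT_THRESHOLD = 3  # assumed value; a real ISP would tune this


class PendingQueue:
    """Undelivered or unread messages, grouped by a content key (hypothetical design)."""

    def __init__(self):
        self.pending = defaultdict(set)  # content key -> recipients still holding a copy
        self.reports = defaultdict(set)  # content key -> users who reported it as spam

    def enqueue(self, content_key, recipient):
        self.pending[content_key].add(recipient)

    def report_spam(self, content_key, reporter):
        """Record a report; return the recipients whose copies get purged."""
        self.reports[content_key].add(reporter)  # each reporter counts once
        if len(self.reports[content_key]) < REPORT_THRESHOLD:
            return set()
        return self.pending.pop(content_key, set())


queue = PendingQueue()
for user in ("dave", "erin"):
    queue.enqueue("msg-123", user)
for reporter in ("alice", "bob", "carol"):
    purged = queue.report_spam("msg-123", reporter)
```

After the third distinct report, `purged` holds the recipients whose copies would be removed. With 15 users the threshold is rarely reached in time, which is the scaling point made above.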

But let me state again (and this is part of what made me respond, starting this sub-thread) that it is virtually NEVER a good idea to send a bounce message after-SMTP-time, because you can't be sure where to send it, and most likely you are just harassing another innocent
victim.

That's something I'll certainly echo.

Thanks... it is VERY frustrating to me to have Yahoo start blocking all my outgoing Groups mail every month or so just because SOME virus-infected mail with a bogus return address gets bounced back to them as a "hard bounce". :-((

Hard bounces should NEVER be sent based on content, especially when you can NOT be sure who to send the hard bounce to. And they should NEVER be sent back based on detection of a virus/worm which is KNOWN to counterfeit return addresses!!!

Being able to "slam the phone down" on miscreant IP blocks at the accept() or helo is much, much, less processing than going thru the entire SMTP interaction and whatever it takes to pass processing off
to an end-user.

It's true that it costs less, but it's also true that it blocks a lot of
innocent and legitimate mail that might be
originating from the same IP address (NAT router?).

This generally doesn't turn out to be a significant issue in practise.

Tell that to the big company that will (eventually) find ALL their outgoing E-mails blocked (possibly for weeks) because of one or more infected systems behind their NAT router. This kind of nonsense could literally put some kinds of companies out of business.

If you spend the time you _should_ be spending to research the DNSBLs
(say) you plan on using.

I would not use that, because I consider IP-based blocking to be ultimately a fatally flawed approach.

I don't think it much matters HOW it is done.

There could be
dozens, hundreds, or even thousands of innocent users affected.

There could be, if the DNSBL is built in such a way that it's susceptible to that. But it doesn't have to be. And the good ones
aren't in any meaningful way.

How can it NOT be, if a given IP address is sending 50% spam and 50% legitimate traffic, all interleaved?

Sorry, I don't agree.

Even 2% legitimate traffic is too much to block, if that 2% is critically important stuff.

[incentives to clean up infected machines]

...the problem is that after (first!) SMTP time, the (intermediary, or final) recipient doesn't really know who they ought to notify...! Notifying the wrong person, or someone who has no control over the situation, probably does more harm than good.

You don't need to notify out-of-band. Presuming that recipient systems use DNSBLs in an appropriate fashion (inline rejects with pointers to more information), legitimate senders find out that it's blocked and why. That's their notification.
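
For readers unfamiliar with the mechanics, here is a minimal sketch of an inline DNSBL reject. The lookup convention (reversed IPv4 octets prepended to the list's zone, with any A record meaning "listed") is the standard one; the zone `dnsbl.example.org`, the postmaster URL and the reply wording are placeholders, not anything from a real deployment.

```python
import ipaddress
import socket

ZONE = "dnsbl.example.org"  # placeholder zone, not a real list


def dnsbl_query_name(ip: str, zone: str = ZONE) -> str:
    """Build the DNSBL lookup name for an IPv4 address: reversed octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, resolve=socket.gethostbyname) -> bool:
    """Treat the address as listed if the query name resolves to an A record."""
    try:
        resolve(dnsbl_query_name(ip))
        return True
    except socket.gaierror:  # no such name (or lookup failure): treat as not listed
        return False


def smtp_reply(ip: str, resolve=socket.gethostbyname) -> str:
    """Reject during the SMTP session, with a pointer to more information."""
    if is_listed(ip, resolve):
        return (f"550 5.7.1 Mail from {ip} refused, "
                f"see https://postmaster.example.org/blocked?ip={ip}")
    return "250 OK"
```

Because the refusal happens while the sending server is still connected, the notice goes to whoever is actually on the other end of the connection, not to a possibly forged return address.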

Again, I don't consider it acceptable to block good mail because someone ELSE using the same server or gateway is misbehaving.

With decent DNSBLs, that's sufficient to initiate resolution of the problem. It sure is a lot easier than tweaking Bayes or SpamAssassin (we specifically reject whitelisting
email addresses because of the forgery problem)

I consider a blanket "whitelist" as nowhere near adequate, because too many addresses are opened up. That's why whitelisting must be done on a far more finegrained basis.

Also, there's a remark somewhere on the CBL web site that impressed me with its simplicity - something to the effect that "we fully expect the vast majority of infected/blocked users _never_ notice that they're listed, because they're using their provider's smart-hosts as they should". If an ISP wants to proactively scan for infected IPs to see
about getting them fixed, they can do that too.

You presume that spammers can't send out their spam using the legitimate user's "certifications" and "smart hosts" (and their "micropayment account" if pay-per-email were implemented). I don't believe that's an accurate assessment. (Even if they didn't today, they can change their strategy overnight to do so.)

[comment #4]

I'm skipping this one because it takes too long to comment on ;-) other
than:

 Executive summary...

- blocking email, because it meets some technical criteria, is easier
   on the technical side, but introduces legal problems

It may, perhaps depending on exactly what the technical criteria _is_ and the rationale for blocking it, but taking risks when the business
case/result justifies it, is what people do.

If you NEED to do that, sure. But I consider it a poor bet if there are better approaches which don't leave you exposed like that.

You can forestall a lot, for example, by simply saying "it's our policy to reject email from IPs that appear to be dynamic". That says nothing about the spammyness of an individual sender, and if mistaken in
"appearance" simply needs to be fixed.  Or not.

Obviously you can set your policy however you like, but when it gets too cavalier, expect angry customers to start bailing.

Again, there is NO good reason to do things that way. (IMHO at least).

- blocking email, because the customer said so, may be harder
   technically, but avoids legal problems

The ISP could equally establish an infrastructure where the customer
explicitly delegates filtering decisions to the ISP.

Sure, but don't be surprised when customers decide they don't like the ISP's decisions, and it wouldn't take much for an unhappy sender to team up with an unhappy recipient customer and sue the ISP.

And the protections in law for ISPs to be held harmless for mistakes in good-faith filtering go a long way to shoot down any attempt even where the customers haven't formally delegated. It ain't easy for a sender to
prove bad-faith.  An ISP hasn't been sued in ages.

I wouldn't want to be the first one to break that record.

[snip]

....knowing that all the classical ruses to avoid spam classification (text as image, embedded links, attachments, scripting, disguised HTML links, etc etc) are a priori denied them.... certainly
takes a major bite out of spammers.

It would if you could. How can you tell that an image is text?

It doesn't matter. You simply block (by default) ALL images (whether embedded or attached) coming from unfamiliar/first time senders. If they want to send images, they first negotiate that permission with the intended recipient. (And that permission, of course, can be revoked by the recipient if the privilege is abused).
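
A minimal sketch of that default, using Python's standard `email` package. The names `KNOWN_SENDERS`, `risky_features` and `accept` are hypothetical, and keying on the From: address alone is a simplification, since that header can be forged.

```python
from email import message_from_string
from email.utils import parseaddr

KNOWN_SENDERS = {"aunt.matilda@example.org"}  # hypothetical per-recipient list


def risky_features(raw_message: str) -> set:
    """Return which 'advanced' content types a message carries."""
    msg = message_from_string(raw_message)
    found = set()
    for part in msg.walk():  # visits the message itself and every MIME sub-part
        if part.get_content_type() == "text/html":
            found.add("html")
        elif part.get_content_maintype() == "image":
            found.add("image")
        elif part.get_filename():  # any other part that carries a file name
            found.add("attachment")
    return found


def accept(raw_message: str) -> bool:
    """Default policy: unfamiliar senders may send plain text only."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    return sender in KNOWN_SENDERS or not risky_features(raw_message)
```

Note that the filter never needs to decide whether an image "is text". It only needs to see that an image is present and that the sender has not been approved.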

Blocking spam with embedded links or attachments would probably put us
out of business.  More likely get me fired.

It will certainly put MANY spammers out of business, and put virtually ALL virus/worm authors out of business... at least as far as E-mail delivery goes.

I don't have a problem with a recipient ALLOWING a particular sender to send them that kind of stuff... as long as an introductory E-mail (in plain text) negotiates that permission with the recipient FIRST.

And, a given recipient might configure his setup so that he could look at E-mails that would have been blocked, to decide to allow them in the future... or he can just take a hard line and say "only if approved first!" But in any case they could be marked as suspect/blockable to warn the recipient, before they open them.

There is NEVER the NEED to send HTML/attachments/scripting to someone in an INTRODUCTORY E-mail. Not until you've established that they are willing and able to receive that kind of content from you.

And only allowing executable attachments, HTML, and "big" messages from known/trusted senders basically eliminates E-mail as a vector for
virus/worm propagation,

"known/trusted" senders by what measure? Explicit listing of them?

Sure. Most users will not EVER need to approve ANYBODY to send them executable attachments. ("listing" can be achieved by clicking an "allow this sender to send me content like this in the future" button.)

Well, I can tell you about lots of viral attachments from "known senders".

Absolutely, but MOST users don't need or want to accept EXECUTABLE attachments from ANYBODY (including "known" senders). That's part of why it is a FINE-GRAINED whitelist and not a crude yes-or-no whitelist (which I agree completely is NOT adequate). I might accept JPG attachments from Aunt Gertrude, but it's a safe bet that any EXECUTABLE attachment in an E-mail (claiming to be...) coming from her... well, I wouldn't touch THAT with a ten-foot pole.
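
The Aunt Gertrude case can be sketched in a few lines. The `PERMISSIONS` table, the address and the function name are hypothetical, and matching on the file extension is a simplification of whatever a real implementation would inspect.

```python
import os

# Hypothetical per-recipient table: sender address -> attachment extensions allowed.
PERMISSIONS = {
    "aunt.gertrude@example.org": {".jpg", ".jpeg"},
}


def attachment_allowed(sender: str, filename: str) -> bool:
    """Allow an attachment only if this sender was granted this file type."""
    ext = os.path.splitext(filename.lower())[1]  # only the LAST extension counts
    allowed = PERMISSIONS.get(sender.lower(), set())  # unknown sender: nothing allowed
    return ext in allowed


ok = attachment_allowed("aunt.gertrude@example.org", "poodle.JPG")
worm = attachment_allowed("aunt.gertrude@example.org", "poodle.jpg.exe")
stranger = attachment_allowed("someone@example.net", "poodle.jpg")
```

Here `ok` is True while `worm` and `stranger` are False: being a "known sender" grants nothing beyond the specific types that were approved, which is the difference from a yes-or-no whitelist.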

I really don't understand why that's such a difficult concept for people to grasp. That (VERY!) simple rule can basically eliminate viruses and worms (again, at least by E-mail) virtually overnight, if adopted net-wide.

Given that, it seems to me to be professionally IRRESPONSIBLE to NOT implement it. That is the biggest, fastest thing that the IETF could do IMHO to cripple the recruitment of spambot zombie armies.

[comment #5]

That's certainly true, and one advantage of fine-grained recipient blocking is that it doesn't require any great worldwide consensus, nor
any re-engineering of Internet infrastructure.

Nor do DNSBLs ;-)

Again, I consider (IP-based, at least) blacklists to be unacceptably crude. You're locking the door after the horse has left the barn. A whitelist is better because a first-time spammer is still left facing a locked door.

What WOULD be helpful, though, would be a recognition by the IETF that:

a) such fine-grained per-sender by-recipient blocking (and hopefully augmented by subsequent content scanning) is an effective and desirable
approach to the problem, and

As I've been saying, that has yet to be established.

I believe in any case that it is (by FAR) the most promising approach. NOT by itself, but in conjunction with content analysis and possibly other techniques. But it is the primary key to make the other techniques more successful.

b) in the general case, blocking of all non-whitelisted E-mails containing HTML, scripting (probably covered under HTML... is it possible to put in scripting without HTML?), or attachments is a "best practice". (It is probably a good idea to suggest including a maximum message size, too, as a way of preventing "denial of service" attacks by sending big E-mails to someone which would be expected to fill their E-mail inbox to overflowing, blocking subsequent legitimate E-mails).

Obviously, you've not had to deal with the legitimate mail traffic of a large corporation. I mention those measures as comic relief at meetings, because it always produces hysterical laughter. It'd shut us
down.

Again, I don't consider anybody as having a legitimate need to send HTML or attachments in first-contact E-mails. If they want to send that stuff, then they should negotiate that permission with the intended recipients first. And they should be prepared for some recipients to say "No, thanks, I'd rather receive plain text."

If it's a question of offending those who "laugh hysterically" or shutting down spam, I'll opt for shutting down spam.

And I believe that others would agree.

So if you think that such a rule would shut your company down (given that a user COULD grant you the permission to send them such mails, after a preliminary meeting and negotiating that) then I propose that your company's long-term future is not bright...! I suggest you get them thinking about how your company could adapt to survive in such a more-responsible world.

[snip]

Again, the problem is the degree of collateral damage that IP-based
blocking produces.

You haven't demonstrated what that degree _is_. By long exposure, I can assure you that the degree is surprisingly low. If you do your homework
as you're supposed to.

If you block the mail from a big company's server(s) because one user within that company has an infected computer that's been recruited as a spambot, I don't think it's a great consolation that "gee, we hadn't seen that happen before." Especially if the block thus established causes serious financial harm to said company.

We're receiving 1-2 million emails per day. 80-90% of that is spam. We have less than 10 FPs per month against Spamhaus' sbl-xbl (which is doing about 85% of our filtering). We've arranged things so that the sender finds out if they're blocked, and there's a well established procedure by which they can notify us and we can override listings.
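
For scale, the figures above can be turned into a rough false-positive rate. This is only back-of-the-envelope arithmetic: a 30-day month is assumed, "less than 10" is taken as an upper bound of 10, and the FPs are divided by ALL legitimate mail, not just the mail checked against one list.

```python
DAYS_PER_MONTH = 30  # assumption
FPS_PER_MONTH = 10   # "less than 10", so this is an upper bound


def legit_per_month(daily_volume: int, spam_percent: int) -> int:
    """Legitimate messages per month, using integer arithmetic."""
    return daily_volume * (100 - spam_percent) // 100 * DAYS_PER_MONTH


low = legit_per_month(1_000_000, 90)   # fewest legitimate messages
high = legit_per_month(2_000_000, 80)  # most legitimate messages

worst_case_rate = FPS_PER_MONTH / low   # FPs per legitimate message
best_case_rate = FPS_PER_MONTH / high

print(f"{low:,} to {high:,} legitimate messages per month")
print(f"FP rate below {worst_case_rate:.1e} to {best_case_rate:.1e}")
```

That works out to between 3 and 12 million legitimate messages a month, so even in the least favourable case the stated FP count is a few per million legitimate messages.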

If an email is blocked, they contact us, and we forward their email and
fix the listing, is it really a FP?  No.

If every sender correctly interpreted the error message they got and
followed through, then there'd be zero FPs.

Some VERY large ISPs (such as Yahoo) interpret a "hard bounce" message due to an infected E-mail as "this E-mail address is bouncing" and they disable sending all mail to the user involved. Stupid? Yes, of course. But YOU try to convince them to change that policy. They are totally non-responsive.

Worse, they only provide the now-disabled user with the SINGLE LINE saying the 5xx bounce reason message... saying what virus was found... thus NO clue as to WHO bounced the mail back, or where else it might have come from. So, of course, this is totally useless.

I wish our content filters even remotely approached being _that_ good.

If you didn't have to contend with HTML (and the various ruses that allows to obscure content), and attachments, they would be FAR more effective. Not allowing HTML from unapproved senders would also negate nearly all phishing exploits, since you couldn't have misrepresented links to click on, or buttons that take you to (hidden) disreputable Web sites using (invisibly-)obscured URLs.

[comment #6]

[how users configure their whitelist rules]

The problem being that out of the 60,000 seats here, perhaps less than 10 of them are able to competently configure a set of rules like what you have.

That's a software implementation issue, not an inherent problem in the approach. I envision a button to click on that simply says "allow E-mails like this from the same sender in the future" and where the software will open the keyway JUST enough to allow that type of message if seen again from that sender. How that recognition is accomplished, whether by something crude like simple GREP-type scanning, or something brain-damaged like RegEx pattern matching, or something still more sophisticated like the pattern matching SNOBOL/SPITBOL offers, or even a different sort of statistical ranking/rating approach like content scanners use... will vary from one implementation to another. The final
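
One way the "allow E-mails like this" button could work is sketched below. It assumes the message has already been reduced to a set of feature labels (for example "html" or "image/jpeg") by some other step; the function names and the label vocabulary are hypothetical.

```python
def grant_like_this(permissions: dict, sender: str, message_features: set) -> set:
    """Grant `sender` only the features this message needs; return what was added."""
    current = permissions.setdefault(sender, set())
    newly_granted = message_features - current
    current |= newly_granted
    return newly_granted


def is_permitted(permissions: dict, sender: str, message_features: set) -> bool:
    """A message passes if every feature it uses was already granted to this sender."""
    return message_features <= permissions.get(sender, set())


permissions = {}
# One click on a newsletter that uses HTML and JPEG images:
grant_like_this(permissions, "list@example.org", {"html", "image/jpeg"})

later_ok = is_permitted(permissions, "list@example.org", {"html"})
later_exe = is_permitted(permissions, "list@example.org",
                         {"html", "application/x-msdownload"})
```

The click opens the keyway only as far as that one message required, so `later_ok` is True and `later_exe` is False. A plain-text message has an empty feature set and therefore passes for any sender.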
products will probably use a combination of techniques.

I'm sorry, this simply isn't a human interface issue. No amount or technique of per-sender whitelisting comes remotely close to the accuracy of our production filters, entirely aside from the new
correspondent issue.

It's not a simple "per sender" whitelist. It is a fine-grained per-sender PERMISSIONS list, which enables some types of HTML tags and SOME types of attachments and SOME other characteristics, on a sender-by-sender basis.

I don't believe it is possible to come up with as high an accuracy on ANY "one-size-fits-all" rule, "production" or otherwise. There are some mails which I would accept from familiar senders, where the SAME mail (same content, same subject line) would be spam if it came from anybody else. You simply cannot possibly know who I would accept that mail from, and who I wouldn't, unless I tell you.

You could give our users common filtering software (the reader is already pretty much standardized) with every filtering knob known to man, and perhaps three of our users could approach the effectiveness of the production systems.

I'm not suggesting that users have to "roll their own" from ground zero. Agreed that 90% of users will have 90% of their senders operating on the defaults. And your "production systems" can of course be tweaked and upgraded as spammers adopt new techniques.

That doesn't in any way demonstrate the lack of value in what I am proposing.

I'm only including _me_ in that list because it's me who built the production systems... Over a million decision
items are being changed every day in our filters.

Our users simply don't know effective spam filtering techniques. They push the wrong button, twist the wrong knob, and they're blocking something business critical. Or trusting malicious content with forged
credentials.  Or simply trusting...

Agreed that it ought to default to "simple". But the defaults can still be HIGHLY effective.

There is VERY little excuse for ANYBODY needing traditional antivirus scanners for incoming E-mail. That kind of garbage ought to be, and can easily enough be, squashed outright by responsible ISPs.

My favourite incident was the user who repeatedly insisted that he needed to receive the "important information from the FBI" that was sitting in his quarantine, and the quarantine forwarder refused to
forward to his mailbox.

Sorry, I said, but as much as you may want to see it, forwarding the
virus is a really bad idea ;-)

Obviously most phishing exploits work by "social engineering" to convince a user to let them in, or to provide them information they shouldn't provide. And some users will always fall for the "Aggie Virus" approaches (like users who forward hoaxes).

I think we can agree, however, that by default delivering E-mails that contain executable attachments to clueless recipients is a Really Bad Idea too. The trick is to decide how you can implement that restriction, while still allowing them to be sent in the 0.01% of the cases where the recipient expects and trusts and WANTS them from a given sender. I believe that my fine-grained whitelist answers that need, and does it (VERY) well.

Again, I consider IP-based blocking to be inherently flawed, to the
point where I consider it a dead-end.

It's a remarkably vigorous dead end ;-)

If the only tool you have is a hammer, you'll try to make every problem look like a nail.

Popular does NOT mean that it's good.

[snip]

Yeah, I admit that I usually at least cast a cursory eyeballing of the Yahoo mail "spam" folder too, rather than just emptying it. Occasionally I -do- find a non-spam message there. (Although that happens seldom, as I almost never give that E-mail address to anybody... It's almost useful as a "personal honeypot" to see what's being spammed out, before going to my more usual E-mail accounts and possibly wondering if that curious E-mail just MIGHT be legitimate).

"Almost useful" is the key. When you have users whose spam load ranges from one or 2 per month, to 4000+ per day, you can see that junk folders have only limited usefulness, and not to everyone.

Agreed. That's why users need to be able to choose what level of acceptance of the automated filters is most acceptable to them, and what level of time they are willing to devote to "questionable" mail.

Nobody can find
legitimate email in a 4000 spams/day feed, no matter how the filters are implemented.

True enough, but if a person is willing to simply agree that they don't want to see HTML or attachments from unfamiliar senders, that instantly blocks a LOT of the garbage. Simple, and HIGHLY effective.

AND it makes the remaining content-based filtering FAR more effective, by blocking stuff like 'text as image' (whether attached or embedded as a link... or even generated by a decrypting script).

And again, they can relieve that restriction on a sender-by-sender basis, if they need and want to, based on previous negotiation with such senders.

Perhaps part of your problem is that you're not seeing the big picture of how you can _use_ DNSBLs or any other filtering technique.

I've been involved with BBSes and dealing with spam since the pre-Internet days, but please feel free to educate me if you think there's something worthwhile you can teach me. :-) I'm not too old a dog to learn new tricks.

Your remarks seem to imply that DNSBLs necessitate no notification anywhere, the email just disappears. Or that filtering in general is
that way.

There are clearly all kinds of ways to deal with "questionable" mail. That's something that is going to depend on how the software is implemented, and what options the recipient selects. There is a lot of room in this area for programmer creativity.

On the contrary, ours have never done that. Indeed, without anybody looking at quarantines, without anybody personally twiddling filters, that 4000 spams/day user _does_ get the email that was accidentally blocked. Simply because we do inline rejects with instructions on what
to do, and problems get fixed fast and without harm.

Some antispam filtering obviously works better than others.

IMHO, if you don't have a finegrained permissions list on a per-sender basis, yours COULD work BETTER than it does today.

[snip]

The important thing is that the RECIPIENT
controls that, so they can decide the rule that determines what gets blocked and what gets through. That way they don't have to wonder what
SHOULD have been delivered to them and wasn't.

But what if you arrange things so that the recipient doesn't have to
control the rules, and still doesn't have to wonder?

I think that a per-sender rule can ALWAYS make things better and more accurate. That doesn't negate the value of good pre-sets and defaults.

But again, if by default you let executable attachments from unfamiliar senders get delivered (or if you block them for ALL senders) then your software COULD do better.

Our false positives are handled usually without the recipient even
knowing that something got blocked.

OK, so how do you handle them, specifically?

Spammers have gotten good at throwing enough random junk into E-mails to
confuse Bayesian filters.

And sender whitelisting and...

Again, don't confuse a simple sender-whitelist with a fine-grained per-sender PERMISSIONS list. The difference is absolutely night and day.

Even if I allow Aunt Matilda to send me JPGS of her poodle, I don't care how many worms her infected computer sends to me with HER from: address and via HER mail server. My approach will let me see ALL of her real mail, and rapidly and accurately discard all of the worms with executable attachments that her system sends me.

It is EASY to do discrimination like that. And if I don't have ANYBODY set to allow them to send me executable attachments, or HTML to use links or scripting for that purpose, then it's hard to see how such a rule can be "confused" or bypassed by a spammer. It's like trying to argue with a vending machine... you don't know until you've put your money in that it's not going to deliver your purchase, and it's absolutely pointless to try to argue with it after the fact. :-)

[snip]

....giving the recipient the ability to at least not see the SAME kind of stuff over and over again, if they choose to use those features, demonstrates the ISP's trying to give the user the tools to
reduce the frustration.

If you can _detect_ the "SAME kind of stuff" over and over again.

Sometimes that's very simple. For example, even the very limited filtering the Webmail I'm presently using offers allows me the ability to tell it that I don't want to see anything more from "Fifth Third Bank" (in either the From or Subject fields), where I am not a customer. Anything from them is clearly spam.

It is frustrating that I can't use a SNOBOL/SPITBOL-like pattern to describe spam rules, but I readily admit that most users would rather not do anything that advanced. (That would CLEARLY however be my 'programming language of choice' to write such a filtering tool, though). Even something relatively brain-damaged like RegEx pattern matching would be an improvement, although still frustrating.

Even the best content techniques aren't very good at that anymore.

Not automatically, no. But if there is something the user considers adequate evidence of spam (say, a subject which contains both the words Penis and Enlarge) then it's satisfying to be able to at least block THAT case once and for all. Lots of cases? Sure, but every new rule at least saves you from seeing THAT one again.

Longer-term, more sophisticated rules would work better. But that's why one needs per-sender rules which include not just From and Subject but also content within the E-mail message body too.

Or for another example, somehow an E-mail address of "errorstogep(_dot_)(_dot_)(_dot_)(_at_)domain" got on a spammer E-mail targets list. Any E-mail which contains that anywhere in the header is spam, even if it also contains my "good" address elsewhere.
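
Both kinds of user-defined rule are simple to express. In this sketch the word pair and the address `trap@example.org` are stand-ins chosen for illustration, and the function names are hypothetical.

```python
import re

BLOCKED_WORD_SETS = [("cheap", "watches")]  # stand-in for a user-chosen word pair
TRAP_ADDRESSES = ["trap@example.org"]       # stand-in for a leaked, never-used address


def subject_has_all_words(subject: str, words) -> bool:
    """True if every word appears in the subject as a whole word, ignoring case."""
    found = set(re.findall(r"[a-z0-9]+", subject.lower()))
    return all(word.lower() in found for word in words)


def headers_mention(raw_headers: str, address: str) -> bool:
    """True if the address appears anywhere in the raw header block."""
    return address.lower() in raw_headers.lower()


def is_spam(subject: str, raw_headers: str) -> bool:
    """Apply the user's own rules; any single match is enough."""
    return (any(subject_has_all_words(subject, ws) for ws in BLOCKED_WORD_SETS)
            or any(headers_mention(raw_headers, a) for a in TRAP_ADDRESSES))
```

Whole-word matching like this misses deliberate misspellings and inflected forms, which is where a richer pattern language would earn its keep.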

Correctly chosen and utilized DNSBLs do a vastly better job.

They're certainly better than nothing. But again, I don't believe that it's possible to do GOOD blocking, and minimize collateral damage, without per-sender fine-grained whitelisting which also takes into account the body content of the individual E-mail.

And as a recipient, it's very frustrating to see dozens or hundreds of similar E-mails continue to slip through an ISP's filters, especially when I can readily define a rule which IMHO adequately identifies them. As a user, I want that ability, and to not continue to subsequently see those E-mails.

Gordon Peterson
http://personal.terabites.com
1977-2007 Thirty year anniversary of local area networking

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg