On Jan 25, 2011, at 5:06 PM, Dotzero wrote:
On Tue, Jan 25, 2011 at 4:16 PM, Paul Ferguson
On Tue, Jan 25, 2011 at 1:14 PM, John Leslie <john(_at_)jlc(_dot_)net> wrote:
Reputation (as the name implies) is a prediction of the likelihood of
...based on previously observed behavior.
So, what exactly does this mean when behavior suddenly changes? If a
domain or IP address was well behaved yesterday but begins spewing
badness today, what will your company do as an arbiter of whether mail
is accepted by your customers? Will you allow that domain or IP
address to spew badness?
"badness" is hard to measure at scale once you've removed the
obvious botnet spew from your mailstream.
I highly doubt it.
Depends. If it's sending a lot of wanted email, a spike of obviously
unwanted email, and a middle ground of mail which you can't
decide about, you're always going to want to deliver the wanted
email, and you're always going to want to block the obviously
unwanted email.
But the mail in the middle is harder to decide what to do about,
and that's where sender reputation helps.
At some point, once the
spewing has subsided you may (automatically or manually) again start
allowing traffic through from that domain or IP address. But that
isn't really reputation in the traditional sense of the word.
But that brings me back to my original question. If reputation doesn't
prevent a site from getting throttled or blocked when it goes bad,
what does reputation mean?
Reputation is, loosely, the past history of the sender from a decade
ago through to a second ago. It's not a simple integer (0=bad, 100=good)
however much people want to map it onto that. It includes, at least,
traffic volumes over time and fraction of email that was wanted vs
not wanted over time.
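
To make the "not a simple integer" point concrete, here is a minimal sketch of reputation kept as a per-sender history of volume and wanted/unwanted counts. The class names, the per-day granularity and the record layout are all assumptions for illustration, not anything a particular filter actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class DailyCount:
    """Mail seen from one sender on one day (hypothetical record layout)."""
    day: int            # day index, e.g. days since some fixed start date
    wanted: int = 0     # messages recipients wanted
    unwanted: int = 0   # messages recipients did not want

    @property
    def volume(self) -> int:
        return self.wanted + self.unwanted


@dataclass
class SenderHistory:
    """Reputation kept as a history rather than collapsed to one score."""
    sender: str
    days: list = field(default_factory=list)

    def record(self, day, wanted, unwanted):
        self.days.append(DailyCount(day, wanted, unwanted))

    def window(self, first_day, last_day):
        # All records whose day falls in the inclusive range.
        return [d for d in self.days if first_day <= d.day <= last_day]

    def volume(self, first_day, last_day):
        return sum(d.volume for d in self.window(first_day, last_day))

    def wanted_fraction(self, first_day, last_day):
        rows = self.window(first_day, last_day)
        total = sum(d.volume for d in rows)
        if total == 0:
            return None  # no mail seen, so there is nothing to base a fraction on
        return sum(d.wanted for d in rows) / total
```

The same history can then be queried over months or over the last day, which is what lets the long-term and short-term views be compared.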
Comparing the past history of the sender (over a period of months)
to the recent behaviour of the sender (minutes to hours) can
help you guess what the sender is up to, categorize them into one
of several "bins" automatically and treat mail from them appropriately.
If a sender has a history of not sending any email at all and you suddenly
see a lot of email from it then you can categorize it as "probably a
compromised end-user machine".
If it has a history of sending 1000 emails a day and it suddenly starts
spewing 100k then the change in behavior lets you categorize it
as "tiny smarthost with compromised system" and maybe block it outright.
But if a site has a history of sending you large volumes of wanted
email over a long period, and you suddenly see a spike of unwanted
email then you're likely to assume that it's a transient problem (bad
customer, perhaps) and that once you've notified them of the problem,
they'll fix it. Meanwhile you'll keep delivering most of the email from
them, on the assumption that it's mostly wanted.
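
The three cases above could be sketched as a small rule function. Every threshold here (BURST, SPIKE_FACTOR, LARGE_SENDER, GOOD_HISTORY, BAD_RECENT) is a made-up placeholder, and the function name and inputs are assumptions; a real deployment would tune them against its own statistics.

```python
# Hypothetical thresholds, chosen only to make the example concrete.
BURST = 500            # messages/day that counts as "a lot" from a silent sender
SPIKE_FACTOR = 50      # recent rate this many times the usual rate is a spike
LARGE_SENDER = 10_000  # messages/day that counts as a large-volume sender
GOOD_HISTORY = 0.9     # long-term wanted fraction considered good
BAD_RECENT = 0.5       # recent wanted fraction considered a problem


def categorize(avg_daily_volume, past_wanted_fraction,
               recent_daily_rate, recent_wanted_fraction):
    """Pick a bin by comparing long-term history with recent behaviour.

    Fractions may be None when no mail was seen in that window.
    """
    # No history at all, then a sudden burst of mail.
    if avg_daily_volume == 0 and recent_daily_rate >= BURST:
        return "probably a compromised end-user machine"

    # Small sender whose volume jumps far beyond its usual level.
    if (0 < avg_daily_volume < LARGE_SENDER
            and recent_daily_rate >= SPIKE_FACTOR * avg_daily_volume):
        return "tiny smarthost with compromised system"

    # Large sender with a good long-term record and a recent bad patch.
    if (avg_daily_volume >= LARGE_SENDER
            and past_wanted_fraction is not None
            and past_wanted_fraction >= GOOD_HISTORY
            and recent_wanted_fraction is not None
            and recent_wanted_fraction < BAD_RECENT):
        return "transient problem: notify and keep delivering"

    return "no notable change"
```

For example, categorize(1000, 0.8, 100_000, 0.2) falls into the smarthost bin, matching the 1000-a-day sender that suddenly sends 100k.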
They're all reasonable decisions. And they're things that can be
implemented as a set of business rules driven by, amongst other
things, the short term and long term history of the sender ("reputation")
in an automated way that doesn't require much per-sender configuration.
Plug in your policy, let it run, watch your statistics and tweak.
It doesn't particularly protect the site
from the immediate consequences of going bad. It appears that the
responses are authoritative (this domain or IP is currently emitting
badness) rather than reputational (this site has a good reputation so
I will accept badness from it on the presumption they are going to
address it). I will grant that there may be some small slack cut based
on reputation but does it really extend that far?
Yes. Filtering out "obvious" spam is easy. Recognizing "obvious"
1:1 email between regular correspondents isn't too hard.
Dealing with the big grey area in the middle is hard, and sender
reputation is about the only thing that gives you anything to base
delivery decisions on there, so for the big middle ground of email,
a good reputation will take you a long way.
Asrg mailing list