
RE: [Asrg] Fwd: Major E-mail Delivery for FTC DNCR Launch

2003-06-27 15:34:09
From: Yakov Shafranovich [mailto:research(_at_)solidmatrix(_dot_)com] 

<snip>

If the infrastructure of the Internet disconnects abusers, and the
definition of abuse is controlled by the recipients, then the economics
will be well defined for the spammer: If you abuse this network you
will be removed from it and _no amount_ of money will get your messages
to unwilling recipients. It will also be well defined for the provider:
Implement these protocols or lose customers to those who do.
[..]

As I mentioned in a different thread, there seem to be two types of
solutions to the spam problem: one focusing on the senders and the
network (network abuse model), and the second focusing on the
receiver's end only (consent model). I think that both are equally
valid and we should be looking at both rather than restricting
ourselves to one.

Actually, what I am suggesting is that there should be a unified model
where the consent model can (essentially) drive the network abuse model
via a collection of open protocols. I will attempt to summarize:

Each terminal user would feed policy decisions (or not, if they are
lazy) to their local MTA. The local MTA aggregates those policy
decisions and responds accordingly, while allowing for exceptions for
specific end users. (Lazy users would accept by default the aggregate
policy of the system, or some other standard defined by their
relationship with their provider.)
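
To make that concrete, here is a rough Python sketch of how a local MTA
might fold per-user consent decisions into a site-wide default while
keeping individual exceptions. The function name, the quorum value, and
the 'accept'/'reject' labels are illustrative assumptions only, not
part of any defined protocol:

    # Illustrative sketch: aggregate per-user accept/reject decisions
    # about a sender into a site-wide default, keeping exceptions intact.
    from collections import Counter

    def aggregate_policy(user_decisions, quorum=0.5):
        """user_decisions maps user -> 'accept', 'reject', or None (lazy)."""
        votes = Counter(d for d in user_decisions.values() if d is not None)
        total = sum(votes.values())
        # With no explicit votes, the provider's standing default applies.
        default = 'accept'
        if total and votes['reject'] / total >= quorum:
            default = 'reject'
        # Per-user exceptions override the aggregate for that mailbox only.
        exceptions = {u: d for u, d in user_decisions.items()
                      if d is not None and d != default}
        return default, exceptions

    print(aggregate_policy({'alice': 'reject', 'bob': 'reject', 'carol': None}))
    # -> ('reject', {})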

Failure to respect negative responses, or failure to properly execute
other protocols, can be monitored by the MTA and would be treated as
abuse. Other tests could also be added as abuse detection mechanisms
(port scan detection, dictionary attacks, delivery or attempted
delivery of malware, etc.). MTAs with compatible policies would
collaborate to share common elements; they would collect sensory data
on abuse and disseminate that data as needed with peers. Peers would be
selected based on COT (Circle Of Trust) protocols whereby each
participating node continuously evaluates the performance, consistency,
and policy proximity of its peer nodes and migrates to new peers as
needed. This mechanism has no single point of failure because there is
no centralized authority. It also automatically forms an optimized,
distributed processing facility for threat detection and collective
response.
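
A very rough sketch of that continuous peer evaluation might look like
the following; the three quality inputs and their weights are invented
here for illustration and would in practice be a matter of local policy:

    # Illustrative sketch: re-score COT peers on performance, consistency,
    # and policy proximity, and migrate away from the weakest.

    def score_peer(peer):
        # Weighted blend of the three observed qualities; the weights are
        # arbitrary stand-ins for local policy.
        return (0.4 * peer['performance']        # e.g. latency, availability
              + 0.3 * peer['consistency']        # agreement with our own observations
              + 0.3 * peer['policy_proximity'])  # how closely its policy matches ours

    def migrate_cot(current_peers, candidate_peers, keep=4):
        ranked = sorted(current_peers + candidate_peers,
                        key=score_peer, reverse=True)
        return ranked[:keep]                     # the node's new circle of trust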

(NOTE: A node need not be a single MTA, MUA, or other device. In fact,
an MTA in a large-scale system might support many "nodes" so that it
can aggregate the policies of its end users based on similarities
between groups to simplify its local delivery rules. Other devices may
also play a part depending upon local policies.)

Aggressive abuse would trigger aggressive measures such as blocking
inbound access from the offending IPs and/or networks (DSQP = Dynamic
Squelch Propagation). At first, a local MTA would refuse connections at
the envelope level if a "more terminalward" MTA (or MUA or MUA agent)
refused the content. If the refusal was not heeded, or if other abuse
was present sufficient to meet the local policy thresholds, then the
MTA would initiate a DSQP request to its border routers to refuse
access from the offending network. Collaborating nodes within a high
trust level of the COT would be allowed to aggregate individual IPs
into network blocks appropriate to their collective requirements and
optimize network-level controls at the border routers or other similar
structures.
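
The aggregation step is straightforward; for instance, in Python the
standard ipaddress module can collapse individually squelched addresses
into covering blocks before a DSQP-style filter is pushed toward the
border (the request format itself is not defined here, so this only
shows the address arithmetic):

    # Illustrative sketch: collapse individually squelched IPs into the
    # smallest set of CIDR blocks before filtering at the border.
    import ipaddress

    def aggregate_offenders(ips):
        nets = [ipaddress.ip_network(ip) for ip in ips]   # one /32 per address
        return list(ipaddress.collapse_addresses(nets))

    print(aggregate_offenders(['192.0.2.4', '192.0.2.5',
                               '192.0.2.6', '192.0.2.7']))
    # -> [IPv4Network('192.0.2.4/30')]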

DSQP news messages might be broadcast to peers depending on local and
peer policy. Peers might then aggregate, or simply act on, similar news
messages by issuing their own local DSQP. In this way, for example,
worms could be detected and blocked before they could spread.
Similarly, unwanted spam broadcasts might also be detected and slowed
for analysis, or blocked entirely, based on the wide detection of
abuse.

For example, a network provider might have a COT/DSQP policy such that
some threshold of DSQP requests for a given network at the network's
edge results in a broad propagation of DSQP requests directly to the
sources of that traffic. If a large ISP were to detect DSQP-Malware
signals from 5% of its network clients, it might have a policy to block
the source network named in the DSQP at the border routers, thus
temporarily denying transit across its network. A similar action might
occur at a higher level for DSQP-Spam reports with a specific
characteristic (such as a new, unknown source reported by 10% of client
networks). Again, local policies prevail: the provider might know that
only 15% of its client networks have these protocols in place, so a 10%
showing would indicate a very high threat.
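
The threshold arithmetic in that example can be spelled out in a few
lines; the signal names and the specific percentages are taken from the
example above and are local-policy placeholders, nothing more:

    # Illustrative sketch of a local DSQP propagation decision: compare
    # the raw report rate against a per-signal threshold, and note how
    # partial deployment concentrates the signal.
    THRESHOLDS = {'DSQP-Malware': 0.05, 'DSQP-Spam': 0.10}

    def should_propagate(signal, reporting, total_clients, deployed_fraction):
        raw = reporting / total_clients        # e.g. 10 of 100 clients = 0.10
        effective = raw / deployed_fraction    # 0.10 / 0.15 ~= 0.67 of participants
        return raw >= THRESHOLDS[signal], effective

    print(should_propagate('DSQP-Spam', 10, 100, 0.15))
    # -> (True, 0.666...)  i.e. about two thirds of participating networks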

Collaborating intermediate providers might therefore block serious
network threats and DDoS attacks automatically based on detecting abuse
reports from client systems - and thus disconnect the source of the
abuse.

(NOTE: All of these protocols also relax restrictions automatically once
abuse subsides based on local policy decisions. In any case the
responses at each level would be metered by local policies in much the
same way BGP policies can be established to accept or deny routes from
particular sources under specific conditions... If someone makes a
mistake, they would be blocked, but once they correct that mistake they
are back in operation.)

The model also works well during rollout (before wide deployment)
provided the local policies of each peer are willing to accept
non-compliant (outdated) MTA traffic.

Consider an inbound message from a new sender (from the perspective of
a given MTA). The MTA would ask some number of peers in its COT to
"vouch" for the sender. If the other peers in its group show a good
rating for the sender, then the message would be processed with little
bias and a new monitoring record would be developed in the local node.

If the peers show a bad rating for the new sender, then the local MTA
might adopt that policy and reject the message. (For example, if the
same sender had recently attempted a dictionary attack on one of my
peers, then "I" would not want to accept any traffic from that sender.)

If the peers do not have any rating for the new sender, then the local
MTA would process the message according to its local policy for new
contacts. It would also begin forming its own rating for that sender
based on the sender's performance, adherence to policy and protocol
requests, etc.
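
Those three cases reduce to a simple lookup-and-decide step. The sketch
below models peers as plain dictionaries of sender -> rating; the query
protocol, the rating scale, and the decision labels are all assumptions
made for illustration:

    # Illustrative sketch: ask a few COT peers to vouch for a first-time
    # sender and fold their answers into an initial local disposition.

    def vouch(sender, peers, ask=3):
        ratings = [p.get(sender) for p in peers[:ask]]
        ratings = [r for r in ratings if r is not None]
        if not ratings:
            return 'local-default'     # nobody knows the sender: new-contact policy
        if sum(ratings) / len(ratings) < 0:
            return 'reject'            # e.g. a peer recently saw a dictionary attack
        return 'accept-and-monitor'    # little bias, but start a local record

    peer_a = {'mail.example.org': +2}
    peer_b = {'spamhost.example': -5}
    print(vouch('spamhost.example', [peer_a, peer_b]))   # -> 'reject'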

One way this new unknown message might be processed is with an IRRQ
(Intelligent Retry Request), where the initial message is rejected with
a unique response that the sender can use in their next retry after an
appropriate guard time.

(IRRQ modifies SMTP by indicating a temporary failure and providing a
one-time password for the sender to use in its next attempt. The sender
issues the password as its first recipient in its retries, after an
established guard time, in order to comply with the protocol. A
non-compliant sender would respond normally to the initial temporary
failure. Spamware would not remember the password and would probably do
a lot of other detectable things. An uncompromised MTA would at least
act in a predictable way (not retrying 30 seconds later, for example).
An MTA implementing the protocol would almost certainly not be spamware
and would be given a higher trust rating provided no other policies
were violated.)
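
To show what the receiving side of IRRQ might look like, here is a
rough sketch of the RCPT handling. The 450 response text, the token
format, the irrq- address convention, and the guard time are all
invented for illustration; IRRQ is not a standardized SMTP extension,
and a real design would need to pin these details down:

    # Illustrative IRRQ sketch: the first attempt gets a temporary failure
    # carrying a one-time token; a compliant sender waits out the guard
    # time and presents the token as its first RCPT on retry.
    import secrets, time

    GUARD = 300                  # seconds a sender must wait before retrying
    pending = {}                 # (client_ip, mail_from) -> (token, issued_at)

    def on_rcpt(client_ip, mail_from, rcpt_to):
        key = (client_ip, mail_from)
        if key not in pending:
            token = secrets.token_hex(8)
            pending[key] = (token, time.time())
            return ("450 4.7.0 IRRQ: retry after %ds with RCPT "
                    "irrq-%s@example.net" % (GUARD, token))
        token, issued = pending[key]
        if rcpt_to != "irrq-%s@example.net" % token:
            return "450 4.7.0 IRRQ: token missing or wrong"   # greylisting fallback
        if time.time() - issued < GUARD:
            return "450 4.7.0 IRRQ: guard time not yet elapsed"  # retried too fast
        del pending[key]
        return "250 OK"          # compliant sender: raise its local trust rating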

If the sending MTA has implemented IRRQ and responds appropriately,
then the local node might increase its "trust" rating. If the sending
MTA does not implement this protocol, then the local MTA would be
somewhat more biased against this incoming traffic, as determined by
its local policy. If, on the other hand, the new sending MTA began
pounding the local MTA with bad requests (or other detectable abuse),
then the local rating for this sender would immediately plummet and no
messages would be accepted (escalating as indicated above if
necessary). Since this local MTA now has a rating for this sender, if
the sender attempts to contact one of the peers in the COT, then those
peers would receive the negative rating from the local MTA and would
start their evaluation of the sender from that perspective.
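
The rating adjustments described above boil down to something like the
following; the event names and the step sizes are placeholders for
whatever the local policy chooses:

    # Illustrative sketch of per-sender rating updates.
    def adjust_rating(rating, event):
        if event == 'irrq-compliant':
            return rating + 1       # protocol-aware sender: raise trust
        if event == 'no-irrq':
            return rating - 0.5     # legacy sender: slight bias, not a rejection
        if event in ('dictionary-attack', 'malware', 'hammering'):
            return -100             # rating plummets; stop accepting and escalate
        return rating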

--- summarizing the summary ---

The important characteristics of this solution are:

- End users ultimately define the nature of abuse (spam or otherwise)
within their local context, which includes their immediate provider.

- Abuse can be detected more accurately and sensitively because multiple
systems across the Internet are involved in the detection process.

- Broad defensive actions against abuse are enabled.

- Individual processing costs are reduced significantly by leveraging
network effects. (For example, only one well-trusted system need detect
a virus for an entire group of systems to reject contact from that
source. The detection must only occur once, not once for each system.
This leverage also applies to content rejection of spam, whereby
systems with very similar policies might adopt a block after unwanted
material is detected on a well-trusted system, unless a local policy
forces an exception. This is true whether the spam detection is a
person rejecting the message, an automated system of heuristics, or
some other policy mechanism.)

- Any protective response against abuse is driven as close to the source
as possible thereby protecting the widest possible area of the network
and significantly reducing the possibility of damage or other costs.

- Administrators close to the source of an abuse are given strong
real-time data that can help them identify policy violations and take
appropriate actions (legal or otherwise) within their local framework.
This might be particularly important where legal actions are
considered. Note that system operators overseas who consistently allow
abuse without acting effectively will automatically be rated poorly by
systems implementing these protocols. That may well result in their
effective removal from the network as broad policies develop to reject
their networks, which puts a very specific spin on the economics of
network service operation, content publication, and so on.

- The system is extremely difficult to compromise or corrupt because
there is no single point of failure and every node is under constant
observation by other peers. If a node becomes compromised, then its
peers will reject it based on their local policies, and they can even
be made sensitive to this possibility by sharing their observations. A
compromised node would be seen as "losing trust" with all of the other
peers simultaneously, and so each peer could be programmed to be
ultra-sensitive to this condition. In any case, failure can only
degrade the system to the degree allowed by all local policies, so
there should be no possibility of a large-scale failure scenario.

- The system automatically adapts to policy changes. If a peer changes
its policy and that policy is no longer compatible with your local
policy, then you will reduce that peer's trust rating until you
eventually drop it from your COT. That does not mean that you won't
accept content from that system (provided that content doesn't violate
your policy), but it does mean that you will not accept that system's
policy recommendations nor seek its advice on new contacts.

... There is a lot to this; I hope that the above description provides
a good starting point...

_M

PS: I am working on papers to describe these concepts in more depth and
more clearly.


_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg