ietf-asrg

[asrg] 6. proposal of solution: Using Relay Honeypots to Reduce Spam

2003-04-15 05:01:16

ASRG and the end of spam

This started as a honeypot proposal and still ends with a honeypot proposal. Along the way it grew; the growth portion now comes first.

ASRG was created to deal with the spam problem. The proper way to deal with the spam problem is to end it, to end spam. ASRG has considerable influence that it can use to help bring about such an end. People expect it to make a proposal to end spam: if the proposal appears reasonable and workable, people and organizations that want spam to end will cooperate in the effort. This means that if ASRG provides a framework for action at several levels by several kinds of participant, the action is very probably going to happen. The ASRG proposal can anticipate cooperation from more than just those participating in ASRG and the organizations they represent. As an important corollary, if the spammers see that ASRG has formulated a plan and that internet forces are cooperating to bring that plan to fruition, the spammers should conclude that the ASRG will succeed and that they will ultimately be stopped from spamming by the ASRG plan. That's perhaps difficult to assert or believe in the present environment, but it should be clear that the described situation (spammers see no hope) should be the goal, if it can be attained. Part of the purpose of the ASRG is to determine a plan of action, part is to determine how successful that plan will be, and part is to maximize the results of the plan as finally formulated. What follows is presented as though it were that plan. What finally results may bear a great resemblance to this proposal or may completely supplant it. What matters is whether the final plan will be very likely to work, will attract support from those whose support may be necessary for success, and will convince the spammers that the days of spam are ending.

It is important to think of the scope of operations that ASRG can stimulate. The spam problem can be analyzed into its component parts, and ASRG can make proposals for each entity that might act against a particular component of spam. There need not be a single, monolithic approach to ending spam. If the proposed actions are reasonable (and part of the ASRG task is to search for the most powerful reasonable proposal) then the entities asked to take the actions very probably will take action. (Note that the same reasoning suggested for the spammers applies to the ISPs that are in any degree spam supporters: there is very great reason to not want to be the ISP that was last to end its spam-support activities. One ISP will be: the goal is to motivate such ISPs to change behavior well before the end. If the end is coming and the profit can be seen to be coming to an end anyway, the smarter path is to become vividly anti-spam.)

The ASRG solution can be comprehensive, can cover several stages of operation. ASRG can anticipate that some spammers, at least for a while, will resist the effort to end spam and will attempt to overcome the ASRG methods. While the ASRG proposal should be comprehensive enough to anticipate such moves the details of such moves cannot be anticipated. ASRG must, therefore, be prepared to modify or augment their program, as new evidence of spammer evasion becomes known. Again, the standard for action must be that it is action that will succeed: the new proposals of the ASRG should be convincing to those who will be asked to extend their efforts in order to thwart the modified spammer behavior. In short, the ASRG should continue watching the spam process and should be quick to analyze new spammer tricks and to devise ways to combat those tricks and incorporate whatever changes are needed into their plan.

"Entities" denotes classes of those who can take actions specific to their position to help end spam. ISPs that harbor spammers are an entity, ISPs in control of resources spammers abuse are an entity, ISPs that are at the target end of spam are an entity. End users whose systems are abused to send spam are an entity, anti-virus companies are an entity. Software companies and freeware providers may have a function in the final ASRG plan - they could make available any tools called for by the ASRG proposal. These entities obviously may overlap. The important point is that different actions can be taken at different points in the spam pathway and that the ASRG solution can, if necessary, call for action at many points, with the actions being those specific to the point. It is quite likely that part of the final ASRG comprehensive plan will be a call for the continuation and perhaps intensification of existing anti-spam techniques. The goal is not novelty but the eradication of spam. ASRG can provide a framework and mechanism to enhance the interaction of anti-spam forces and to intensify the targeting of specific aspects of the spam problem. While not an "entity" as defined above, the press and media may be important to the ASRG solution. It's not enough that principles and methods be agreed upon: action must be taken by enough of the people and organizations with the power to act that success results.

The first analysis of spam is easy: spam has two types, direct, and non-direct. Direct spam should be completely controllable by use of block lists, at least direct spam from spam-only sources. (Filters may also identify and stop direct spam - there seems little reason to consider abandonment of filters as anti-spam tools and the redundancy of the filters provides backup for the blocklists in the case that the spammer gets a new IP.)

The rest of spam, non-direct spam, has a dual nature: it is both spam and is also the product of abuse. Much of the abuse is possible because of the original model of the internet in which each user of the internet was assumed to be trustworthy. That trust is now often misplaced. The attempted solution to the loss of trust has mostly been to make individual systems no longer trusting (firewalls, secure MTAs, etc.) That has worked (for the most part) for individual systems but has not cured the overall problem of abuse. The common protections against abuse are passive and actionless. As such they create little incentive for the abandonment of abusive behavior. Perhaps ASRG can make a significant contribution by examining whether this single-system-centered passive approach is effective and appropriate. (Looking ahead) honeypots are also passive but differ from the traditional approach in that there are actions inherent in the operation of the honeypot and that further actions may be taken based on the results of the honeypot. (In the broadest view honeypots are action against a specific form of abuse and can expand beyond the single-system limit.)

A honeypot (particularly an open proxy honeypot) can be a man-in-the-middle defense. The spammer attempts to contact some resource through the open proxy honeypot. The open proxy honeypot could simulate that contact or it could allow it. In the SMTP dialog with the spammer's intended target the open proxy honeypot could function exactly as desired, with one exception. After the data has been transferred to the SMTP server contacted through the honeypot the honeypot could silently add an RSET to the transaction (other schemes to disrupt the communication are possible - the point is that it looks real and is real, with the exception that the result is nil. The open proxy honeypot could RSET after all recipients are specified and then create a single recipient of abuse@<the target>.) To the spammer it will appear as though everything functioned as normal - it did. The important exception is that the spam did not go through - it was obliterated. (For completeness the open proxy honeypot should fully log the transactions from the spammer-sender.) Whether the open proxy honeypot is being used to contact an open relay or is being used to send the spam directly to the victim doesn't matter: either way there is no spam delivered. The open proxy honeypot need only detect that port 25 is being contacted through it and modify the dialog such that no spam is transmitted.
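The recipient-replacing variant described in the parenthesis can be sketched as a pure function over the spammer's command lines. This is only an illustration under stated assumptions, not a working proxy: the name neutralize_smtp_commands is hypothetical, and the sketch places the RSET just before DATA, since an RSET sent after the server has already acknowledged the end of the message would not undo delivery.

```python
def neutralize_smtp_commands(commands, target_domain):
    """Rewrite a spammer's SMTP commands so only abuse@<target> is addressed.

    commands: command lines as sent by the spammer (no CRLF), ending with DATA.
    Returns (forwarded_commands, logged_recipients).
    Hypothetical helper for illustration; not taken from any existing honeypot.
    """
    forwarded = []
    logged_recipients = []
    mail_from = None
    for line in commands:
        verb = line.split(":", 1)[0].strip().upper()
        if verb == "MAIL FROM":
            mail_from = line                      # remembered so it can be replayed
            forwarded.append(line)
        elif verb == "RCPT TO":
            # Log the intended victim; the real server still sees the command,
            # so the dialog looks normal to the spammer.
            logged_recipients.append(line.split(":", 1)[1].strip())
            forwarded.append(line)
        elif line.strip().upper() == "DATA":
            forwarded.append("RSET")              # discard the spammer's recipients
            if mail_from is not None:
                forwarded.append(mail_from)       # restart the transaction
            forwarded.append(f"RCPT TO:<abuse@{target_domain}>")
            forwarded.append("DATA")
        else:
            forwarded.append(line)
    return forwarded, logged_recipients
```

The injected commands and their replies would have to be hidden from the spammer, who should see only the replies to the commands he actually sent.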

The long-term ASRG plan may include a replacement or strengthening of the SMTP protocol. What ASRG does need not be limited to a single action. Protocol enhancement may be the key part of the ASRG plan, but the current intense level of spam calls for immediate action to substantially reduce the amount of spam that reaches users as a whole.

What follows was written in response to a message from Paul Judge. It starts by quoting what he suggested should be incorporated in a full honeypot proposal. It isn't all there, but most of it is. To me the most important point is the point made at the start of this posting: ASRG can take major action. Relay spam honeypots, as done, have been of the form of minor isolated action. If you broaden the concept to include action for an entire network segment the power of the approach is increased. While the description of honeypots is useful the power, to me, is in the idea of taking concerted action to end the abuses committed by spammers to send spam. Whether or not this results in a solution that looks like honeypots is unimportant. What matters is that spam be ended.

Paul's suggestions:
You should describe the benefits of R.H.
Describe how to deploy one from scratch
        -who should do it? Large corporations? Individuals? A single
non-profit?
Give some metrics.
        -how much spam will one see
        -how many are needed to make a difference
Discuss countermeasures
Discuss deployment issues

Put this in the form of a document that someone can pick up and read. It
should convince people to run relay honeypots and show them how to do so. It
should include all of the ideas that you have thrown around in the different
emails.


Background:

That spam comes by abuse pathways is well known. Spammer Alan Ralsky is reported to control 190 email servers, including 160 in the US. (http://www.freep.com/money/tech/mwend22_20021122.htm) Ralsky does not send direct spam - he must therefore send spam via an abuse pathway. Similarly, many other spammers also send abuse spam.

Historically the abuse spam was sent via open relays almost exclusively. Recently these have been supplemented by open proxies and by open proxy - open relay combinations. Open relays are still used, are still a problem. When open relay DNSBL services shut down because they have no function it will be known that open relay abuse is no longer a significant part of the spam problem. Similarly, when open proxy DNSBLs shut down for a similar reason open proxy abuse will no longer be a component of the spam problem. Neither of these shutdowns seems imminent.

That these spammers use abuse means two things: there is a constant flow of abuse packets from their servers and these packets are directed to systems vulnerable to abuse that the spammers have discovered. That they discover vulnerable systems means that they look for vulnerable systems. No doubt different spammers have different practices - some may seek abusable systems only overseas, some may seek them only in the US, some may do both. Little data has been collected to determine what these practices are in general or for particular spammers (or groups of spammers, if any.) Operators who set up relay spam honeypots are generally successful in detecting and delivering relay test messages. Many operators of email systems report frequent log entries for rejected relay messages. The evidence is that spammers appear to check for open relays essentially everywhere. If it is necessary to the spammers then the practice creates a vulnerability on the part of the spammers. The intent is to fully exploit that vulnerability.

Whether or not the abuse meets the federal standard for action the abuse is theft of service. It is appropriate to report such abuse to the ISP from which it originates, it is appropriate for the ISP to terminate service to the customer guilty of the abuse. Surprisingly, some ISPs seem completely unaware of the abuse and of its implications.

Honeypots, in this context, are systems set up to appear to the spammer to be vulnerable to abuse but not be vulnerable - some key part of the abuse is intercepted, usually delivery of spam. Functions of honeypots vary but include detection of the tests done by spammers to discover and verify abusable systems and delivering such tests so that the spammer deceives himself into believing the tested system is vulnerable. When the spammer so deceives himself he may send relay spam to the system, spam which the system makes sure is not delivered. The basic honeypot is a single system with a single IP address. Those in charge of larger aggregates (e.g., full network segments of /24 or larger size) may be able to create giant honeypots, in which port 25 (or proxy port) traffic directed to IPs that don't service the port is diverted to a master honeypot.
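The "giant honeypot" diversion rule can be illustrated with a small decision function. This is a sketch only: the function name, its parameters and the example addresses (taken from documentation ranges) are assumptions, and a real deployment would apply the rule in a router or firewall rather than in Python.

```python
import ipaddress


def honeypot_destination(dest_ip, dest_port, segment, mail_servers, honeypot_ip):
    """Return the IP that should actually receive a connection attempt.

    Port 25 traffic aimed at an address inside `segment` that runs no real
    mail service is diverted to the master honeypot; everything else is
    left alone. All names here are hypothetical.
    """
    inside = ipaddress.ip_address(dest_ip) in ipaddress.ip_network(segment)
    if dest_port == 25 and inside and dest_ip not in mail_servers:
        return honeypot_ip
    return dest_ip
```

For example, with segment 192.0.2.0/24, a real mail server at 192.0.2.10 and a master honeypot at 192.0.2.250, a port 25 connection to 192.0.2.17 would be handed to 192.0.2.250 while the real mail server is untouched.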

This document will emphasize open relay honeypots. Similar thinking can lead to design and implementation of open proxy honeypots.

An open relay honeypot can be, broadly speaking, one of two types. It can be a standard MTA configured or altered to make it a honeypot or it can be new code, specifically written to function as a honeypot. The design criteria of an ideal honeypot are simple: always look like an open relay to the spammer, never deliver spam. Additional criteria may be chosen that define the mode of operation of the honeypot. It might be designed to be fully automatic, for example, or might be partially automatic and partially manual. At the simplest level an open relay honeypot accepts and delivers the relay tests sent by spammers. As the honeypot is supposed to deliver no spam and as the act of delivering a spammer relay test usually leads to a spammer sending spam the honeypot needs to have a way to distinguish relay tests from spam. Several approaches have been used. In one approach each new message in the mail queue is examined automatically for relay-test-like character. The easiest mark of such character is the presence, in the message, of the IP address of the honeypot itself - the IP address of the open relay is the payload of the test message. This address can either be plaintext (standard dotted quad) or it can be encoded. A very frequently seen encoding is to re-express the IP address in its decimal ascii form and to put that before the "@" character in the message-ID. A third form of IP encoding is one in which the periods in the dotted quad representation are replaced by slashes and each digit of the representation is replaced by the next higher digit. Thus, 192.168.10.200 would become 2:3/279/21/311 (the : is what has been observed as the symbol for 9 + 1.) As these addresses always appear with the string MAILINF0 that string could be just as appropriate for recognizing relay tests, at least before spammers get clever.
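The three address forms can be checked mechanically. The sketch below is a minimal illustration, assuming that "decimal form" means the 32-bit address written as a single decimal integer; the function names are hypothetical, and plain substring matching stands in for whatever a real honeypot would do.

```python
import ipaddress


def shifted_encoding(ip):
    """Dots become slashes; each digit becomes the next ASCII character ('9' -> ':')."""
    return "".join("/" if ch == "." else chr(ord(ch) + 1) for ch in ip)


def decimal_encoding(ip):
    """The IPv4 address as one decimal integer (an assumed reading of 'decimal form')."""
    return str(int(ipaddress.IPv4Address(ip)))


def looks_like_relay_test(message_text, honeypot_ip):
    """True if the message carries the honeypot's own address in any known form."""
    markers = (
        honeypot_ip,                     # plaintext dotted quad
        decimal_encoding(honeypot_ip),   # often seen before the "@" in the Message-ID
        shifted_encoding(honeypot_ip),   # digit-shifted form
        "MAILINF0",                      # string observed alongside the shifted form
    )
    return any(marker in message_text for marker in markers)
```

A message in the queue that matches would be delivered as a relay test; anything else would be held as spam.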

A user deploys a honeypot by one of several methods. If the honeypot is based on a standard MTA the honeypot is installed on a compatible system that has no MTA but does have a network connection and an Ethernet card with an IP address. If there is no such system an older unit may perhaps be pressed into service. Successful honeypots have been based on a 100 MHz 486 DX4, on a 120 MHz Pentium, and on a VAXstation 4000/90. The honeypot need do nothing compute-intensive - it can be an older system of limited capability and still succeed. If the honeypot uses honeypot-specific software the system should be one compatible with the software, with nothing else using port 25, and with no vital function that is put at greater risk by the implementation of the honeypot. An example of such a program is Jackpot: http://jackpot.uk.net/

A very effective honeypot is one that is substituted for an open relay that is already being abused. If the open relay is on a system that has no real need for an MTA the existing MTA can be stopped and an appropriate honeypot installed and run. If the system has an email function then its IP number might be re-assigned and the honeypot put on a new system that is given the IP number taken from the old system. Regular email follows DNS - spammers generally use IP numbers to contact the systems they abuse, so substituting a different system for the one previously at a particular IP does work against the spammers.

A relay spam honeypot can be successfully run by anyone connected to a net segment the spammers test for open relays. This appears to include some or all IPs in the US, Great Britain, Korea, China, Taiwan, Denmark, Germany - anyplace the spammers look for open relays is appropriate for a honeypot. This means top-level ISPs can do it; it also means home users with DSL or cable connections can do it. It is important that honeypots be run by more than just anti-spammers. The real power of honeypots comes when they exist in large numbers - this requires that they be implemented beyond the anti-spam community. Honeypots can be simple enough that this is readily possible.

Honeypot users will see varying amounts of spam (if they deliver relay tests.) What they see depends on which spammer or spammers discover and decide to use the apparent open relay. Some spammers will hit a relay with a flood of as much spam as they can pump out. Others send metered amounts. At one time it appeared a volume of around fifty 20-recipient messages/day was typical for a honeypot (and an open relay) in the US (based on a less-than-representative sample of one.) Last February the Moscow honeypot started working and trapping orders of magnitude more spam. Another foreign honeypot received spam at high levels, with bursts of well over 1 million recipients/day for the trapped spam. (In the first year of operation that system stopped spam to 281 million recipients, with an average of less than 1 million recipients/day.) The trapped spam isn't all that is important - the relay tests are the problem and the key to the action of the honeypot. The real purpose of honeypots is to disrupt the ability of spammers to find open relays. A natural consequence is that most honeypots will also trap spam - that's because they disrupt the ability to find open relays by masquerading as open relays. If they don't accept spam the masquerade isn't very good. Nonetheless the goal is to make discovering open relays so difficult that the spammers give up trying. That will lead to the quicker end of relay spam. (Similarly for open proxies.)

While it is a secondary function the spam-stopping power of individual honeypots is important, does make a small difference. Compared to the total daily spam volume any small number of honeypots has little real impact. That partly works to the advantage of current honeypot operators: if they make no real difference the spammers won't notice that honeypots exist. Stopping spam is a side-effect of the real power of honeypots. RFC 2505 says that securing open relays is not an approach to ending spam. The reason is that spammers will continue to discover open relays, so that even a 95% success in securing open relays won't stop spam. The key word is discover: the problem is that spammers can discover open relays. Anyone can: try to relay an email message through a million IPs and you'll find some that will. Spammers work the same way: they look for IPs that will relay. When the only relay-level anti-spam countermeasure is to secure open relays the spammers see a simple division of systems on the internet: those that don't deliver their test messages and aren't open relays, those that do deliver their test messages and are open relays. It's very simple for the spammers: if a system delivers a test message it is an open relay with near 100% certainty. That's what makes detection of open relays simple for spammers. The real purpose of honeypots is to disrupt the detection of open relays. It can be regarded this way: as long as anti-spammers can build a good list of open relays spammers can do the same. In a way the spammers and the anti-spammers work cooperatively to build a good list of open relays. Spammers can and have used anti-spammer relay lists to find usable open relays to send spam. Anti-spammers add to their lists any open relays discovered by the spammers that become known through relay spam reports.

To really disrupt the spammers may take a number of honeypots equal to or greater than the number of open relays. If the numbers were equal then the honeypots would be expected to be receiving roughly half the spam, a 50% cut in spam volume. At that level the spammers need only double their output to keep the same delivery level. (Someday the headroom for increasing volume will run out - then nothing the spammers can do will keep up the volume.) This analysis neglects the complaints that many honeypot operators can send - complaints about attempted theft of service, complaints about relay test messages. These complaints multiply the effectiveness of honeypots since they help disrupt the entire spamming operation.
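The arithmetic behind the 50% figure can be written out. This is a rough model only, assuming spammers cannot tell honeypots from open relays and spread their spam evenly across everything that passes their tests; the function names are hypothetical.

```python
def trapped_fraction(honeypots, open_relays):
    """Share of relay spam that lands in honeypots under the even-spread assumption."""
    return honeypots / (honeypots + open_relays)


def output_multiplier(honeypots, open_relays):
    """Factor by which a spammer must raise output to keep the same delivered volume."""
    return (honeypots + open_relays) / open_relays
```

With equal numbers (say 1000 of each) the trapped fraction is 0.5 and the multiplier is 2.0, the doubling mentioned above. With three honeypots for every open relay the fraction rises to 0.75 and the spammer must send four times as much.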

While this description is written in terms of open relay honeypots the advantage may be with open proxy honeypots: these days an open proxy honeypot is more likely to receive a direct connection from a spammer (making it possible to track the abuse to its source.) The design of at least one kind of open proxy honeypot is simple: intercept all port 25 traffic for other IPs and direct it to an SMTP honeypot (integral to the open proxy honeypot or external - whatever works.) Other designs are possible. The important considerations are, as with open relay honeypots, that the honeypot deceive the spammer and that the honeypot not deliver spam.

The prime countermeasure a spammer can take is to stop sending spam to the honeypot, once he discovers it is a honeypot. In other words, the spammer doesn't send spam. That's the overall goal, for all destination IPs (0.0.0.0/0) - the spammer is doing the right thing if he stops sending spam. The real issue would appear to be that of how easily the spammer can discover the honeypot. If the goal is to make discovery of true open relays too difficult for the spammer to tolerate then if honeypots are used they have to be difficult to detect, have to look very much like true open relays.

An isolated honeypot should be easily detected, if the spammer tries. He need only send some spam addressed to his own dropbox and use the fact of non-delivery to establish that an IP is a honeypot (other detection schemes are possible.) Creating a situation in which it is necessary for the spammer to detect honeypots already has made open relay detection more expensive to the spammer, more difficult. The goal is to increase the difficulty until the spammer gives up. The original deception that got the spammer to send spam was to deliver his test message. The same philosophy holds for later test messages, including spam addressed to the spammer's own dropbox: if you deliver them the spammer is fooled. (In the past some spammers have sent their standard test message simultaneously with a spam run to discover if the relay remains open. That's trivially easy to handle.) If the spammer sends spam to his own address then he probably will use that same address multiple times. The honeypot counter-countermeasure may be to deposit all spammed addresses in a central database, shared by a consortium of honeypot operators. If the spammer uses a test address with any frequency that address will receive proportionately more spam than do the ordinary addresses. Once a test address is identified honeypots still working can simply deliver any spam that comes for that address, fooling the spammer.
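The shared-database idea can be sketched as a frequency count over the recipient lists that cooperating honeypots report. This is a minimal illustration, not a description of any existing consortium: the function name, the median baseline and the min_ratio threshold are all assumptions chosen for the example.

```python
from collections import Counter


def likely_test_addresses(trapped_recipient_lists, min_ratio=10.0):
    """Return addresses that recur far more often than a typical victim address.

    trapped_recipient_lists: one list of recipient addresses per trapped spam run,
    pooled from all participating honeypots.
    An address is flagged when its count is at least `min_ratio` times the
    median count (an arbitrary illustrative threshold).
    """
    counts = Counter()
    for recipients in trapped_recipient_lists:
        counts.update(address.lower() for address in recipients)
    if not counts:
        return []
    ordered = sorted(counts.values())
    median = ordered[len(ordered) // 2]
    return sorted(addr for addr, n in counts.items() if n >= min_ratio * median)
```

Honeypots that are still undetected could then deliver mail for the flagged addresses and continue to discard everything else.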

Lower-level honeypots can also be run - ones that only trap relay tests. These, too, will be detectable by the spammer. If the operator believes his honeypot has been detected by the spammer and if he has another IP number available he can simply change the IP number and continue. Eventually the spammer will discover that IP, and so on. The goal would be for the spammer to abandon consideration of the entire network segment as having potential abusable open relays. Note that if the spammer regards acceptance of a test without subsequent delivery of the test as evidence of a honeypot it is very beneficial for operators of free email services used by the spammers for their dropboxes to divert email messages to those dropboxes but to leave the accounts active. This looks the same to the spammer as does a failure of the tested IP to deliver. The result could be that the spammer will conclude that an actual open relay is a honeypot because the test message was accepted but the spammer never received it (just as for the honeypot that accepts but doesn't deliver.) The goal, always, is to disrupt the spammers' ability to test accurately.

                                    -o-

Where do honeypots fit in the draft taxonomy (<http://www1.ietf.org/mail-archive/working-groups/asrg/current/msg01794.html>)?

In 1. a), as v) Intercept spam

In 1. a), as vi)  Destroy ability to find open relays (open proxies)

In 1. b), under ii, Tracking.

In 2. as c)  If it's trapped by a honeypot and isn't a relay test: it's spam.

In 3) h) Feedback. Honeypots may discover large numbers of open proxy IPs, may discover spammer IPs.

In 3) k) Teergrubing is possible with a honeypot: Jackpot implements it. Some hackback potential exists: the honeypot may be the spammer's first contact point in the spam chain. The ethics of causing an open proxy crash by hackback could be debated - there's some justification for that action.

In 3), as l) Raucous laughter. You can become very amused at a spammer sending copious amounts of spam into a black hole. It's even more amusing if the spammer uses some address "trick," to exploit an address-form vulnerability.

Drat, I forgot. Part of simulating an open relay is simulating a bumbling operator. A honeypot might simulate rejection of standard email addresses but simulate vulnerability to one of the other forms of address spammers sometimes use. Similarly, a honeypot can have periods of down time and periods in which it responds with a "disk full" message. I've sometimes totally blocked a spam-source IP: a real bumbling operator might do that. The spammer need only change IP to get back in. Golly, he got me again. :-)

Where do honeypots fit in the list of requirements (<http://www1.ietf.org/mail-archive/working-groups/asrg/current/msg01721.html>)?

1. They do reduce the level of unwanted messages. This would more or less be a linear function of the number of active and successful honeypots.

2. They have nearly zero effect on all valid messages (only extremely unlikely situations lead to any disruption at all of valid email messages.) The honeypot might contend for bandwidth with a local server. That should be the full extent of the interference, unless some hotshot rogue blocklist somehow learns the honeypot IP and does an expanded listing around it. No such blocklist operator is known to exist.

3. The honeypot can be easy to use. For Jackpot you download Jackpot, download a JVM, and run Jackpot. You can configure Jackpot permanently, in the jackpot.properties file, or you can make configuration changes using Jackpot's built-in web interface. The web interface also allows examination of trapped tests and spam.

4. The honeypot may be easy to deploy. Jackpot is easy, as described. A sendmail honeypot requires some special configuration. That's easy, in sendmail terms. Other honeypots could be implemented to be easy. Honeypots could be distributed as an optional package to use with anti-virus software or as an optional package with hardware or software firewalls. Honeypots could be made part of the standard distribution of operating systems (e.g., Linux.)

5. Honeypots do not depend on universal deployment to be effective - many of their features exist and have power at the single-implementation level. To disable the ability to find open relays they will have to be widely deployed. It will probably help tremendously if there are many versions and flavors of honeypot. Much of the power of honeypots depends on spammers continuing to search for open relays. If spammers decide to stop searching for open relays and use just the ones they know then the war will become one of attrition, as existing open relays are secured (with the best method of securing often being conversion to a honeypot.)

6. Privacy. The operator of a honeypot that is successfully trapping spam learns the email addresses of the intended victims.

7. Administration and implementation overhead. More than one level of honeypot can be deployed. Simple honeypots have minimal requirements. (If honeypots become plentiful then a honeypot that simply accepted relay email messages and discarded them would be effective. The spammers have no way of knowing which of such systems are the ones that lead to complaints to the spammers' ISPs. They have to worry just as much about such a low-function relay test message trap as they do about a honeypot run by the most active anti-spammer.) Activities based on data collected by the honeypot may take time. Jackpot honeypots can report their activity to a central web server. Sendmail honeypots could be very simple installations, not honeypots at all, that simply smarthost all email to a central master honeypot. (That may carry a somewhat greater risk of spammer discovery but is possible.)

8. Bandwidth is whatever the spammer consumes, as limited by the honeypot or by other means. Computational overhead is negligible - it's about like a mail server.

9. Robustness. Some forms of honeypot are detectable by easy means. The spammer's response is to stop testing that IP for being an open relay.

10. Legal issues. I'd probably die laughing if a spammer sued me for not letting him steal my service to send his spam. If used at all to trigger law enforcement there might be an entrapment defense. Just hooking a system to the internet is hard to see as being entrapment.

Who can use these ideas to fight spam?

Anyone who has contact with a resource the spammer uses or abuses in his activity. The spammer's ISP can detect testing for open relays and take appropriate measures. The ISP of the systems being tested can take appropriate measures. The operator of tested systems can take appropriate measures. The operator of a freemail service used by the spammer for his dropbox can take appropriate measures.


Conclusion: This could be a better and more formal proposal but as it is it should serve to provide material for consideration and discussion - that's the main goal. There is a long history of people finding fault with honeypots while real honeypots were operating and having significant effect. The proposal is more about fighting spam by fighting the abuse committed by spammers to send spam. The spammers are vulnerable to such a defense. Such a defense, if pressed vigorously, should greatly affect the ability of spammers to continue operating. That's the goal. Honeypots are already working - their number could be increased at any time to increase their overall impact (there's no significant delay connected with their adoption - no RFC process that need be followed.) Spam is a huge problem today; honeypots work today. They need to receive careful consideration as a major component of the battle against spam.


