[Asrg] Brad Templeton's C/R Guidelines

Here is a list of C/R guidelines compiled by Brad Templeton, who wrote one of the early C/R systems (from http://www.templetons.com/brad/spam/challengeresponse.html). To summarize:


o Never challenge any mail that's a reply to a private message you sent.
o Avoid challenging replies to public messages
o Use multiple addresses
o Never challenge mailing list mail
o Never challenge a challenge!
o Make the "From" on your challenge match the address mailed to
o Put an In-reply-to header on your challenge
o Include the subject of the original message in the challenge
o Present a regular summary of all blocked mail
o Make the challenge as easy as you can make workable.
o Don't force users to re-send mail
o Detect all attempts to subscribe to mailing lists
o Detect mailing lists subscribed to in the user's mail archives
o Detect patterns of possible incoming mailing lists
o Think about anonymous E-mail


-----------snip-------------
Proper principles for Challenge/Response anti-spam systems

Back in 1997 I wrote what is probably the first of the challenge/response (C/R) spam-blocking systems. These are systems that, when they see an E-mail from somebody you've never corresponded with before, hold the mail and e-mail back a "challenge" to confirm that the person is a real sender and not a mailing robot, in particular a spammer. The other person gets the challenge, and responds to it in some way. If they do this properly, your system releases the mail that was held, and from then on they can mail without challenge.

There are a number of these systems springing up -- it's a very effective system and a fairly obvious idea -- but not everybody is doing it right, so I thought I would lay out some "best practices" based on my 6 years of experience. I don't even do all of these things, because I wrote my system before they became necessary, but if I were writing a new version, I would.

Never challenge any mail that's a reply to a private message you sent.

If you send somebody private mail (from any address you have), and they reply to you with any mailer, you should accept their mail and not send them a challenge. This is true even if they reply from a different address than you sent the mail to. Many people have mail aliases, and receive mail on one address and send on another. Some people use other anti-spam systems that generate new addresses every time they mail.

What this means is that simply whitelisting all addresses you mail to is not enough, though it is of course an important thing to do.

One of the easiest ways to do this, by the way, is to have multiple addresses yourself. Send out private mail with an address that does not do challenges: an old-fashioned unfiltered E-mail box (though you may want to note the addresses on incoming mail to whitelist them). However, you must be sure this address won't get out to spammers, or you will have to switch it to another. (You must be prepared to do this.)

In general, you should probably put an un-challenged address on business cards too. Save filtered addresses for public use: postings to mailing lists, listings on web pages, listings in conference directories, etc.

Avoid challenging replies to public messages

If you can do it, avoid challenging replies to your public messages to mailing lists and newsgroups. With private mailing lists (not archived in public) you can of course accept any replies with reasonable safety based on subject line and in-reply-to. With public postings, consider accepting replies unchallenged for a few days to weeks after postings, then add a challenge for late replies which are more likely to be spammers.
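
The time-limited acceptance of replies to public postings could be sketched as follows. This is only an illustration: the function name and the 14-day window are assumptions, chosen from the "few days to weeks" range suggested above.

```python
from datetime import datetime, timedelta

# Assumed grace period; the text only says "a few days to weeks".
REPLY_WINDOW = timedelta(days=14)

def accept_reply_unchallenged(posted_at, received_at, window=REPLY_WINDOW):
    """Return True if a reply to a public posting arrives inside the window."""
    age = received_at - posted_at
    return timedelta(0) <= age <= window

posted = datetime(2003, 5, 1)
print(accept_reply_unchallenged(posted, posted + timedelta(days=3)))   # early reply
print(accept_reply_unchallenged(posted, posted + timedelta(days=30)))  # late reply
```

Late replies are not rejected here; they simply fall back to the normal challenge path.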

Use multiple addresses

Any good spam filtering system will support giving the user multiple aliases under which to receive mail. This has two functions. One, you can filter some aliases more than others. For example, you might have "public" addresses used in newsgroup postings and on a web site, and private addresses used only in mails to private parties, replies etc. You would use less filtering, and perhaps no challenge/response, on private addresses.

It's also handy to provide a gamut of addresses to use so that you can use a different address every time you give out an address. For example, if entering data on a web page that asks for your E-mail address, use a different one each time. That way if any address gets on spammers' lists, you can delete it or give it very high spam filtering with minimal risk to mail from others.

The best plan is to have your own subdomain for mail, allowing an infinite space of addresses. However, if that is not available to you, sendmail treats mail to "userid+anything" as mail to the given userid. For example, if a sendmail user has the address fbaggins@shire.org, then fbaggins+ring@shire.org and fbaggins+bagend@shire.org and all other such addresses will be delivered to the main address. Qmail has a similar system, using a dash instead of a plus. That's better, since unfortunately there are huge numbers of badly coded web forms that, because they map "+" to a space, don't accept fully legal e-mail addresses with a plus in them.

The personal domain is also best because spammers can easily guess the root address on a plus-sign based address. If you use this, you must filter the base address, and have unfiltered addresses use the plus.
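
As a rough sketch of the plus-address policy just described, the base address is always filtered and only known tags are exempt. The function names and the set of trusted tags are hypothetical, not part of any mailer.

```python
def split_subaddress(address, separator="+"):
    """Split 'user+tag@domain' into (base_address, tag); tag is None if absent."""
    local, _, domain = address.rpartition("@")
    user, sep, tag = local.partition(separator)
    return f"{user}@{domain}", (tag if sep else None)

def is_unfiltered(address, trusted_tags, separator="+"):
    """Only tagged addresses with a known tag skip filtering.

    The bare base address is always filtered, since spammers can guess it.
    """
    _base, tag = split_subaddress(address, separator)
    return tag is not None and tag in trusted_tags

print(split_subaddress("fbaggins+ring@shire.org"))
print(is_unfiltered("fbaggins@shire.org", {"ring", "bagend"}))
```

Passing separator="-" covers the dash convention mentioned above for qmail.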

Some systems generate a new address for every mail sent, using a special random string in the address itself. Some use a cryptographically secure hash to generate the string so they can immediately identify any address they generated without having to remember them.
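
One way such self-verifying addresses might work is with a keyed hash (HMAC). This is a sketch under assumptions: the address layout "user+label.tag@domain", the 10-character tag length and the secret are all made up for illustration and do not describe any particular product.

```python
import hashlib
import hmac

SECRET = b"replace-with-a-private-key"  # hypothetical per-user secret

def _tag(label, secret):
    """Keyed hash of the label, truncated to keep the address short."""
    return hmac.new(secret, label.encode(), hashlib.sha256).hexdigest()[:10]

def make_address(user, domain, label, secret=SECRET):
    """Build an address whose tag can later be re-derived from the label."""
    return f"{user}+{label}.{_tag(label, secret)}@{domain}"

def is_own_address(address, secret=SECRET):
    """Check the tag without storing a list of generated addresses."""
    local, _, _domain = address.rpartition("@")
    _user, sep, ext = local.partition("+")
    label, dot, tag = ext.rpartition(".")
    if not sep or not dot:
        return False
    return hmac.compare_digest(tag.encode(), _tag(label, secret).encode())

addr = make_address("fbaggins", "shire.org", "n42")
print(addr, is_own_address(addr))
```

Because the tag depends on a secret, a spammer who sees one generated address cannot mint another that verifies.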

Be aware, however, that in generating many addresses, you may mesh badly with other whitelist systems expecting your mail to come from the same address. One option is to use the base address in the "From:" and put any generated address, especially an unfiltered one, in the Reply-to. Beware that there are mailers that botch Reply-to out there.

Never challenge mailing list mail

For decades, all good mail responders have known not to respond to mailing list mail. An unofficial standard has indicated that bulk mail of various forms would have a header like "Precedence: bulk" or sometimes "Precedence: list" to mark it as bulk. "Precedence: junk" is rarely used, for it would declare things to be spam!

You can also test to see that none of the addresses in the "To" and "CC" lines is an address for the person getting the mail, though that does present a maintenance problem since there is no automatic way to know all those addresses. However, you definitely should not challenge any mail with the above precedence headers.
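
Both checks can be sketched with Python's standard email package. The helper names and the set of the user's own addresses are assumptions for illustration.

```python
from email import message_from_string
from email.utils import getaddresses

BULK_PRECEDENCE = {"bulk", "list", "junk"}

def has_bulk_precedence(msg):
    """True if the message carries one of the bulk Precedence values."""
    value = (msg.get("Precedence") or "").strip().lower()
    return value in BULK_PRECEDENCE

def addressed_to_user(msg, my_addresses):
    """True if any To/Cc address is one of the user's known addresses."""
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    return any(addr.lower() in my_addresses for _name, addr in getaddresses(headers))

raw = "From: owner@lists.example\nTo: hobbits@lists.example\nPrecedence: bulk\n\nHello\n"
msg = message_from_string(raw)
print(has_bulk_precedence(msg), addressed_to_user(msg, {"fbaggins@shire.org"}))
```

A message for which has_bulk_precedence is true should never be challenged; the To/Cc test is a weaker hint, since the set of the user's addresses has to be maintained by hand.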

Who to challenge?

There are three possible addresses you can challenge. They are the "Envelope From", the regular "From:" and the address in a "Reply-to:" header.

The strongest case is for challenging the Envelope From, which is the address you would send bounce errors to. The "From:" is the person who wrote the message (and thus in most cases, though not all, the person you are trying to confirm is a human being). The "Reply-to" is the address to which the sender expected actual replies to go.


Unfortunately, you definitely should not challenge more than one of these.

A challenge is similar to a bounce error, but unfortunately in many cases the Envelope From is not handled by a human -- it was in fact designed not to be handled by a human. Most such cases are list mail, which you should not be challenging at all. In the case of list mail, the Envelope-From always identifies the list manager itself, not the particular poster to the mailing list. Sometimes it is a unique address, so that programs can automate detecting bounces without having to parse them to try to figure out what mail bounced.

The From is often the actual person who posted to the mailing list, or the real sender of a person to person mail. Some lists have all mail come "from" the list manager, however. Some lists have the list address be in the Reply-to.

You must not challenge individual mailers to a list, so only challenge the From or Reply-to when you are sure it is not list mail. If you challenge individual mailers you'll get bounced off the list very quickly.

The answer here thus depends on how good your detection of list mail is. If it's reliable, you may decide to challenge the From or Reply-to, since that is more assured to be a human. On the other hand, challenging the Envelope-From has many merits. The worst case is that it's not a human (or it's a list that is not tagging list mail as such) and this mail will appear in the digest, hopefully near the top.
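
The decision described in this section might be reduced to something like the sketch below. The function and parameter names are hypothetical, and preferring Reply-to over From when both are present is an assumption; the text leaves that choice open.

```python
def choose_challenge_target(envelope_from, from_addr, reply_to,
                            is_list_mail, detection_is_reliable):
    """Pick at most one address to challenge, or None to send no challenge."""
    if is_list_mail:
        return None  # never challenge list mail
    if detection_is_reliable:
        # Assumed preference: Reply-to first, then From.
        return reply_to or from_addr
    # Fall back to the bounce address; an empty one means nothing to challenge.
    return envelope_from or None

print(choose_challenge_target("bounce@lists.example", "sam@example.org",
                              None, False, True))
```

Only one address is ever returned, in line with the rule not to challenge more than one of the three.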

Never challenge a challenge!

The other person might have a C/R program or a whitelist.

Make the "From" on your challenge match the address mailed to

When they send out their mail they will have whitelisted the address they sent to, so any challenge from that address should get through.

Put an In-reply-to header on your challenge

The challenge should refer to the message-id of the mail being challenged. A good whitelist program should remember the message-id of every mail the user sends out, and every challenge sent out. If a challenge comes back with an in-reply-to, you can identify it as a valid challenge. In the end, this may become the main technique, once spammers try to guess the names of your friends and send spam disguised as challenges. They can't fake this message-id.

The other reason to record the outgoing message-id is to be sure you never challenge anybody replying to mail you sent out. If mail has an in-reply-to that matches an outgoing message-id of a private mail of yours, you let it in.
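
A minimal sketch of such a message-id log follows, assuming an in-memory store. The class name and the two result labels are hypothetical; a real system would persist the ids and expire old ones.

```python
import re

MESSAGE_ID = re.compile(r"<[^<>\s]+>")

class OutgoingLog:
    """Remembers message-ids of mail the user sent and challenges we issued."""

    def __init__(self):
        self.user_ids = set()
        self.challenge_ids = set()

    def record(self, message_id, is_challenge=False):
        target = self.challenge_ids if is_challenge else self.user_ids
        target.add(message_id)

    def classify(self, in_reply_to):
        """Match an incoming In-Reply-To header value against the log."""
        ids = set(MESSAGE_ID.findall(in_reply_to or ""))
        if ids & self.user_ids:
            return "reply-to-user-mail"      # let it in, do not challenge
        if ids & self.challenge_ids:
            return "reply-to-our-challenge"  # candidate challenge response
        return None

log = OutgoingLog()
log.record("<a1@shire.org>")
print(log.classify("<a1@shire.org>"))
```

An incoming challenge from someone else's C/R system falls under the first case, since its In-Reply-To points at a message the user actually sent.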

Include the subject of the original message in the challenge

C/R programs should also log outgoing subjects, so that they can detect replies (and challenges) to the user's messages.

Present a regular summary of all blocked mail

No system is perfect, so the system must present a summary, at some reasonable interval, of mail that was blocked by the system. This would include mailing list mail that was unchallenged, and mail whose challenge was never answered.

This should be presented as a summary digest, which allows a quick scan of all these messages. The summary should show a minimal set of relevant headers (From, To, Subject, CC etc.) and a few lines from the body. It should also show a "spam score" calculated for the message, and the digest should be sorted by spam-score, so the lowest scores appear at the top.
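
The ordering rule can be illustrated with a small sketch. The HeldMessage fields and the text layout are assumptions; the spam score is taken as given from whatever scoring system is in use.

```python
from dataclasses import dataclass

@dataclass
class HeldMessage:
    sender: str
    subject: str
    spam_score: float
    preview: str

def build_digest(held, preview_chars=60):
    """Render held mail as text, lowest spam score (likeliest real mail) first."""
    lines = []
    for m in sorted(held, key=lambda m: m.spam_score):
        lines.append(f"[{m.spam_score:5.1f}] {m.sender}: {m.subject}")
        lines.append(f"        {m.preview[:preview_chars]}")
    return "\n".join(lines)

held = [
    HeldMessage("pills@spam.example", "Cheap pills", 9.5, "Buy now"),
    HeldMessage("gandalf@example.org", "Re: your question", 0.5, "I think the answer is"),
]
print(build_digest(held))
```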

With each message in the digest, the user should be able to select the message to define what to do with it, including delivering it, whitelisting the sender, whitelisting the mailing list it came from, and combinations. It can also offer options like blacklisting the sender, tuning the spam-score, and reporting the spam to collaborative filters.

Any existing spam scoring system can be used. The fact that the challenged address did not exist or the mail to it bounced may give a high spam-score, but one should be wary of the effect of this on anonymous mail.

The summary can be e-mailed every so often (once a day typically, or less frequently for people who read mail less frequently) or a web option should be available to see the latest summary. Normally messages would not appear in the summary until they have had some period of time to get a response to the challenge -- typically a daily digest will have the prior day's messages in it.

This step is vital. If this is not done, users will miss mail for mailing lists they joined, mail from people who decide not to answer challenges, and mail from people whose mail software is incompatible with the challenge.

Understand mail/postings to public vs. private addresses

As noted, the best practice is to use an address that does not have C/R on mail to private parties. It is important however to use a C/R filtered address if the mail/posting will go out in public. This includes all newsgroup postings, and any mail to mailing lists which have public archives. An ideal system would modify outgoing mail, using a non-filtered address on private mail, and a public address on mail that may be exposed in public.

Make the challenge as easy as you can make workable.

Spammers are not currently trying to fake responses to spam challenges, but they will. Until they do, asking for any reply at all actually works well as a challenge. Once they do, challenges must require some special action from the responder, something to prove they are human. Even so, try to make it as easy as possible, and provide several means of responding to the challenge.

For example, send your challenge as a multipart/alternative with plain text and HTML. In both, include a link the user can click on to make their response via a browser. However, since many people read mail offline or without a browser handy, always allow the response to come in E-mail.

Don't require the user to be online to see the challenge, i.e. don't use inlined image files unless absolutely necessary.

While the challenge must come "From:" the address that was mailed, it can have a Reply-to that sends the response to a specific handler with a unique address that lets you know what challenge is being answered. Since some users will not deal properly with the Reply-to, it is advised you also detect responses at the address which was in the From: of the challenge. In your challenge, put a magic token in the Subject line, Message-id and body, and if that token appears in any part of the response -- Subject, In-Reply-To or body, you will be able to identify the response, no matter what address it comes from.
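
The magic-token matching could look roughly like this. The "CR-" prefix, the 16-hex-digit token format and the function names are assumptions made for the sketch.

```python
import re
import secrets

TOKEN_RE = re.compile(r"CR-([0-9a-f]{16})", re.IGNORECASE)

def new_token():
    """Random token to embed in the challenge Subject, Message-id and body."""
    return secrets.token_hex(8)  # 16 hex characters

def challenge_message_id(token, domain):
    """Message-id carrying the token, so In-Reply-To will echo it back."""
    return f"<CR-{token}@{domain}>"

def find_token(subject, in_reply_to, body, pending):
    """Return the first pending token found anywhere in the response."""
    for field in (subject, in_reply_to, body):
        for candidate in TOKEN_RE.findall(field or ""):
            if candidate.lower() in pending:
                return candidate.lower()
    return None

token = new_token()
print(find_token(f"Re: please confirm CR-{token}", "", "", {token}) == token)
```

Searching all three fields means a response is recognized even when the responder's mailer drops In-Reply-To or the user retypes the subject.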

If you ask the user to answer a question, be as forgiving as possible in finding it in the body or subject of the response. If the user makes a bad response, give them an error to know their mail is not yet delivered.

Don't force users to re-send mail

Some challenges indicate the original mail was not delivered, and ask the user to send it again. Users will balk at this, and if they felt they were doing the recipient a favour (such as answering a question they asked in a public forum) they often will not bother to jump through any hoops to respond to challenges or re-send mail. You must make it as easy as possible.

Detect all attempts to subscribe to mailing lists

Watch outgoing mail and look for any attempts by the user to subscribe to a mailing list. This includes mail to "-subscribe" or "-request" addresses especially with "subscribe" in the subject or at the start of a line in the body. Try to understand the subscribe requests of most major mailing list systems, such as majordomo, listserv, topica, yahoo egroups, etc.

When the user subscribes to a list, you need to identify the list and whitelist it.
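
A rough heuristic for spotting subscription attempts in outgoing mail is sketched below. The patterns and the two manager addresses are assumptions covering only the cases named above; real list managers vary, so treat this as a starting point.

```python
import re

SUBSCRIBE_WORD = re.compile(r"\bsubscribe\b", re.IGNORECASE)
SUBSCRIBE_LINE = re.compile(r"^[ \t]*subscribe\b", re.IGNORECASE | re.MULTILINE)
# Assumed command addresses for two of the list managers named above.
MANAGER_LOCAL_PARTS = {"majordomo", "listserv"}

def looks_like_subscription(to_address, subject, body):
    """Guess whether an outgoing message is a request to join a mailing list."""
    local = to_address.partition("@")[0].lower()
    if local.endswith("-subscribe"):
        return True
    mentions = bool(SUBSCRIBE_WORD.search(subject) or SUBSCRIBE_LINE.search(body))
    if local.endswith("-request") or local in MANAGER_LOCAL_PARTS:
        return mentions
    return False

print(looks_like_subscription("majordomo@lists.example", "", "subscribe hobbits\n"))
```

The word-boundary match keeps "unsubscribe" from counting as a subscription request.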

You can subscribe to lists via the web, though many then do a 2nd confirmation of the subscribe -- usually also by web -- which you may be able to look for. You must also avoid challenging these confirmations, even though they will not come with a Precedence bulk. In some cases users may have to avoid signing up for lists via the web without telling the C/R system.

Detect mailing lists subscribed to in the user's mail archives

Most C/R systems do a pre-scan of the user's archived mail folders, outgoing and incoming, as well as address books, to whitelist all proper correspondents in advance. Detect the presence of mailing lists in these archives to whitelist them in advance. You can't challenge mailing list mail so this is important. You will need to extract the Envelope From, as opposed to the "From:" header, in many cases, to properly spot mailing lists. Of course, you must avoid scanning spam to avoid whitelisting it.

Detect patterns of possible incoming mailing lists

Fortunately most spammers don't actually maintain real mailing lists that send multiple mailings to a user with the same Envelope From, and they don't use Precedence headers. You should, however, look for patterns in these headers on incoming list mail. (List mail to be identified by Precedence header and lack of the user's address in To/Cc headers.)

For example, if you get a sudden surge of messages, all with the same Envelope-From for the target user, this may be a mailing list the user has subscribed to. This is especially true if the messages have low spam scores.
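
Spotting such a surge might be done as in this sketch. The thresholds (five messages, spam score at most 2.0) and the input shape are assumptions; suitable values depend on the scoring system in use.

```python
from collections import Counter

def candidate_lists(held, min_messages=5, max_spam_score=2.0):
    """Return Envelope-From addresses that look like unrecognized lists.

    `held` is an iterable of (envelope_from, spam_score) pairs for held mail.
    Only low-scoring messages are counted.
    """
    counts = Counter(env for env, score in held if score <= max_spam_score)
    return [env for env, n in counts.most_common() if n >= min_messages]

held = [("hobbits-bounces@lists.example", 0.4)] * 6 + [("x@spam.example", 9.0)] * 10
print(candidate_lists(held))
```

The result is only a list of candidates to show the user; nothing is whitelisted automatically.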

In this event, consider placing a special note at the top of the digest summary, or in a special message, saying something like, "You have recently received 6 mailing list messages from a list identifying itself as XYZ" and provide a means to say they wish to whitelist the list or perhaps blacklist it. If they whitelist it, deliver the mail. Give them a way to examine the potential list mail.

This is needed because you won't catch every mailing list subscription they make, especially since in many cases you can subscribe to lists via the web.

Be warned however, that some mailing list managers put magic tokens in the envelope-from, to more easily track bounces. However many popular list managers also put in special "list" headers that help you identify the list. This includes headers like List-ID, and a "Sender" header.

Think about anonymous E-mail

Anonymous E-mail is still a useful thing. In part, you allow it by providing the daily digest of mail whose challenge went unanswered, with low spam scores coming first. Of course two-way remailers let you send a challenge and get a response by E-mail. If you insist on response by web you make it a little harder. Offering both lets the anonymous mailer select the best way to protect her identity.

Other systems (e-stamps etc.) which may not work on their own can have application to allow anon mailers to get through C/R systems.

Spammers may try to fake the things you detect

Spammers will eventually try to fake out all things you look for in order to avoid challenging or filtering e-mail. However, they will not do this right away. Since all things you do that make it harder for mail to get in will increase your risk of blocking desired mail, don't apply any stricter test until it actually becomes necessary.


Among the tests I have listed here, risks exist in the following areas.

o Spammers will eventually try to guess what mailing lists you are on, or what correspondents you have whitelisted, and they will forge mail to appear like that. This is especially true with any publicly archived mailing list you post to. Lists will eventually need digital signatures if this attack becomes common.
o If you allow replies to your messages to come in based on subject, then spammers will form replies to your public messages. To avoid this, you may wish to allow unchallenged replies only for a limited time on public messages.
o Try to be liberal at first, and only close down when spammers abuse the liberty. Don't try to prevent something that's not yet happening if it has a risk of blocking legitimate mail.

C/R may, over time, lose its utility if most spammers try to target it directly. However, it still has several years of life. It can also be combined with other techniques. For example, if you have a good spam filter, you might decide to challenge only messages with high spam scores or other reasons to suspect they are spam, and let through other mail.

-----------snip-------------

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg