Re: Suggested changes to address Cullen's DISCUSS on draft-ietf-sieve-3028bis-12.txt

Ned, this is a very helpful email on moving things forward - thanks. If you get to the bottom of this email, I would be fine with a very minor variant of what you are suggesting, and I think it meets the use cases you brought up. Glad to discuss changes if it does not resolve others that were not brought up.


On Aug 25, 2007, at 3:59 PM, Ned Freed wrote:

inline ...
On Aug 25, 2007, at 9:37 AM, Alexey Melnikov wrote:
> Cullen Jennings wrote:
>
> > I don't think that is really implementable advice - it basically
> > wishes the problem away.
What advice are you talking about here? I see nothing in the revised text that
would argue for the sort of analysis of sieve scripts you're (quite correctly)
saying is impossible.


Uh .. the text that says
   It is equally important that implementations sanity-check the user's
   scripts, and not allow users to create on-demand mailbombs.  For
   instance, an implementation that allows a user to redirect a message
   multiple times might also allow a user to create a mailbomb triggered
   by mail from a specific user.

but ignore this - let's move on to how to fix it instead of why I thought the draft suggested something impossible.

> It is implementable, it applies to UIs and admin tools and many
> implementations already follow it.
So the topic I was questioning the implementability of is a program
that can look at a series of sieve scripts across one or more email
servers and decide if the sieve scripts are capable of creating large
scale message amplification attacks.
Sounds to me like you and Alexey are talking at cross purposes here, but in
regards to the question of whether what you're talking about is implementable:
It isn't, although not for the reasons you give.
The main reason it is unimplementable is very simple: Email is an Internet-wide
service and no program can possibly have the administrative access to the entire
Internet it would need to perform such a check. As a specific example, I have
email accounts at Sun, on my home systems, Gmail, and a bunch of other places.
All, I repeat _all_, of these services allow me to easily configure
autoforwarding to multiple destinations.

Uh - you sure? I'm not aware of how to make gmail go to multiple systems, which is of course the important one because the other systems have relationships with you and other ways to secure things by causing bad consequences for you if you abuse the system. But again, let's get to how to fix it, on which I think you and I are on roughly the same page.

I will also point out that while
several of these systems provide Sieve support, none of them require the use of
sieve for me to set up this sort of autoforwarding. (OK, to be fair I have no
idea what's underneath the hood at Gmail, but if the other aspects of their
filtering interface are any indication it is not Sieve-based.)
To put this more concisely: It is clearly impossible to divine the intent of an
autoforwarder that is operating as part of a much larger web of autoforwarders
just by looking at it in isolation, and it is equally impossible to assess the
properties of other autoforwarders located in other administrative domains. It
follows that analyzing autoforwarders to prevent construction of mail bombs is
inherently unimplementable. (Believe it or not, some facilities in X.400
actually depended on this being possible. You can probably guess how well that
worked out...)


Yes, agree with all your analysis here.

The draft clearly does not
contain enough information to tell someone how to implement this and
personally I seriously doubt that it is possible due to the halting
problem.
I believe my preceding argument demonstrates that the question of Sieve's
Turing completeness is completely irrelevant to the matter at hand.
Nevertheless, since several people apparently continue to think this is an
important point I will elaborate on it a little further.
A Sieve in isolation is pretty clearly not Turing complete - no real loops - so
at that level at least the halting problem doesn't enter into it.
Now, Eric Rescorla once argued that Sieve should be considered Turing complete
because you could bounce a message back and forth between multiple systems,
using the message content to store the position on the tape.
Even if you ignore the critical point that Sieve not being Turing complete was
always about whether or not a sieve script in isolation could cause an infinite
loop and wedge up your server and never about whether or not Sieve could
function as a component in a larger, Turing complete system, the problem with
this argument is that Received: field counting nails you pretty quickly for
regular messages, and once again you're left without the loop construct you
need to make it "work".
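
To illustrate the Received: field counting mentioned above, here is a minimal Python sketch. It is not taken from any implementation discussed in this thread. The function names, the simplified Received: field contents, and the threshold of 100 (my reading of the guidance in RFC 2821 section 6.2) are assumptions made for illustration only.

```python
from email import message_from_string
from email.message import Message

# Assumed threshold: RFC 2821 section 6.2 suggests a large value
# (normally at least 100 Received entries). Treat this as illustrative.
MAX_RECEIVED_HEADERS = 100


def looks_like_loop(msg: Message, limit: int = MAX_RECEIVED_HEADERS) -> bool:
    """Return True when the message carries at least `limit` Received: fields."""
    received = msg.get_all("Received") or []
    return len(received) >= limit


def forward_once(raw: str, hop_name: str) -> str:
    """Simulate one forwarding hop by prepending a (simplified) Received: field."""
    return f"Received: by {hop_name}\n{raw}"


def bounce_between(raw: str, hops: list[str], limit: int) -> int:
    """Pass a message around `hops` until loop detection fires.

    Returns the number of hops taken before the loop was detected.
    """
    count = 0
    while not looks_like_loop(message_from_string(raw), limit):
        raw = forward_once(raw, hops[count % len(hops)])
        count += 1
    return count
```

With a deliberately small limit, a message bounced between two hypothetical hosts stops after exactly that many hops, which is the sense in which counting removes the loop construct.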
This then leaves you with no alternative but to try and use other, weirder
sorts of loops. The obvious one to try, and the one Eric proposed, is what I
call a "bounce loop": Redirect to a known-invalid address, get the bounce back,
redirect again, bounce again, etc. But bounce loops are only possible if some
agent in the path is willing to change the empty envelope from that is required
in the bounce message to refer to some other address in the forwarding path.
Without that happening a bounce cannot itself generate a bounce, so the loop
stops after at most two iterations.
This last point actually raises a general issue for autoforwarders - they MUST
NOT override an empty envelope from because if they do they can create bounce
loops. (I note in passing that in practice this almost always happens
accidentally, not intentionally. Nevertheless, it is a very real problem that
email systems have to deal with.) RFC 2821 should have made this a requirement
but didn't. I plan to raise this issue in the context of 2821bis but I would
not object to changing the closing text in section 4.2 of the Sieve base
specification to say:

  The envelope sender address on the outgoing message is chosen by the
  sieve implementation. It MAY be copied from the message being
  processed. However, if the message being processed has an empty
  envelope sender address the outgoing message MUST also have an
  empty envelope sender address.
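
The proposed rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `owner_address` parameter, and the choice to represent the empty envelope sender as an empty string are hypothetical and not part of the draft text.

```python
from typing import Optional

# Assumed representation of the null reverse-path ("<>") used on bounces.
EMPTY_SENDER = ""


def outgoing_envelope_sender(incoming_sender: str,
                             owner_address: Optional[str] = None) -> str:
    """Choose the envelope sender for a redirected message.

    An empty incoming envelope sender is always preserved, so that a
    bounce can never itself generate another bounce. Otherwise the
    choice is left to the implementation: here it uses the hypothetical
    script owner's address when given, else copies the incoming sender.
    """
    if incoming_sender.strip() in ("", "<>"):
        return EMPTY_SENDER
    if owner_address is not None:
        return owner_address
    return incoming_sender
```

The only hard requirement in this sketch is the first branch; the rest reflects the "chosen by the sieve implementation" latitude in the proposed text.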

More generally, one of the primary goals of the design of the email
infrastructure is that _all_ autoforwarding loops, once underway, can be
detected and stopped. It is _not_ a design goal to prevent such loops from
being set up - any system that tried to enforce that would be much too
restrictive.
It follows that the ability to construct an undetectable loop means there's a
serious design or implementation flaw that needs to be corrected. But the
fact that such flaws might enable Sieve to be Turing complete is entirely
trivial compared to the other serious issues undetectable loops raise.
Getting back to the script complexity issue, since Sieve allows arbitrary
boolean expressions script analysis is pretty clearly equivalent to the
satisfiability problem, i.e. it's NP-complete, and that puts full analysis of
arbitrary sieves out of reach of anything short of a quantum computer, which
AFAIK means it effectively cannot be done. So that's another reason why you're
correct in saying this kind of analysis is impossible to perform on Sieve
scripts. But since AFAICT nobody is proposing performing such an analysis,
this doesn't change anything.
Of course I suspect it is not possible because I am also
very incredulous of the WG's claim that sieve coupled with common
email systems is not turing complete.
To be blunt, your incredulity fails to impress, let alone persuade. If you
believe such a thing is possible then prove it by giving the details of how
such a system could be built that doesn't depend on an architectural flaw of
some sort in email's design or implementation.
I'm actually quite eager to be proved wrong on this since it will necessarily
expand my knowledge of what email systems are capable of. But I require proof,
not the handwaving I've seen so far.

Once again, I have only said I suspect it is Turing complete, not that I have proof of such. Anyway, we both agree this is irrelevant to resolving the problem.

Luckily I think that an
acceptable solution can be found without having a debate about basic
CS theory.


I'm delighted to hear it.

> > More inline ...
> > On Aug 17, 2007, at 2:34 PM, Alexey Melnikov wrote:
> >
> >> Cullen, does the following address your DISCUSS?
> >>
> >> =============================
> >> In section 4.2, last paragraph:
> >>
> >> OLD:
> >>  Implementations SHOULD take measures to implement loop control,
> >>  possibly including adding headers to the message or counting Received
> >>  headers. If an implementation detects a loop, it causes an error.
> >>
> >> NEW:
> >>  Implementations SHOULD take measures to implement loop control,
> >>  possibly including adding headers to the message or counting Received
> >>  headers as specified in section 6.2 of [SMTP].  If an
> >>  implementation detects a loop, it causes an error.
> >>          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>
> >> Add to the end of section 4.2 two new paragraphs:
> >>
> >>  Implementations SHOULD provide means of limiting the number of
> >>  redirects a Sieve script can perform.
> >
> > Well if it was the number of redirects SHOULD be limited to one it
> > would do it but as it is - not sure I see how it helps.
>
> Cullen, I told you before, there is very strong WG consensus to allow
> implementations to use more than one.
>
> Please see
> <http://www.imc.org/ietf-mta-filters/mail-archive/msg03601.html> and
> <http://www.imc.org/ietf-mta-filters/mail-archive/msg03599.html>.
>
> Now, if you have another suggestion how to address your issue, I would
> like to hear it.

I have given a simple and easy way to address it that seemed to also
work for the use cases that people told me about (I certainly admit I
don't know all the cases).

If memory serves, your suggested way to prevent it was to allow at most one
redirect operation to be done during the execution of a sieve script.

Well, what I have in my notes from the Prague meeting was "Told them I was OK with SHOULD not redirect to more than one location outside the domain". We may have different ideas about what domain means but roughly I would mean administrative domain - certainly I would call an enterprise or large bank one domain even though they might have separate DNS domain names for different locations or something. You might use the term "onsite" where I used domain.

And
you've been told that this is simply not acceptable - there are too many use
cases for forwarding email to two different places at once.

Actually, I have received very little comment on whether this is acceptable or not. I have mostly received something that loosely translated to "the WG disagrees with your discuss". I would be very happy to talk about what my concerns are and have folks figure out if there is a solution that addresses them and still meets the bulk of users' needs.

In fact this is an
incredibly common thing for people to do for all sorts of legitimate reasons,
including but not limited to:

I suspect what I had proposed meets all these use cases but don't claim to really know. Glad to be educated.

(1) Making messages accessible in multiple ways, e.g. send one copy to
    my email account and another to my pager. Or voice mail. Or printer.
    Or FAX.
(2) Secretaries who need copies of their boss's email.
(3) Maintaining multiple mailboxes during a service transition period. Or
    two pagers.
(4) Making backup copies of mail to address reliability issues (IMO this is
    not the best way to solve this problem but it is commonly done this way
    regardless of what I think).
(5) Making copies of stuff for law enforcement use (garden variety
    forwarding is a piss-poor way to implement this but it is nevertheless
    something people do fairly often).

(uh, yah, laughing about how well this will meet some US Law enforcement requirements :-)

(6) Vacation-time forwarding of important messages to the "next tier" of
    handlers.
We have two large banks as customers, both with thousands of employees. The
American one is very well known - anyone who isn't a complete hermit is
guaranteed to be familiar with it. They have a very elaborate email setup that
routes mail through various analysis, categorization, and compliance facilities.
The outcomes of these operations are then assessed with various sieve scripts,
which then use redirect and various other tools to alter message routing. The
ability to redirect to multiple addresses plays an important role in all this.
The other bank is European (and may be a household name there for all I know)
and their setup is even more relevant. They have basically used Sieve to
implement an elaborate workflow system. Some of their scripts use redirect
10-20 times on a single message. The most interesting aspect of their setup is
that they have intentionally created loops all over the place. They use various
checks, some explicitly done in Sieve, some autogenerated in Sieve, and others
external to Sieve, to break the loops at exactly the right points. This usage
is way past what, say, .forward files can do. The proper functioning of this
setup is so important to them that no less than the CEO of the bank has been
known to "drop in" on technical discussions of various implementation details.
(And this is not hearsay - I'm speaking from direct personal experience.)

I could go on and give many more examples of use cases but hopefully you've
gotten the idea by now. The ability to autoforward to multiple addresses is an
essential feature - it has been part of modern email effectively from the
beginning and without it Sieve simply doesn't have feature parity with other
message filtering facilities. In our case if we were to remove it our customers
would simply find a different vendor that doesn't impose such limits. As for
setting the default limit on redirect to 1, all that would accomplish for us is
to keep us all busy dealing with the resulting support calls.

I'm not surprised - this all seems very consistent with my discussions with these types of folks. I assume you also have the discussions with people like Yahoo, MSN, gmail, around the policies they put in place to stop their servers from being used in DOS attacks? I know I have been dragged into a few of these - can you provide some insight around policies there?

I understand that different things are used at different times - the goal of this document should be to produce something where the Security considerations are good enough not to cause serious harm.

However, let me be very clear - it is not
my job to find a solution that the WG likes to this.
It is, however, your job to impose reasonable requirements. And IMO you are not
being reasonable here.

The solution would be discussion around understanding a set of reasonable requirements and feasible designs.

I believe that
the IETF has clear consensus that it does not want to deploy
technology that could trivially be used for large scale DOS and
message amplification attacks with very little safeguards or
traceable ways of dealing with this.


This presupposes that there's a real risk here and that the necessary
safeguards are not in place. I don't believe either of these are true.


So far no one has sent me any information arguing these are not true.

The fact is Sieve and numerous other autoforwarding mechanisms are already
widely deployed without the limits you think are necessary and curiously all of
these serious flaws you see aren't being exploited.
I dislike trotting out our own deployment statistics but it seems now is the
time for it. According to various sources our product provides service for well
over 100 million mailboxes. A significant fraction of this total provides end
user access to set up sieves and our current default is to allow up to 32
redirects per sieve. As I pointed out in my original response, the number of
customers we've had that have reported problems with sieve redirect and have
needed to impose limits is exactly zero. And believe me when I say that our
customer base includes lots of people who aren't exactly shy about reporting
the least little thing that goes wrong.
Now my current judgement call is
that this is in that category and could cause harm to the internet.
But we're not dealing with a new protocol with zero operational experience here
where such judgement calls are our only means of assessing possible risk.
Rather, we're not only dealing with a protocol that has significant deployment
and operational experience, we're talking about a particular feature that's
been an integral part of email for decades and is accessible in all sorts of
ways besides Sieve.

Sure - and clearly if it is not a problem, it is stopped somehow. I'm asking the Security section to provide some advice about how this is all stopped.

You can convince me I am wrong about that - or you could find a way
of mitigating and reducing this risk - I can think of several and I
have suggested the one that I think is most likely to be acceptable
to the WG
Given that limiting redirects to 1 is totally unacceptable I have no idea
what you are referring to here.
but I make no presumption that I would know what is the
best solution for this WG on this problem. I have made it clear what
the issue is, why I think a change is needed, this should not be hard.
I don't think it should be hard either, but the talking at cross purposes
continues.

So let me try one last time to cut through the misunderstandings.
First, nobody is claiming that autoforwarders cannot be used as an amplifying
component in various sorts of attacks. They can be used this way. This was true
for email decades before Sieve came along and it will continue to be true no
matter what we do in this document.

yet somehow widespread message amplification attacks from email are mitigated to a currently acceptable level.

Second, nobody is claiming that script analysis can be used to prevent people
from abusing Sieve as part of such attacks. It simply cannot be done.

agreed - let's make sure the draft does not suggest people do it

So, since such attacks are inherently possible in the present email
infrastructure no matter what we do or don't do in Sieve, the focus has to be
on detecting and stopping attacks. This is done by:
(1) Making sure infinite loops can be detected and stopped. The main risk is
    that without comprehensive loop detection you can and will end up with
    exponentially growing message loops. I've seen such situations
    arise and create literally 10s of millions of messages in a very short
    period of time.
    Without a true loop the amplification potential of an autoforwarder web is
    bounded by the number of autoforwarders in the web multiplied by the number
    of redirects that are allowed per sieve. So the only way such a web can be
    used effectively is if you have the means to inject an endless stream of
    messages into it at some point. And that brings us to:
(2) Rate limit message submission. If true loops aren't possible something
    somewhere has to act to generate the "signal" that is "amplified". This
    sort of behavior can be detected and blocked.
(3) Administrative controls. The basic rule is don't allow users to access
    functionality they don't need. If there's never a need to use redirect,
    disable it entirely. If they only need to be able to do one redirect
    per sieve, only allow that. If they only need to be able to redirect to
    other users onsite, only allow that. If a class of messages exists that
    aren't supposed to be redirected, check for them and yell if a redirect
    is done. And so on. The list of possibilities here is very long.
(4) Auditing and tracking. The aforementioned European bank has a requirement
    that every message carry a complete history of the redirection that was
    performed. They also require that all of this be derivable from the logs.
    And many of our other customers have similar requirements.
    More generally, the days where email systems can get away without producing
    comprehensive logs and audit trails ended a long time ago. I have no idea
    what you're basing your assertions that such attacks would not be traceable
    in modern email systems on, but it runs absolutely contrary to all of my
    recent experience.

Certainly agree with you in the enterprise case, but in a public provider such as yahoo, sure they have logs but they may not have any way of tying that to an actual human associated with the account.

Now, I will admit that the document doesn't cover a lot of this. But there's a
reason for that: Autoforwarding to multiple addresses exists independent of
Sieve and as such is a general architectural issue for email, not one specific
to Sieve. It would be great if there was a document comparable to RFC 3834 for
autoforwarders but there isn't, and it is not within scope for this WG to
produce such a specification.

I really liked the above analysis - it's the most information I have received on this thread since I first read this document. Thank you.

So what can we do? Well, I guess one option would be to try and make
the discussion of this a little more explicit in the document. How about
adding something like this to the security considerations?


This is great - few bits inline below...

   Allowing a single script to redirect to multiple destinations can be
   used as a means of amplifying the number of messages in an attack.
   Moreover, if loop detection is not properly implemented it may be possible
   to set up exponentially growing message loops. Accordingly, Sieve
   implementations:

   (1) MUST implement facilities to detect and break message loops. See
       RFC 2821 section 6.2 for additional information on basic loop
       detection strategies.

My discuss did not ask for this but it certainly seems like a good idea.

   (2) MUST provide the means for administrators to limit the ability of
       users to abuse redirect. In particular, it MUST be possible to
       limit the number of redirects a script can perform. Additionally,
       if no use cases exist for using redirect to multiple destinations,
       this limit SHOULD be set to 1. Additional limits, such
       as the ability to restrict redirect to local users MAY also be
       implemented.

We are very close here. If you changed the "Additionally, if no use cases exist for using redirect to multiple destinations, this limit SHOULD be set to 1." to "Scripts SHOULD be limited to at most one redirect that is not 'onsite'". And define onsite somewhere. I'd point out that this is a SHOULD not a MUST and that it could be ignored if people understood the security implications of what they were about to do. This could also possibly be coupled with the idea after the next point to explicitly allow multiple redirects in some cases.

   (3) MUST provide facilities to log use of redirect in order to facilitate
       tracking down abuse.

I would ask if this means that they MUST know what human the script was associated with. If yes, it is way beyond what I am asking for and if no then it is hard to see how it helps. When I mentioned months ago I could imagine many ways to solve this, coupling this type of thing to the ability to do more than one redirect was one of the things that seemed like a possible solution (and corresponds to my understanding of at least some current deployments).
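
As a rough illustration of how the administrative limits and logging in items (2) and (3), together with the "at most one redirect that is not onsite" variant, might fit together, here is a minimal Python sketch. Everything in it is an assumption for illustration: the class and field names are hypothetical, "onsite" is modelled simply as a set of local domain names, and the default of 32 only echoes the product default mentioned earlier in this thread.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("sieve.redirect")


@dataclass
class RedirectPolicy:
    """Hypothetical per-site limits an administrator could configure."""
    max_redirects: int = 32            # total redirects allowed per script run
    max_offsite_redirects: int = 1     # redirects leaving the "onsite" domains
    onsite_domains: frozenset = frozenset()

    def is_onsite(self, address: str) -> bool:
        """Treat an address as onsite when its domain is in the local set."""
        domain = address.rpartition("@")[2].lower()
        return domain in self.onsite_domains

    def evaluate(self, owner: str, targets: list[str]) -> bool:
        """Check one script run's redirect targets and log the decision."""
        offsite = [t for t in targets if not self.is_onsite(t)]
        allowed = (len(targets) <= self.max_redirects
                   and len(offsite) <= self.max_offsite_redirects)
        # Log every use of redirect so abuse can be tracked down later.
        log.info("redirect owner=%s targets=%s offsite=%d allowed=%s",
                 owner, ",".join(targets), len(offsite), allowed)
        return allowed
```

Note that the log line records the account that owns the script, not a human identity, which is exactly the gap raised in the question above.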

I didn't include rate limiting on the list for several reasons: (1) It's hard
to get right and naive attempts to implement it can be very dangerous. (2)
There's no consensus on what best practices for it are and hence any discussion
of it is likely to rathole.

agree

I would also suggest getting rid of the discussion about sanity-checking since
this discussion seems to indicate it is open to serious misinterpretation.

agree

Does any of this work for you? If it doesn't I'm frankly out of ideas for how
to resolve this.

This seems extremely close to fully resolving it. I think your suggestions are great and I am glad to have a dialog that suggests something to solve the issue.

Ned

Re: Suggested changes to address Cullen's DISCUSS on draft-ietf-sieve-3028bis-12.txt