ietf-mta-filters

Re: Suggested changes to address Cullen's DISCUSS on draft-ietf-sieve-3028bis-12.txt

2007-08-29 14:22:40


Ned, this is a very helpful email for moving things forward - thanks. If you get to the bottom of this email, you will see that I would be fine with a very minor variant of what you are suggesting. I think it meets the use cases you brought up, and I am glad to discuss changes if it does not cover others that were not brought up.

On Aug 25, 2007, at 3:59 PM, Ned Freed wrote:

inline ...

On Aug 25, 2007, at 9:37 AM, Alexey Melnikov wrote:

> Cullen Jennings wrote:
>
> > I don't think that is really implementable advice - it basically
> > wishes the problem away.

What advice are you talking about here? I see nothing in the revised text that would argue for the sort of analysis of sieve scripts you're (quite correctly)
saying is impossible.

Uh ... the text that says:
   It is equally important that implementations sanity-check the user's
   scripts, and not allow users to create on-demand mailbombs.  For
   instance, an implementation that allows a user to redirect a message
   multiple times might also allow a user to create a mailbomb triggered
   by mail from a specific user.

but ignore this - let's move on to how to fix it instead of why I thought the draft suggested something impossible.


> It is implementable, it applies to UIs and admin tools and many
> implementations already follow it.

So the topic I was questioning the implementability of is a program
that can look at a series of sieve scripts across one or more email
servers and decide if the sieve scripts are capable of creating large
scale message amplification attacks.

Sounds to me like you and Alexey are talking at cross purposes here, but in regards to the question of whether what you're talking about is implementable:
It isn't, although not for the reasons you give.

The main reason it is unimplementable is very simple: Email is an Internet-wide service and no program can possibly have the administrative access to the entire Internet it would need to perform such a check. As a specific example, I have email accounts at Sun, on my home systems, Gmail, and a bunch of other places.
All, I repeat _all_, of these services allow me to easily configure
autoforwarding to multiple destinations.

Uh - are you sure? I'm not aware of how to make Gmail forward to multiple systems, and Gmail is of course the important one, because the other systems have relationships with you and other ways to secure things by causing bad consequences for you if you abuse the system. But again, let's get to how to fix it, where I think you and I are roughly on the same page.

I will also point out that while
several of these systems provide Sieve support, none of them require the use of sieve for me to set up this sort of autoforwarding. (OK, to be fair I have no idea what's underneath the hood at Gmail, but if the other aspects of their
filtering interface are any indication it is not Sieve-based.)

To put this more concisely: It is clearly impossible to divine the intent of an autoforwarder that is operating as part of a much larger web of autoforwarders just by looking at it in isolation, and it is equally impossible to assess the properties of other autoforwarders located in other administrative domains. It follows that analyzing autoforwarders to prevent construction of mail bombs is inherently unimplementable. (Believe it or not, some facilities in X.400 actually depended on this being possible. You can probably guess how well that
worked out...)

Yes, I agree with all your analysis here.


The draft clearly does not
contain enough information to tell someone how to implement this and
personally I seriously doubt that it is possible due to the halting
problem.

I believe my preceding argument demonstrates that the question of Sieve's
Turing completeness is completely irrelevant to the matter at hand.
Nevertheless, since several people apparently continue to think this is an
important point I will elaborate on it a little further.

A Sieve in isolation is pretty clearly not Turing complete - no real loops - so
at that level at least the halting problem doesn't enter into it.

Now, Eric Rescorla once argued that Sieve should be considered Turing complete because you could bounce a message back and forth between multiple systems,
using the message content to store the position on the tape.

Even if you ignore the critical point that Sieve not being Turing complete was always about whether or not a sieve script in isolation could cause an infinite loop and wedge up your server and never about whether or not Sieve could function as a component in a larger, Turing complete system, the problem with this argument is that Received: field counting nails you pretty quickly for regular messages, and once again you're left without the loop construct you
need to make it "work".
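
For illustration, the Received: field counting mentioned above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name `exceeds_hop_limit` and the `HOP_LIMIT` value are invented for the example and are not taken from the Sieve draft or from any particular MTA.

```python
from email import message_from_string

# Assumed threshold for the example; real systems choose their own value.
HOP_LIMIT = 100


def exceeds_hop_limit(raw_message: str, hop_limit: int = HOP_LIMIT) -> bool:
    """Return True if the message carries more Received: fields than allowed.

    Each relay normally prepends one Received: field, so a message that keeps
    circulating through the same forwarders accumulates them until the count
    crosses the limit and the loop can be broken.
    """
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])  # header lookup is case-insensitive
    return len(received) > hop_limit
```

A caller would typically refuse or bounce the message when this returns True, rather than forwarding it again.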

This then leaves you with no alternative but to try and use other, weirder sorts of loops. The obvious one to try, and the one Eric proposed, is what I call a "bounce loop": Redirect to a known-invalid address, get the bounce back, redirect again, bounce again, etc. But bounce loops are only possible if some agent in the path is willing to change the empty envelope from that is required in the bounce message to refer to some other address in the forwarding path. Without that happening a bounce cannot itself generate a bounce, so the loop
stops after at most two iterations.

This last point actually raises a general issue for autoforwarders - they MUST NOT override an empty envelope from because if they do they can create bounce loops. (I note in passing that in practice this almost always happens accidentally, not intentionally. Nevertheless, it is a very real problem that email systems have to deal with.) RFC 2821 should have made this a requirement but didn't. I plan to raise this issue in the context of 2821bis but I would not object to changing the closing text in section 4.2 of the Sieve base
specification to say:

  The envelope sender address on the outgoing message is chosen by the
  sieve implementation. It MAY be copied from the message being
  processed. However, if the message being processed has an empty
  envelope sender address the outgoing message MUST also have an
  empty envelope sender address.
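
The rule in the proposed text can be sketched as a small Python function. The names `choose_envelope_sender` and `script_owner`, and the `copy_sender` switch, are hypothetical and exist only for this example; the one behavior taken from the text above is that an empty envelope sender must stay empty.

```python
def choose_envelope_sender(incoming_sender: str,
                           script_owner: str,
                           copy_sender: bool = True) -> str:
    """Pick the envelope sender for a redirected message.

    An empty envelope sender marks a bounce (or similar notification).
    Keeping it empty on the outgoing copy means a failure of the redirected
    message cannot generate a further bounce, which is what prevents bounce
    loops.
    """
    if incoming_sender == "":
        return ""  # required: never replace an empty envelope sender
    # Otherwise the choice is left to the implementation (assumed options).
    return incoming_sender if copy_sender else script_owner
```

For example, `choose_envelope_sender("", "owner@example.com", copy_sender=False)` still returns the empty string, even though the implementation would otherwise substitute the script owner's address.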

More generally, one of the primary goals of the design of the email
infrastructure is that _all_ autoforwarding loops, once underway, can be detected and stopped. It is _not_ a design goal to prevent such loops from
being set up - any system that tried to enforce that would be much too
restrictive.

It follows that the ability to construct an undetectable loop means there's a serious design or implementation flaw that needs to be corrected. But the fact that such flaws might enable Sieve to be Turing complete is entirely
trivial compared to the other serious issues undetectable loops raise.

Getting back to the script complexity issue, since Sieve allows arbitrary boolean expressions script analysis is pretty clearly equivalent to the satisfiability problem, i.e. it's NP-complete, and that puts full analysis of arbitrary sieves out of reach of anything short of a quantum computer, which AFAIK means it effectively cannot be done. So that's another reason why you're correct in saying this kind of analysis is impossible to perform on Sieve scripts. But since AFAICT nobody is proposing performing such an analysis,
this doesn't change anything.

Of course I suspect it is not possible because I am also
very incredulous of the WG's claim that sieve coupled with common
email systems is not Turing complete.

To be blunt, your incredulity fails to impress, let alone persuade. If you believe such a thing is possible then prove it by giving the details of how such a system could be built that doesn't depend on an architectural flaw of
some sort in email's design or implementation.

I'm actually quite eager to be proved wrong on this since it will necessarily expand my knowledge of what email systems are capable of. But I require proof,
not the handwaving I've seen so far.

Once again, I have only said I suspect it is Turing complete, not that I have proof of such. Anyway, we both agree this is irrelevant to resolving the problem.


Luckily I think that an
acceptable solution can be found without having a debate about basic
CS theory.

I'm delighted to hear it.

> > More inline ...
> > On Aug 17, 2007, at 2:34 PM, Alexey Melnikov wrote:
> >
> >> Cullen, does the following address your DISCUSS?
> >>
> >> =============================
> >> In section 4.2, last paragraph:
> >>
> >> OLD:
> >>  Implementations SHOULD take measures to implement loop control,
> >>  possibly including adding headers to the message or counting
> >>  Received headers. If an implementation detects a loop, it causes
> >>  an error.
> >> NEW:
> >>  Implementations SHOULD take measures to implement loop control,
> >>  possibly including adding headers to the message or counting
> >>  Received headers as specified in section 6.2 of [SMTP].  If an
> >>  implementation detects a loop, it causes an error.
> >>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>
> >> Add to the end of section 4.2 two new paragraphs:
> >>
> >>  Implementations SHOULD provide means of limiting the number of
> >>  redirects a Sieve script can perform.
> >
> > Well if it was the number of redirects SHOULD be limited to one it
> > would do it but as it is - not sure I see how it helps.
>
> Cullen, I told you before, there is very strong WG consensus to allow
> implementations to use more than one.
>
> Please see
> <http://www.imc.org/ietf-mta-filters/mail-archive/msg03601.html> and
> <http://www.imc.org/ietf-mta-filters/mail-archive/msg03599.html>.
>
> Now, if you have another suggestion how to address your issue, I would
> like to hear it.

I have given a simple and easy way to address it that seemed to also
work for the use cases that people told me about (I certainly admit I
don't know all the cases).

If memory serves, your suggested way to prevent it was to allow at most one
redirect operation to be done during the execution of a sieve script.

Well, what I have in my notes from the Prague meeting was "Told them I was OK with SHOULD not redirect to more than one location outside the domain". We may have different ideas about what domain means, but roughly I would mean administrative domain - certainly I would call an enterprise or large bank one domain even though they might have separate DNS domain names for different locations or something. You might use the term "onsite" where I used domain.


And
you've been told that this is simply not acceptable - there are too many use
cases for forwarding email to two different places at once.

Actually, I have received very little comment on whether this is acceptable or not. I have mostly received something that loosely translates to "the WG disagrees with your discuss". I would be very happy to talk about what my concerns are and have folks figure out if there is a solution that addresses them and still meets the bulk of users' needs.

In fact this is an
incredibly common thing for people to do for all sorts of legitimate reasons,
including but not limited to:

I suspect what I had proposed meets all these use cases but don't claim to really know. Glad to be educated.


(1) Making messages accessible in multiple ways, e.g. send one copy to
   my email account and another to my pager. Or voice mail. Or printer.
   Or FAX.
(2) Secretaries who need copies of their boss's email.
(3) Maintaining multiple mailboxes during a service transition period. Or
   two pagers.
(4) Making backup copies of mail to address reliability issues (IMO this is not the best way to solve this problem but it is commonly done this way
   regardless of what I think).
(5) Making copies of stuff for law enforcement use (garden variety
   forwarding is a piss-poor way to implement this but it is nevertheless
   something people do fairly often).

(uh, yah, laughing about how well this will meet some US Law enforcement requirements :-)

(6) Vacation-time forwarding of important messages to the "next tier" of
   handlers.

We have two large banks as customers, both with thousands of employees. The American one is very well known - anyone who isn't a complete hermit is guaranteed to be familiar with it. They have a very elaborate email setup that routes mail through various analysis, categorization, and compliance facilities. The outcomes of these operations are then assessed with various sieve scripts, which then use redirect and various other tools to alter message routing. The ability to redirect to multiple addresses plays an important role in all this.

The other bank is European (and may be a household name there for all I know) and their setup is even more relevant. They have basically used Sieve to implement an elaborate workflow system. Some of their scripts use redirect 10-20 times on a single message. The most interesting aspect of their setup is that they have intentionally created loops all over the place. They use various checks, some explicitly done in Sieve, some autogenerated in Sieve, and others external to Sieve, to break the loops at exactly the right points. This usage is way past what, say, .forward files can do. The proper functioning of this setup is so important to them that no less than the CEO of the bank has been known to "drop in" on technical discussions of various implementation details. (And this is not hearsay - I'm speaking from direct personal experience.)


I could go on and give many more examples of use cases but hopefully you've gotten the idea by now. The ability to autoforward to multiple addresses is an essential feature - it has been part of modern email effectively from the beginning and without it Sieve simply doesn't have feature parity with other message filtering facilities. In our case if we were to remove it our customers would simply find a different vendor that doesn't impose such limits. As for setting the default limit on redirect to 1, all that would accomplish for us is to keep
us all busy dealing with the resulting support calls.

I'm not surprised - this all seems very consistent with my discussions with these types of folks. I assume you have also had discussions with people like Yahoo, MSN, and Gmail about the policies they put in place to stop their servers from being used in DoS attacks? I know I have been dragged into a few of these - can you provide some insight into the policies there?

I understand that different things are used at different times - the goal of this document should be to produce something where the Security Considerations are good enough not to cause serious harm.


However, let me be very clear - it is not
my job to find a solution that the WG likes to this.

It is, however, your job to impose reasonable requirements. And IMO you are not
being reasonable here.

The solution would be discussion around understanding a set of reasonable requirements and feasible designs.


I believe that
the IETF has clear consensus that it does not want to deploy
technology that could trivially be used for large scale DOS and
message amplification attacks with very little safeguards or
traceable ways of dealing with this.

This presupposes that there's a real risk here and that the necessary
safeguards are not in place. I don't believe either of these are true.

So far no one has sent me any information arguing these are not true.


The fact is Sieve and numerous other autoforwarding mechanisms are already widely deployed without the limits you think are necessary and curiously all of
these serious flaws you see aren't being exploited.

I dislike trotting out our own deployment statistics but it seems now is the time for it. According to various sources our product provides service for well over 100 million mailboxes. A significant fraction of this total provides end user access to set up sieves and our current default is to allow up to 32 redirects per sieve. As I pointed out in my original response, the number of customers we've had that have reported problems with sieve redirect and have needed to impose limits is exactly zero. And believe me when I say that our customer base includes lots of people who aren't exactly shy about reporting
the least little thing that goes wrong.

Now my current judgement call is
that this is in that category and could cause harm to the internet.

But we're not dealing with a new protocol with zero operational experience here where such judgement calls are our only means of assessing possible risk. Rather, we're not only dealing with a protocol that has significant deployment and operational experience, we're talking about a particular feature that's been an integral part of email for decades and is accessible in all sorts of
ways besides Sieve.

Sure - and clearly if it is not a problem, it is being stopped somehow. I'm asking the Security section to provide some advice about how this is all stopped.


You can convince me I am wrong about that - or you could find a way
of mitigating and reducing this risk - I can think of several and I
have suggested the one that I think is most likely to be acceptable
to the WG

Given that limiting redirects to 1 is totally unacceptable I have no idea
what you are referring to here.

but I make no presumption that I would know what is the
best solution for this WG on this problem. I have made it clear what
the issue is and why I think a change is needed; this should not be hard.

I don't think it should be hard either, but the talking at cross purposes
continues.

So let me try one last time to cut through the misunderstandings.

First, nobody is claiming that autoforwarders cannot be used as an amplifying component in various sorts of attacks. They can be used this way. This was true for email decades before Sieve came along and it will continue to be true no
matter what we do in this document.

Yet somehow widespread message amplification attacks from email are mitigated to a currently acceptable level.


Second, nobody is claiming that script analysis can be used to prevent people
from abusing Sieve as part of such attacks. It simply cannot be done.

Agreed - let's make sure the draft does not suggest people do it.



So, since such attacks are inherently possible in the present email
infrastructure no matter what we do or don't do in Sieve, the focus has to be
on detecting and stopping attacks. This is done by:

(1) Making sure infinite loops can be detected and stopped. The main risk is that without comprehensive loop detection you can and will end up with
   exponentially growing message loops. I've seen such situations
   arise and create literally 10s of millions of messages in a very short
   period of time.

Without a true loop the amplification potential of an autoforwarder web is bounded by the number of autoforwarders in the web multiplied by the number of redirects that are allowed per sieve. So the only way such a web can be used effectively is if you have the means to inject an endless stream of messages
   into it at some point. And that brings us to:

(2) Rate limit message submission. If true loops aren't possible something somewhere has to act to generate the "signal" that is "amplified". This
   sort of behavior can be detected and blocked.

(3) Administrative controls. The basic rule is don't allow users to access functionality they don't need. If there's never a need to use redirect, disable it entirely. If they only need to be able to do one redirect per sieve, only allow that. If they only need to be able to redirect to other users onsite, only allow that. If a class of messages exists that aren't supposed to be redirected, check for them and yell if a redirect
   is done. And so on. The list of possibilities here is very long.

(4) Auditing and tracking. The aforementioned European bank has a requirement that every message carry a complete history of the redirection that was performed. They also require that all of this be derivable from the logs.
   And many of our other customers have similar requirements.

More generally, the days where email systems can get away without producing comprehensive logs and audit trails ended a long time ago. I have no idea what you're basing your assertion on that such attacks would not be traceable in modern email systems, but it runs absolutely contrary to all of my
   recent experience.

I certainly agree with you in the enterprise case, but at a public provider such as Yahoo, sure they have logs, but they may not have any way of tying them to an actual human associated with the account.


Now, I will admit that the document doesn't cover a lot of this. But there's a reason for that: Autoforwarding to multiple addresses exists independent of Sieve and as such is a general architectural issue for email, not one specific to Sieve. It would be great if there was a document comparable to RFC 3834 for autoforwarders but there isn't, and it is not within scope for this WG to
produce such a specification.


I really liked the above analysis - it's the most information I have received on this thread since I first read this document. Thank you.

So what can we do? Well, I guess one option would be to try and make
the discussion of this a little more explicit in the document. How about
adding something like this to the security considerations?

This is great - few bits inline below...


   Allowing a single script to redirect to multiple destinations can be
   used as a means of amplifying the number of messages in an attack.
   Moreover, if loop detection is not properly implemented it may be possible
   to set up exponentially growing message loops. Accordingly, Sieve
   implementations:

(1) MUST implement facilities to detect and break message loops. See
       RFC 2821 section 6.2 for additional information on basic loop
       detection strategies.

My discuss did not ask for this but it certainly seems like a good idea.


(2) MUST provide the means for administrators to limit the ability of
       users to abuse redirect. In particular, it MUST be possible to
       limit the number of redirects a script can perform. Additionally,
       if no use cases exist for using redirect to multiple destinations,
       this limit SHOULD be set to 1. Additional limits, such
       as the ability to restrict redirect to local users, MAY also be
       implemented.

We are very close here: change "Additionally, if no use cases exist for using redirect to multiple destinations, this limit SHOULD be set to 1." to "Scripts SHOULD be limited to at most one redirect that is not 'onsite'", and define onsite somewhere. I'd point out that this is a SHOULD, not a MUST, and that it could be ignored if people understood the security implications of what they were about to do. This could also possibly be coupled with the idea after the next point to explicitly allow multiple redirects in some cases.


(3) MUST provide facilities to log use of redirect in order to facilitate
       tracking down abuse.

I would ask whether this means that they MUST know what human the script was associated with. If yes, it is way beyond what I am asking for, and if no, then it is hard to see how it helps. When I mentioned months ago that I could imagine many ways to solve this, coupling this type of thing to the ability to do more than one redirect was one of the things that seemed like a possible solution (and corresponds to my understanding of at least some current deployments).
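
As an illustration only, the administrative limits discussed in item (2) (a cap on redirects per script, plus the variant allowing at most one redirect that is not "onsite") could be checked roughly as follows. This is a minimal Python sketch, not taken from any Sieve implementation: `RedirectPolicy`, `is_onsite`, and `check_redirects` are hypothetical names, and treating "onsite" as a fixed set of mail domains is an assumption made for the example. The thread leaves "onsite" undefined, and an administrative domain may well span several DNS names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedirectPolicy:
    """Hypothetical administrator-set limits; all names are illustrative."""
    max_redirects: int = 32                  # arbitrary default for the sketch
    max_offsite: int = 1                     # redirects allowed to non-onsite addresses
    onsite_domains: frozenset = frozenset()  # assumed definition of "onsite"


def is_onsite(address: str, policy: RedirectPolicy) -> bool:
    """Treat an address as onsite if its domain is in the configured set."""
    domain = address.rpartition("@")[2].lower()
    return domain in policy.onsite_domains


def check_redirects(targets, policy):
    """Return a list of policy violations for one script run (empty if none)."""
    violations = []
    if len(targets) > policy.max_redirects:
        violations.append(
            f"{len(targets)} redirects exceeds limit of {policy.max_redirects}")
    offsite = [t for t in targets if not is_onsite(t, policy)]
    if len(offsite) > policy.max_offsite:
        violations.append(
            f"{len(offsite)} offsite redirects exceeds limit of {policy.max_offsite}")
    return violations
```

Returning the violations instead of raising leaves it to the caller to decide whether to reject the script, refuse the redirect at run time, or only log the event, which is the kind of choice a SHOULD-level requirement leaves open.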


I didn't include rate limiting on the list for several reasons: (1) It's hard to get right and naive attempts to implement it can be very dangerous. (2) There's no consensus on what best practices for it are and hence any discussion
of it is likely to rathole.

Agree.


I would also suggest getting rid of the discussion about sanity-checking since this discussion seems to indicate it is open to serious misinterpretation.

Agree.


Does any of this work for you? If it doesn't I'm frankly out of ideas for how
to resolve this.

This seems extremely close to fully resolving it. I think your suggestions are great, and I am glad to have a dialog that suggests something to solve the issue.


                                Ned