Re: Suggested changes to address Cullen's DISCUSS on draft-ietf-sieve-3028bis-12.txt
2007-08-29 14:22:40
Ned, this is a very helpful email on moving things forward - thanks.
If you get to the bottom of this email, I would be fine with a very
minor variant of what you are suggesting and I think it meets the use
cases you brought up, and I'm glad to discuss changes if it does not resolve
others that were not brought up.
On Aug 25, 2007, at 3:59 PM, Ned Freed wrote:
inline ...
On Aug 25, 2007, at 9:37 AM, Alexey Melnikov wrote:
> Cullen Jennings wrote:
>
> > I don't think that is really implementable advice - it basically
> > wishes the problem away.
What advice are you talking about here? I see nothing in the
revised text that
would argue for the sort of analysis of sieve scripts you're (quite
correctly)
saying is impossible.
Uh .. the text that says:
It is equally important that implementations sanity-check the user's
scripts, and not allow users to create on-demand mailbombs. For
instance, an implementation that allows a user to redirect a message
multiple times might also allow a user to create a mailbomb
triggered
by mail from a specific user.
but ignore this - let's move on to how to fix it instead of why I
thought the draft suggested something impossible.
> It is implementable, it applies to UIs and admin tools and many
> implementations already follow it.
So the topic I was questioning the implementability of is a program
that can look at a series of sieve scripts across one or more email
servers and decide if the sieve scripts are capable of creating large
scale message amplification attacks.
Sounds to me like you and Alexey are talking at cross purposes
here, but in
regards to the question of whether what you're talking about is
implementable:
It isn't, although not for the reasons you give.
The main reason it is unimplementable is very simple: Email is an
Internet-wide
service and no program can possibly have the administrative access
to the entire
Internet it would need to perform such a check. As a specific
example, I have
email accounts at Sun, on my home systems, Gmail, and a bunch of
other places.
All, I repeat _all_, of these services allow me to easily configure
autoforwarding to multiple destinations.
Uh - you sure? I'm not aware how to make gmail go to multiple systems,
which is of course the important one because the other systems have
relationships with you and other ways to secure things by causing bad
consequences for you if you abuse the system. But again, let's get to
how to fix it which I think you and I are on roughly the same page.
I will also point out that while
several of these systems provide Sieve support, none of them
require the use of
sieve for me to set up this sort of autoforwarding. (OK, to be fair
I have no
idea what's underneath the hood at Gmail, but if the other aspects
of their
filtering interface are any indication it is not Sieve-based.)
To put this more concisely: It is clearly impossible to divine the
intent of an
autoforwarder that is operating as part of a much larger web of
autoforwarders
just by looking at it in isolation, and it is equally impossible to
assess the
properties of other autoforwarders located in other administrative
domains. It
follows that analyzing autoforwarders to prevent construction of
mail bombs is
inherently unimplementable. (Believe it or not, some facilities in X.400
actually depended on this being possible. You can probably guess
how well that
worked out...)
Yes, agree with all your analysis here.
The draft clearly does not
contain enough information to tell someone how to implement this and
personally I seriously doubt that it is possible due to the halting
problem.
I believe my preceding argument demonstrates that the question of
Sieve's
Turing completeness is completely irrelevant to the matter at hand.
Nevertheless, since several people apparently continue to think
this is an
important point I will elaborate on it a little further.
A Sieve in isolation is pretty clearly not Turing complete - no
real loops - so
at that level at least the halting problem doesn't enter into it.
Now, Eric Rescorla once argued that Sieve should be considered
Turing complete
because you could bounce a message back and forth between multiple
systems,
using the message content to store the position on the tape.
Even if you ignore the critical point that Sieve not being Turing
complete was
always about whether or not a sieve script in isolation could cause
an infinite
loop and wedge up your server and never about whether or not Sieve
could
function as a component in a larger, Turing complete system, the
problem with
this argument is that Received: field counting nails you pretty
quickly for
regular messages, and once again you're left without the loop
construct you
need to make it "work".
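As a rough illustration of the Received: field counting mentioned above, here is a minimal sketch in Python. The function name and the threshold value are assumptions made for this example only; nothing in this thread specifies them, and a real MTA would apply the check at its own point in the delivery pipeline.

```python
from email import message_from_string

# Hypothetical threshold chosen for illustration; not taken from this thread.
MAX_RECEIVED_FIELDS = 100


def looks_like_loop(raw_message: str, max_fields: int = MAX_RECEIVED_FIELDS) -> bool:
    """Return True if the message carries at least max_fields Received: fields.

    Every relay that handles a message adds one Received: field, so a
    message that keeps circulating accumulates them until the count
    trips the threshold and the loop can be broken.
    """
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    return len(received) >= max_fields
```

The check needs no knowledge of any script or of other administrative domains, which is why it works where script analysis does not.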
This then leaves you with no alternative but to try and use other,
weirder
sorts of loops. The obvious one to try, and the one Eric proposed,
is what I
call a "bounce loop": Redirect to a known-invalid address, get the
bounce back,
redirect again, bounce again, etc. But bounce loops are only
possible if some
agent in the path is willing to change the empty envelope from that
is required
in the bounce message to refer to some other address in the
forwarding path.
Without that happening a bounce cannot itself generate a bounce, so
the loop
stops after at most two iterations.
This last point actually raises a general issue for autoforwarders
- they MUST
NOT override an empty envelope from because if they do they can
create bounce
loops. (I note in passing that in practice this almost always
happens
accidentally, not intentionally. Nevertheless, it is a very real
problem that
email systems have to deal with.) RFC 2821 should have made this a
requirement
but didn't. I plan to raise this issue in the context of 2821bis
but I would
not object to changing the closing text in section 4.2 of the Sieve
base
specification to say:
The envelope sender address on the outgoing message is chosen by the
sieve implementation. It MAY be copied from the message being
processed. However, if the message being processed has an empty
envelope sender address the outgoing message MUST also have an
empty envelope sender address.
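The rule in the proposed text can be sketched in a few lines of Python. The function and parameter names are hypothetical, and the empty string is used here as a stand-in for the empty envelope sender; how an implementation chooses a non-empty sender is left open by the text, so the sketch just takes a preferred value as input.

```python
def outgoing_envelope_sender(incoming_sender: str, preferred_sender: str) -> str:
    """Pick the envelope sender for a redirected message.

    An empty incoming sender (the mark of a bounce) is always preserved,
    so a redirected bounce can never itself generate another bounce.
    Otherwise the implementation's preferred sender is used; it may
    simply be a copy of incoming_sender.
    """
    if incoming_sender == "":
        return ""
    return preferred_sender
```

For example, `outgoing_envelope_sender("", "owner@example.org")` returns the empty string, while a message from `alice@example.com` goes out with whatever sender the implementation prefers.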
More generally, one of the primary goals of the design of the email
infrastructure is that _all_ autoforwarding loops, once underway,
can be
detected and stopped. It is _not_ a design goal to prevent such
loops from
being set up - any system that tried to enforce that would be much too
restrictive.
It follows that the ability to construct an undetectable loop means
there's a
serious design or implementation flaw that needs to be
corrected. But the
fact that such flaws might enable Sieve to be Turing complete is
entirely
trivial compared to the other serious issues undetectable loops raise.
Getting back to the script complexity issue, since Sieve allows
arbitrary
boolean expressions script analysis is pretty clearly equivalent to
the
satisfiability problem, i.e. it's NP-complete, and that puts full
analysis of
arbitrary sieves out of reach of anything short of a quantum
computer, which
AFAIK means it effectively cannot be done. So that's another reason
why you're
correct in saying this kind of analysis is impossible to perform on
Sieve
scripts. But since AFAICT nobody is proposing performing such an
analysis,
this doesn't change anything.
Of course I suspect it is not possible because I am also
very incredulous of the WG's claim that sieve coupled with common
email systems is not Turing complete.
To be blunt, your incredulity fails to impress, let alone persuade.
If you
believe such a thing is possible then prove it by giving the
details of how
such a system could be built that doesn't depend on an
architectural flaw of
some sort in email's design or implementation.
I'm actually quite eager to be proved wrong on this since it will
necessarily
expand my knowledge of what email systems are capable of. But I
require proof,
not the handwaving I've seen so far.
Once again, I have only said I suspect it is Turing complete, not
that I have proof of such. Anyway, we both agree this is irrelevant
to resolving the problem.
Luckily I think that an
acceptable solution can be found without having a debate about basic
CS theory.
I'm delighted to hear it.
> > More inline ...
> > On Aug 17, 2007, at 2:34 PM, Alexey Melnikov wrote:
> >
> >> Cullen, does the following address your DISCUSS?
> >>
> >> =============================
> >> In section 4.2, last paragraph:
> >>
> >> OLD:
> >> Implementations SHOULD take measures to implement loop control,
> >> possibly including adding headers to the message or counting
> Received
> >> headers. If an implementation detects a loop, it causes an
error.
> >> NEW:
> >> Implementations SHOULD take measures to implement loop control,
> >> possibly including adding headers to the message or counting
> Received
> >> headers as specified in section 6.2 of [SMTP]. If an
> >> implementation detects a loop, it causes an error.
> >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>
> >> Add to the end of section 4.2 two new paragraphs:
> >>
> >> Implementations SHOULD provide means of limiting the number of
> >> redirects a
> >> Sieve script can perform.
> >
> > Well if it was the number of redirects SHOULD be limited to
one it
> > would do it but as it is - not sure I see how it helps.
>
> Cullen, I told you before, there is very strong WG consensus to
allow
> implementations to use more than one.
>
> Please see
> <http://www.imc.org/ietf-mta-filters/mail-archive/msg03601.html>
and
> <http://www.imc.org/ietf-mta-filters/mail-archive/msg03599.html>.
>
> Now, if you have another suggestion how to address your issue, I
would
> like to hear it.
I have given a simple and easy way to address it that seemed to also
work for the use cases that people told me about (I certainly admit I
don't know all the case).
If memory serves, your suggested way to prevent it was to allow at
most one
redirect operation to be done during the execution of a sieve script.
Well, what I have in my notes from the Prague meeting was "Told them
I was OK with SHOULD not redirect to more than one location outside
the domain". We may have different ideas about what domain means but
roughly I would mean administrative domain - certainly I would call
an enterprise or large bank one domain even though they might have
separate DNS domain names for different locations or something. You
might use the term "onsite" where I used domain.
And
you've been told that this is simply not acceptable - there are too
many use
cases for forwarding email to two different places at once.
Actually, I have received very little comment on if this is
acceptable or not. I have mostly received something that loosely
translated to "the WG disagrees with your discuss". I would be very
happy to talk about what my concerns are and have folks figure out if there
is a solution that addresses them and still meets the bulk of users'
needs.
In fact this is an
incredibly common thing for people to do for all sorts of
legitimate reasons,
including but not limited to:
I suspect what I had proposed meets all these use cases but don't
claim to really know. Glad to be educated.
(1) Making messages accessible in multiple ways, e.g. send one copy to
my email account and another to my pager. Or voice mail. Or
printer.
Or FAX.
(2) Secretaries who need copies of their boss's email.
(3) Maintaining multiple mailboxes during a service transition
period. Or
two pagers.
(4) Making backup copies of mail to address reliability issues (IMO
this is
not the best way to solve this problem but it is commonly done
this way
regardless of what I think).
(5) Making copies of stuff for law enforcement use (garden variety
forwarding is a piss-poor way to implement this but it is
nevertheless
something people do fairly often).
(uh, yah, laughing about how well this will meet some US Law
enforcement requirements :-)
(6) Vacation-time forwarding of important messages to the "next
tier" of
handlers.
We have two large banks as customers, both with thousands of
employees. The
American one is very well known - anyone who isn't a complete
hermit is
guaranteed to be familiar with it. They have a very elaborate email
setup that
routes mail through various analysis, categorization, and
compliance facilities.
The outcomes of these operations are then assessed with various
sieve scripts,
which then use redirect and various other tools to alter message
routing. The
ability to redirect to multiple addresses plays an important role
in all this.
The other bank is European (and may be a household name there for
all I know)
and their setup is even more relevant. They have basically used
Sieve to
implement an elaborate workflow system. Some of their scripts use
redirect
10-20 times on a single message. The most interesting aspect of
their setup is
that they have intentionally created loops all over the place. They
use various
checks, some explicitly done in Sieve, some autogenerated in Sieve,
and others
external to Sieve, to break the loops at exactly the right points.
This usage
is way past what, say, .forward files can do. The proper
functioning of this
setup is so important to them that no less than the CEO of the bank
has been
known to "drop in" on technical discussions of various
implementation details.
(And this is not hearsay - I'm speaking from direct personal
experience.)
I could go on and give many more examples of use cases but
hopefully you've
gotten the idea by now. The ability to autoforward to multiple
addresses is an
essential feature - it has been part of modern email effectively from
the beginning
and without it Sieve simply doesn't have feature parity with other
message
filtering facilities. In our case if we were to remove it our
customers would
simply find a different vendor that doesn't impose such limits. As
for setting
the default limit on redirect to 1, all that would accomplish for
us is to keep
us all busy dealing with the resulting support calls.
I'm not surprised - this all seems very consistent with my
discussions with these types of folks. I assume you also have the
discussions with the people like Yahoo, MSN, gmail, around the
policies they put in place to stop their servers from being used in
DOS attacks? I know I have been dragged into a few of these - can you
provide some insight around policies there?
I understand that different things are used in different times - the
goal of this document should be to produce something where the
Security considerations are good enough not to cause serious harm.
However, let me be very clear - it is not
my job to find a solution that the WG likes to this.
It is, however, your job to impose reasonable requirements. And IMO
you are not
being reasonable here.
The solution would be discussion around understanding a set of
reasonable requirements and feasible designs.
I believe that
the IETF has clear consensus that it does not want to deploy
technology that could trivially be used for large scale DOS and
message amplification attacks with very little safeguards or
traceable ways of dealing with this.
This presupposes that there's a real risk here and that the necessary
safeguards are not in place. I don't believe either of these are true.
So far no one has sent me any information arguing these are not true.
The fact is Sieve and numerous other autoforwarding mechanisms are
already
widely deployed without the limits you think are necessary and
curiously all of
these serious flaws you see aren't being exploited.
I dislike trotting out our own deployment statistics but it seems
now is the
time for it. According to various sources our product provides service
for well
over 100 million mailboxes. A significant fraction of this total
provides end
user access to set up sieves and our current default is to allow up
to 32
redirects per sieve. As I pointed out in my original response, the
number of
customers we've had that have reported problems with sieve redirect
and have
needed to impose limits is exactly zero. And believe me when I say
that our
customer base includes lots of people who aren't exactly shy about
reporting
the least little thing that goes wrong.
Now my current judgement call is
that this is in that category and could cause harm to the internet.
But we're not dealing with a new protocol with zero operational
experience here
where such judgement calls are our only means of assessing possible
risk.
Rather, we're not only dealing with a protocol that has significant
deployment
and operational experience, we're talking about a particular
feature that's
been an integral part of email for decades and is accessible in
all sorts of
ways besides Sieve.
Sure - and clearly if it is not a problem, it is stopped somehow, I'm
asking the Security section to provide some advice about how this is
all stopped.
You can convince me I am wrong about that - or you could find a way
of mitigating and reducing this risk - I can think of several and I
have suggested the one that I think is most likely to be acceptable
to the WG
Given that limiting redirects to 1 is totally unacceptable I have
no idea
what you are referring to here.
but I make no presumption that I would know what is the
best solution for this WG on this problem. I have made it clear what
the issue is, why I think a change is needed, this should not be
hard.
I don't think it should be hard either, but the talking at cross
purposes
continues.
So let me try one last time to cut through the misunderstandings.
First, nobody is claiming that autoforwarders cannot be used as an
amplifying
component in various sorts of attacks. They can be used this way.
This was true
for email decades before Sieve came along and it will continue to
be true no
matter what we do in this document.
yet somehow widespread message amplification attacks from email are
mitigated to a currently acceptable level.
Second, nobody is claiming that script analysis can be used to
prevent people
from abusing Sieve as part of such attacks. It simply cannot be done.
agreed - let's make sure the draft does not suggest people do it
So, since such attacks are inherently possible in the present email
infrastructure no matter what we do or don't do in Sieve, the focus
has to be
on detecting and stopping attacks. This is done by:
(1) Making sure infinite loops can be detected and stopped. The
main risk is
that without comprehensive loop detection you can and will end
up with
exponentially growing message loops. I've seen such situations
arise and create literally 10s of millions of messages in a very
short
period of time.
Without a true loop the amplification potential of an
autoforwarder web is
bounded by the number of autoforwarders in the web multiplied by the
number of
redirects that are allowed per sieve. So the only way such a web
can be used
effectively is if you have the means to inject an endless stream
of messages
into it at some point. And that brings us to:
(2) Rate limit message submission. If true loops aren't possible
something
somewhere has to act to generate the "signal" that is
"amplified". This
sort of behavior can be detected and blocked.
(3) Administrative controls. The basic rule is don't allow users to
access
functionality they don't need. If there's never a need to use
redirect,
disable it entirely. If they only need to be able to do one
redirect
per sieve, only allow that. If they only need to be able to
redirect to
other users onsite, only allow that. If a class of messages
exists that
aren't supposed to be redirected, check for them and yell if a
redirect
is done. And so on. The list of possibilities here is very long.
(4) Auditing and tracking. The aforementioned European bank has a
requirement
that every message carry a complete history of the redirection
that was
performed. They also require that all of this be derivable from
the logs.
And many of our other customers have similar requirements.
More generally, the days where email systems can get away
without producing
comprehensive logs and audit trails ended a long time ago. I
have no idea
what you're basing your assertions that such attacks would not
be traceable
in modern email systems, but it runs absolutely contrary to all
of my
recent experience.
Certainly agree with you in enterprise case but in a public provider
such as yahoo, sure they have logs but they may not have any way of
tying that to an actual human associated with the account.
Now, I will admit that the document doesn't cover a lot of this.
But there's a
reason for that: Autoforwarding to multiple addresses exists independent
of Sieve
and as such is a general architectural issue for email, not one
specific to
Sieve. It would be great if there was a document comparable to RFC
3834 for
autoforwarders but there isn't, and it is not within scope for this
WG to
produce such a specification.
I really liked the above analysis - it's the most information I have
received on this thread since I first read this document. Thank you.
So what can we do? Well, I guess one option would be to try and make
the discussion of this a little more explicit in the document. How
about
adding something like this to the security considerations?
This is great - few bits inline below...
Allowing a single script to redirect to multiple destinations
can be
used as a means of amplifying the number of messages in an attack.
Moreover, if loop detection is not properly implemented it may
be possible
to set up exponentially growing message loops. Accordingly, Sieve
implementations:
(1) MUST implement facilities to detect and break message loops.
See
RFC 2821 section 6.2 for additional information on basic loop
detection strategies.
My discuss did not ask for this but it certainly seems like a good idea.
(2) MUST provide the means for administrators to limit the
ability of
users to abuse redirect. In particular, it MUST be possible to
limit the number of redirects a script can perform.
Additionally,
if no use cases exist for using redirect to multiple
destinations,
this limit SHOULD be set to 1. Additional limits, such
as the ability to restrict redirect to local users MAY also be
implemented.
We are very close here. If you changed the "Additionally if no use
cases exist for using redirect to multiple destinations this
limit SHOULD be set to 1." to "Scripts SHOULD be limited to at most
one redirect that is not 'onsite'". And define onsite somewhere. I'd
point out that this is a SHOULD not a MUST and that it could be
ignored if people understood the security implications of what they
were about to do. This could also possibly be coupled with idea after
next point to explicitly allow multiple redirects in some cases.
(3) MUST provide facilities to log use of redirect in order to
facilitate
tracking down abuse.
I would ask if this means that they MUST know what human the script
was associated with. If yes, it is way beyond what I am asking for
and if no then hard to see how it helps. When I mentioned months ago
I could imagine many ways to solve this, coupling this type of thing
to the ability to do more than one redirect was one of the things
that seemed like a possible solution (and corresponds to my
understanding of at least some current deployments).
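To make the limits under discussion concrete, here is a minimal Python sketch that combines an administrator-set cap on redirects per script, a separate cap on redirects that are not "onsite", and logging of every redirect decision. All names here (RedirectPolicy, apply_redirects, the logger name) are hypothetical. The default of 32 is the product default mentioned earlier in this thread, and the offsite limit of 1 is the variant suggested above. Silently refusing the excess redirects is an assumption of this sketch; an implementation might instead treat it as a script error.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("sieve.redirect")  # hypothetical logger name


@dataclass
class RedirectPolicy:
    max_redirects: int = 32      # total redirects allowed per script run
    max_offsite: int = 1         # redirects allowed outside the onsite domains
    onsite_domains: frozenset = frozenset()


def apply_redirects(script_owner: str, targets: list, policy: RedirectPolicy) -> list:
    """Return the redirect targets the policy allows, logging each decision."""
    allowed = []
    offsite_used = 0
    for address in targets:
        domain = address.rsplit("@", 1)[-1].lower()
        is_offsite = domain not in policy.onsite_domains
        if len(allowed) >= policy.max_redirects:
            log.warning("redirect refused (total limit): %s -> %s", script_owner, address)
            continue
        if is_offsite and offsite_used >= policy.max_offsite:
            log.warning("redirect refused (offsite limit): %s -> %s", script_owner, address)
            continue
        if is_offsite:
            offsite_used += 1
        allowed.append(address)
        log.info("redirect allowed: %s -> %s", script_owner, address)
    return allowed
```

Note that the log records only the account that owns the script. Whether that account can be tied to a particular human is a separate question, as raised above.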
I didn't include rate limiting on the list for several reasons: (1)
It's hard
to get right and naive attempts to implement it can be very
dangerous. (2)
There's no consensus on what best practices for it are and hence
any discussion
of it is likely to rathole.
agree
I would also suggest getting rid of the discussion about sanity-
checking since
this discussion seems to indicate it is open to serious
misinterpretation.
agree
Does any of this work for you? If it doesn't I'm frankly out of
ideas for how
to resolve this.
This seems extremely close to fully resolving it. I think your
suggestions are great and glad to have a dialog that suggests
something to solve the issue.
Ned