My apologies for being only an occasional participant in this thread (and
it will likely take me another week before I can reply again), but there
are a few points I would like to make.
On Sat, Mar 30, 2019 at 02:17:55AM +0000, Bart Butler wrote:
> Hi Jon,
>
> As others have noted, there is a lot of confusion on this thread, some of
> which you touched in your AEAD Conundrum message, like when we say AEAD
> should not release unauthenticated plaintext, do we mean the entire message
> or the chunk?
It's really quite something to have gone through a week's worth all in one
go. There are many people writing out careful descriptions of how they see
things, and yet we still seem to be talking past each other at times.
I propose that we use "plaintext corresponding to non-modified ciphertext"
for the non-malleability protection that is provided by an AEAD
authentication tag on a single chunk, and "fully authenticated complete
plaintext" for the output after processing an entire message (i.e., all
chunks) with guarantee of non-truncation. (Are there other cases in
between that we care about?)
> Another piece of confusion is that Efail isn't a single vulnerability, it was
> several vulnerabilities related (at best) thematically.
>
> So to be very specific, for the purpose of the following discussion, the
> advantage of smaller AEAD chunks is specifically to prevent Efail-style
> ciphertext malleability/gadget attacks, and the prohibition on releasing
> unauthenticated plaintext is applied to individual chunks, which is
> sufficient to foil this kind of attack in email.
>
> The kind of attack we are talking about is fundamentally about exfiltration
> of plaintext data to an attacker-controlled endpoint. Borrowing from your
> AEAD Conundrum message, if the first chunk passes and is released, and the
> second chunk fails, that is OK, at least for email, because the part that was
> modified (the second chunk) is never released, so you get a truncated message
> and an error, but the truncated message without the modifications isn't going
> to exfiltrate itself.
One concern that I have (and is only tangentially related to this quoted
part) is that I want to make it easy for implementations to "do the right
thing" when ciphertext is modified, i.e., return an error, and specifically
to return an error without releasing any plaintext that originates from the
modified ciphertext. The current openpgp ecosystem does not seem to be
very compliant with that desired behavior, and part of that may be due to a
lack of philosophical support/help from the spec.
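To make that desired behavior concrete, here is a rough sketch of a
chunk-at-a-time decrypter that only releases a chunk's plaintext after
that chunk has authenticated, and that reports truncation at the end.
Everything in it is an assumption made for illustration: the names
(encrypt_chunks, decrypt_stream, ModifiedCiphertext) are hypothetical,
and the "cipher" is a toy encrypt-then-MAC construction built from
SHA-256 and HMAC so that the example runs with only the standard
library. It is not a real AEAD mode and not the OpenPGP wire format.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size


class ModifiedCiphertext(Exception):
    """Raised when a chunk or the final marker fails authentication."""


def _keystream(key, index, length):
    # Toy keystream (illustration only): SHA-256 in counter mode,
    # domain-separated per chunk index.
    out = b""
    counter = 0
    while len(out) < length:
        block = key + b"ks" + index.to_bytes(8, "big")
        out += hashlib.sha256(block + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def _tag(key, index, final, body):
    # The tag binds the chunk index and a "final" flag, so reordered
    # chunks and a missing end-of-message marker are both detectable.
    flag = b"\x01" if final else b"\x00"
    msg = b"tag" + index.to_bytes(8, "big") + flag + body
    return hmac.new(key, msg, hashlib.sha256).digest()


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_chunks(key, chunks):
    sealed = []
    for i, pt in enumerate(chunks):
        ct = _xor(pt, _keystream(key, i, len(pt)))
        sealed.append(ct + _tag(key, i, False, ct))
    # Final marker: an empty body authenticated with the final flag set.
    sealed.append(_tag(key, len(sealed), True, b""))
    return sealed


def decrypt_stream(key, sealed):
    """Yield plaintext chunk by chunk; raise on the first bad chunk."""
    saw_final = False
    for i, blob in enumerate(sealed):
        if saw_final:
            raise ModifiedCiphertext("data after final marker")
        ct, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
        if not ct and hmac.compare_digest(tag, _tag(key, i, True, b"")):
            saw_final = True
            continue
        if not hmac.compare_digest(tag, _tag(key, i, False, ct)):
            # Nothing derived from this chunk is ever released.
            raise ModifiedCiphertext(f"chunk {i} failed authentication")
        yield _xor(ct, _keystream(key, i, len(ct)))
    if not saw_final:
        raise ModifiedCiphertext("message truncated")
```

A caller that streams gets the earlier, unmodified chunks followed by an
error; a caller that needs the complete message can buffer the output and
discard it if the generator raises.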
> Now if releasing ANY authenticated chunk of a message that hasn't been fully
> authenticated (in an AEAD sense) is a real problem for your application, I'd
> argue that you're trying to make AEAD do something it's not suited for and
> you should enforce this in your application if it applies to you, probably by
> not streaming.
>
> So to recap, small-chunk AEAD provides specific value in preventing
> ciphertext malleability/gadget attacks, particularly in HTML email, which is
> a common use case.
>
> What value does large-chunk AEAD actually provide? What I'm getting from the
> AEAD Conundrum message is that it's a way for the message encrypter to
> leverage the "don't release unauthenticated chunks" prohibition to force the
> decrypter to decrypt the whole message before releasing anything. Why do we
> want to give the message creator this kind of power? Why should the message
> creator be given the choice to force her recipient to either decrypt the
> entire message before release or be less safe than she would have been with
> smaller chunks?
>
> Coming back to Neal's point, it's really hard to see any sort of value in
> really large AEAD chunks, because the performance overhead is negligible at
> that point and the only security 'benefit' that I can see is the encrypter
> trying to use the spec to force the decrypter to not stream, which does not
> seem like something at all desirable.
I'm still not sure I understand the point of very large chunks, since once
they get really big an implementation is choosing between streaming
plaintext from potentially modified ciphertext and returning an error
without even attempting to process the chunk. I'm not convinced that the
second will win out in implementations if we allow very large chunks.
Some other notes, not relating to anything specifically quoted from this
message (but derived from other parts of the thread):
TLS allows for arbitrarily variable-length chunks because it is
a synchronous transport for higher-level application streams and the
application may have arbitrary message sizes. OpenPGP is used in an
asynchronous model, where a message generator can be modelled to make all
its actions before the receiver processes anything, and there is only
one-directional communication within the OpenPGP format. So there does not
seem to be much demand for "take all the bytes that you have so far and
send them right now", and AFAICT the message generator can just wait until
end of data arrives or enough data to make a complete chunk arrives. So
from that point of view, there is not much argument in favor of varying
the chunk size within a single message, and possibly even across messages
(i.e., this line of reasoning would be okay with a single chunk size fixed
for everyone as a protocol constant). There are of course other factors
that may come into play, like constrained systems and such, but we can
treat those separately.
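As a small illustration of the "just wait until a complete chunk or end
of data arrives" behavior, here is a sketch of a generator-side buffer.
The function name and the idea of passing the chunk size as a parameter
are assumptions for the example only; nothing here comes from the spec.

```python
def fixed_size_chunks(fragments, chunk_size):
    """Re-chunk arbitrary input fragments into fixed-size chunks.

    Every chunk except possibly the last has exactly chunk_size bytes;
    the last one carries whatever remains at end of data.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    buf = bytearray()
    for frag in fragments:
        buf.extend(frag)
        # Emit only complete chunks; keep the remainder buffered.
        while len(buf) >= chunk_size:
            yield bytes(buf[:chunk_size])
            del buf[:chunk_size]
    if buf:
        # End of data: flush the final, possibly short, chunk.
        yield bytes(buf)
```

The generator never has to emit a short chunk in the middle of a message,
which is the property that makes a single fixed chunk size workable in
the one-directional model described above.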
I also have a use case for authentication of large chunks of data at rest:
they allow me to use a cheap bulk storage service that provides
(best-effort) replication and archiving but has poor physical security. So
I encrypt my data to myself and put it in storage, but when I get it back
I need to know that it's valid. I can imagine at least one case where
knowing exactly which chunk was corrupted would save effort; it may be a
toy example but perhaps it is illustrative of a broader case. Note that
there are algorithms to compute pi to arbitrary precision, and even to
compute the Nth digit thereof without computing the previous digits. If I
need to have random-access inquiries into the value of pi, I could
precompute using software I trust and do this self-encryption thing, and
when a chunk is bad I can recompute only that chunk and still trust that I
only ever use values generated by my trusted implementation.
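A sketch of that repair loop, covering only the integrity half of the
story (encryption is left out for brevity). All names are hypothetical,
compute_chunk is a cheap stand-in for the trusted, expensive computation,
and a per-chunk HMAC tag stands in for the AEAD authentication tag.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size


def compute_chunk(index):
    # Hypothetical stand-in for the trusted computation (for example,
    # one block of digits); deterministic so it can be redone on demand.
    return hashlib.sha256(b"trusted-output" + index.to_bytes(8, "big")).digest()


def seal(key, index, data):
    # Bind the chunk index into the tag so chunks cannot be swapped.
    msg = index.to_bytes(8, "big") + data
    return data + hmac.new(key, msg, hashlib.sha256).digest()


def is_valid(key, index, blob):
    data, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    msg = index.to_bytes(8, "big") + data
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


def repair(key, stored):
    """Recompute only the chunks that fail; return their indices."""
    bad = [i for i, blob in enumerate(stored) if not is_valid(key, i, blob)]
    for i in bad:
        stored[i] = seal(key, i, compute_chunk(i))
    return bad
```

With one tag over the whole archive, a single flipped bit would force
recomputing everything; with per-chunk tags the cost is proportional to
the number of damaged chunks.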
And finally, there is no openpgp Working Group; all we have here is a bunch
of folks interested in a topic talking amongst each other on a public
mailing list hosted at the IETF. There are no WG chairs and no expectation
of Area Director supervision (i.e., I don't feel obligated to read the
messages here). That said, I'm happy to see that we're staying calm and
civil, and AFAICT everyone is honestly trying to understand everyone else's
position and come to a consensus. Let's try to keep focusing on the
technical details and what use cases we need to cover.
Thanks,
Ben