Re: [openpgp] AEAD Chunk Size

Hi Neal,

On Mon, 2019-03-18 at 22:03 +0100, Neal H. Walfield wrote:

On Mon, 18 Mar 2019 20:51:32 +0100,
Tobias Mueller wrote:

On Mon, 2019-03-18 at 11:53 +0100, Neal H. Walfield wrote:

For me, a plaintext is authenticated if the whole ciphertext could be
successfully authenticated. Which seems to be very well in line with
the definition you've linked to.


4880bis defines a chunking mechanism based on AEAD: the message is
split into multiple chunks.  In 4880bis, AEAD operates on a per-chunk
basis.  The chunking algorithm provides mechanisms for ensuring chunks
can't be reordered, detecting the end of the message, etc.  Using AEAD
to decrypt a chunk authenticates that chunk's ciphertext; for a given
chunk, the decryption operation will either return the correct
plaintext, or it will return an error.  This is exactly what RFC 5116
requires.


I beg to differ. Because, as you mention:

 RFC 5116 doesn't discuss chunking; chunking is not AEAD.


Chunking is not AEAD. It's a protocol on top of AEAD messages that you
have to come up with. And then you have to implement it correctly. The
security guarantees that AEAD gives you do not automatically apply to
your chunking scheme.
As you've said: Chunking is not AEAD. Hence, it cannot automatically be
in line with what RFC 5116 demands.


The chunks use AEAD.  So 5116 applies to the chunks.  That means, for
a given chunk, either authenticated plaintext is returned or failure.
I don't see a contradiction.


Let's carefully revisit RFC 5116. It starts with:

   There is a single output:

      A ciphertext C, which is at least as long as the plaintext, or

      an indication that the requested encryption operation could not be
      performed.


Note that there is *a single output* rather than several, and that the
definition doesn't allow for releasing partial plaintexts or
authenticated prefixes.
Do you see that any chunking protocol on top of it that allows
plaintext to be released early is not immediately covered by this
definition?

Let the RFC be crystal clear:

    In particular, partially encrypted or
    partially decrypted data MUST NOT be returned.

This contradicts handing out decrypted chunks, partial plaintexts,
authenticated prefixes, or whatever name you might want to give them, as
those are partial decryptions of the original message that was protected
in full with an AEAD scheme. If I meant to encode several individual
messages which have a right to be decrypted on their own, I might as
well have produced several individual messages.

You seem to think that AEAD's guarantees must apply to the whole
message.  I disagree.


I'm glad you're saying this.
And yes, I think that proper AE means that the full message enjoys the
security guarantees of AE. Also because I am not aware of definitions
covering partially authenticated plaintext.


TLS is used by a few billion people.  Let's just do what they do...

I don't fully understand how you're making the leap here from
definitions of AE to a transport layer protocol. But that's a very
interesting point.

First of all, be aware that TLS records have a variable size. That's
quite unlike what you're proposing.
Additionally, TLS is a transport layer protocol rather than a syntax
which people will use to archive their emails and backups, which OpenPGP
is to some.
Finally, TLS requires the application to be safe from truncation. From
the source you've referenced yourself
<https://crypto.stanford.edu/~dabo/cryptobook/BonehShoup_0_4.pdf>, on
the "Cookie Cutter" attack in § 9.8:
"TLS assumes that the application layer will defend against this
attack".
I think you can make that assumption for applications using OpenPGP.
But if you do, then stating it in the spec would be more honest about
the security OpenPGP provides than silently taking the one-chunk
option away without warning the applications that intend to use the
protocol.

The TLS record protocol needed to be changed a few times to respond to
newly discovered threats.  I am too humble to assume that the design of
the OpenPGP chunking protocol has been gotten right on the first
attempt. By removing the option to not use the chunking protocol, you
prevent a safe usage of the protocol in the future.

Do you still think we should blindly "just do what they do"? What would
that even mean?

The whole thread, starting here,
https://mailarchive.ietf.org/arch/msg/cfrg/-lj3IEm9agpfgUOb2yX9JNrtrm8
is an interesting read.  And, I think it generally supports my
position.

I thought your position was that chunking a message and applying AE to
those automatically preserves the security properties which we have
proofs for.  The position in the thread, as far as I can see, is that a
custom chunking protocol has unknown security properties.
I'm glad that this was a misunderstanding.

I further think getting as close to
proper AE as possible is a goal worth pursuing.


I think that if we accept your position, OpenPGP will have less
practical security.

AFAICS, there is no need to accept anything here. Right now, with
4880bis, it is possible to create a message with exactly one chunk. You
are proposing to remove that option.

Did you rather mean to say that "if we keep providing the option of
using proper AE for the messages then we will have less practical
security"?
And I would even agree to some extent.
But I'm not yet convinced that fully trusting the OpenPGP chunking
mechanism leads to a safer future than allowing clients to not rely on
the correct definition and implementation of that mechanism.


If you absolutely must stream, then there is no way that you can buffer
the whole message, otherwise you wouldn't stream.  I claim, however,
that in the vast majority of use-cases you don't have the requirement
of having to stream.  As in, purely from a functional perspective, not
from an implementation perspective.  Hence, imposing the concept of
streaming onto everybody somehow does not feel right.


Let's consider email, which is a common use case for OpenPGP.  I like
that K-9 just downloads the first few kilobytes of each message.  I
want K-9 to be able to continue to do that even when everyone is
encrypting their email.  For that, I need an authenticated prefix.

Just a quick reminder that nobody is about to take that option away.
The proposal at hand is to remove the option of not having to produce
several chunks.

Regarding email, let me cite the Efail paper:

   We think it is safe to turn off streaming in the email context
   because the size of email ciphertexts is limited and can be handled
   by modern computers.

Likewise, with a dropbox-like application, I'd like to be able to
preview the content.  Again, I need an authenticated prefix.

I don't object. The application could, probably based on the file type,
determine reasonably safe boundaries and encrypt those.
Assuming that the safe boundary for all files is the same seems
dangerous.

I want to be able to stream archives and backups.

I'm not so sure here. A  nc -l -p 1234 | unsafe_decrypt | tar -xf -  can
have severe consequences if partial plaintexts are involved.  I can
understand that some users enjoy the guarantees of AE, that is, either
the whole message decrypts or nothing does. With your proposal, the
user is at risk of the chunking algorithm not being sound, the
algorithm not being implemented correctly, or the application (tar, in
this case) not coping safely with a truncated message.
And truncation is the best case, that is, the case where we assume that
the implementation did everything right. We haven't yet considered a
minor flaw in the implementation that would allow, say, re-using
nonces, re-ordering of chunks, or changing the size of the chunks.
With your proposal, how would such a user construct an OpenPGP message
that is less risky?
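
To make the truncation concern concrete, here is a minimal sketch in
Python. Everything in it is an assumption made for illustration: the
toy AEAD (a SHA-256 keystream plus HMAC-SHA256), the scheme of binding
a chunk index and a "final" flag to each chunk, and all function names.
It is not the 4880bis construction or wire format, and the toy cipher
must not be used for anything real. It only shows that a streaming
consumer has already received an authenticated prefix by the time the
truncation is detected.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of an HMAC-SHA256 tag


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC keys from one key (toy KDF).
    return hashlib.sha256(label + key).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, nonce: bytes, plaintext: bytes, ad: bytes) -> bytes:
    """Toy AEAD encryption: returns ciphertext followed by the tag."""
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ad + ct, hashlib.sha256).digest()
    return ct + tag


def open_sealed(key: bytes, nonce: bytes, sealed: bytes, ad: bytes) -> bytes:
    """Toy AEAD decryption: returns the plaintext or raises ValueError."""
    ct, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ad + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ct))
    return bytes(c ^ s for c, s in zip(ct, stream))


def encrypt_chunks(key: bytes, message: bytes, chunk_size: int) -> list:
    """Seal each chunk with its index as nonce and a final flag as AD."""
    pieces = [message[i:i + chunk_size]
              for i in range(0, len(message), chunk_size)] or [b""]
    return [
        seal(key, i.to_bytes(8, "big"), piece,
             b"\x01" if i == len(pieces) - 1 else b"\x00")
        for i, piece in enumerate(pieces)
    ]


def decrypt_stream(key: bytes, chunks):
    """Yield each chunk's plaintext as soon as that chunk verifies."""
    it = iter(chunks)
    current = next(it, None)
    if current is None:
        raise ValueError("empty stream")
    index = 0
    while current is not None:
        following = next(it, None)
        # The last chunk we receive must have been sealed as final.
        flag = b"\x01" if following is None else b"\x00"
        yield open_sealed(key, index.to_bytes(8, "big"), current, flag)
        current, index = following, index + 1


key = b"k" * 32
message = b"entry-1;entry-2;entry-3;"
chunks = encrypt_chunks(key, message, 8)

# An attacker drops the last chunk. The consumer still gets a prefix.
released = []
try:
    for piece in decrypt_stream(key, chunks[:2]):
        released.append(piece)  # the application acts on this right away
    truncation_detected = False
except ValueError:
    truncation_detected = True
```

In this sketch the error is raised only after the first chunk has been
handed to the caller, so whatever the application did with that chunk
has already happened. Whether that is acceptable depends entirely on
the application, which is the point made above.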


Pretty much the only case that I can think of that chunking is not
useful is for verifying software updates.

I agree that chunking is useful more often than it's necessary.

I'd like to note, though, that it is possible to not reveal the
plaintext no matter how large the message is.  You can mask the
output you release, e.g. XOR it or apply CTR mode, and provide the key
to remove the mask only when the ciphertext has checked out correctly.


I don't see how that helps with streaming.  Basically any further
processing needs to wait until the whole message has been decrypted a
second time...

Of course. Implementations are free to provide a different API. I merely
presented an example of how one could expose the functionality of the
spec.
Anyway, if you desire a property P and that property depends on all the
bits of the ciphertext, you need to wait until you have gotten around to
processing all those bits. One could argue that these have been the
semantics ever since the MDC was introduced. Now with AEAD we get a
formal description of those semantics.
And it helps an OpenPGP library to support streaming, because it can
safely release that memory to the caller or update the message in-place.

From the proposal you made, it seems you think we should not even try
to provide a format for a non-streaming message.  Would you describe
that as correct?


We should provide exactly one variant.  Additional variants must
justify their existence, and I don't see the huge value add for
"proper AEAD".  In fact, it seems dangerous as I think it will
encourage decryption misuse.

Fair enough.
The value add is that we have proofs for the security properties of AEAD
encrypted messages. AFAICS, we don't have those for the custom chunking
protocol. I'll be happy to be told otherwise. Note that proven schemes
exist which have "intermediate tags", such as POET or COLM. Alas, none
of them is defined as an option for OpenPGP (yet?).
Another benefit is being able to not use the chunking once it turns out
that it's not good enough anymore.

So we know that using AE has security benefits, that libraries can
implement that safely, and that most use cases do not require messages
to be actually streamed or that they're better off defining partial
plaintexts for themselves. Given that applications can produce multiple
messages if they desire, the one-chunk option is strictly better than
the multiple-chunk option.  Removing the option of using one chunk and
*forcing* every consumer to use streaming seems misguided.

I think that even if we add a bit that says: "don't stream",
implementations will ignore it.


Hm. I'd classify this as a wilful violation of the spec rather than an
accident while implementing it.
Once you assume that implementations are doing things wilfully wrongly,
it gets messy.
I mean... where do we stop making compromises in the security of the
spec because we believe someone will wilfully ignore the spec? We rely
on the client not actively misinterpreting the spec. Like.. not making
secret key material available.


It's a simple question of: "can I use the software to get my work
done"?  If the security is in the way, then the security will be
disabled.  "Proper AEAD" will be in the way often enough that it will
get disabled, and decrease the security of the whole system.

I get your point. You don't want the security properties of AE. That's
fair. And you assume that implementations will actively misinterpret the
spec. While this is a bit pessimistic, it may indeed reflect real life.
Do you think that defining a non-streaming mode and a streaming mode
would work?  Then clients wouldn't need to feel bad about releasing
plaintext early, but can use the streaming mode instead.


Cheers,
  Tobi
