Re: [openpgp] AEAD Chunk Size

The main question here is: What should a conforming application look like?

The current behaviour of GnuPG is that it processes unauthenticated
plaintext internally (e.g., in the decompression and signature
verification layers) and also outputs it externally.  If an AEAD chunk is modified by
an attacker, GnuPG will detect the modification and cancel the
operation, but only at the end of each chunk.  Due to the asynchronous
buffer management in GnuPG, quite often some part of the modified chunk
has then already been processed and output, depending on the particular
state of the buffers, the buffer size and the chunk size.  This
behaviour increases the surface for chosen ciphertext attacks and
possibly adaptive chosen plaintext attacks (if an oracle is exposed).
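
To make the alternative concrete, here is a minimal sketch of a consumer
that buffers one chunk at a time and releases it only after its tag has
been checked.  It is an illustration under stated assumptions, not how
GnuPG or any other implementation works: HMAC-SHA256 stands in for a
real AEAD mode, nothing is actually encrypted, and the names
seal_chunks and release_verified are hypothetical.

```python
import hashlib
import hmac


def _tag(key, index, chunk):
    # Stand-in for an AEAD tag: binds the chunk to its position in the stream.
    return hmac.new(key, index.to_bytes(8, "big") + chunk, hashlib.sha256).digest()


def seal_chunks(key, data, chunk_size):
    """Split data into chunks and pair each chunk with its tag."""
    sealed = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        sealed.append((chunk, _tag(key, index, chunk)))
    return sealed


def release_verified(key, sealed):
    """Yield a chunk only after its tag has been verified."""
    for index, (chunk, tag) in enumerate(sealed):
        if not hmac.compare_digest(tag, _tag(key, index, chunk)):
            raise ValueError(f"chunk {index} failed authentication")
        # Only now does any byte of this chunk leave the function.
        yield chunk
```

With this structure the memory needed before release is one chunk, so
the chunk size chosen by the producer directly bounds what the consumer
must buffer.  The sketch does not include a final tag over the whole
stream, so it does not detect truncation at a chunk boundary.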

The criticism here is mainly targeted at the following reasoning: If the
AEAD chunk size is not bounded to a reasonable length, any conforming
implementation will have to implement a mode of operation that is
similar to the one currently implemented in GnuPG.  As a consequence, to
decrease development cost, most implementations will _only_ have this
one implementation (as opposed to two implementations, one for large
chunks and one for reasonably short chunks).

This assumption is strengthened by the observation that GnuPG is
de facto norm-setting, due to its popularity and the lack of
standardization activity.

Again: What should a conforming implementation look like?  The current
draft proposal will lead, I think, to a world where most conforming
OpenPGP implementations will never benefit from the non-malleability
properties of AEAD cipher modes.  A proposal restricting the chunk
length to a reasonable value will lead, I hope, to a world where most
OpenPGP implementations will benefit from the non-malleability of AEAD.

There are many other areas where the OpenPGP standard does not specify
reasonable lengths, but implementations do impose such restrictions.  In
my opinion, this is a mistake rather than a precedent to follow.

Thanks,
Marcus


On 3/28/19 10:27 PM, Jon Callas wrote:

On Mar 28, 2019, at 5:30 AM, Justus Winter 
<justuswinter(_at_)gmail(_dot_)com> wrote:


[…]

In the context of processing OpenPGP data, currently there is no
relation between the size of the encrypted message and the size of the
decrypted message.  This is due to compression.


This isn’t precisely true. Certainly, compression is the biggest factor here, 
but it is not the only one. There are many factors that make it hard to know 
the finished size of an OpenPGP message a priori. These include ASCII armor, 
TEXT mode plaintext, and others. It only gets worse inside the plaintext 
where there is typically in emails quoted-printable, further base64, both, 
and other bits of brain damage caused by the accretion of many things that 
were good ideas at the time.


For me, using an unbounded amount of memory is not an option for a
component processing OpenPGP data if we want to build robust systems
on top.


Okay, prior to this working group, when there was running code without a 
consensus, rough or not, this problem existed. Even with compression, PGP 2 
ran on DOS machines with a max of 640K of RAM. There are many similarly 
constrained systems that run OpenPGP implementations.


Therefore, we need to process OpenPGP data in bounded space.  Since
there can be no relation between encrypted and decrypted message size
due to compression, the only option I see is to provide a streaming
API, which lets us process data in constant space.


I’m not quite sure what you mean when you say “bounded space” because one 
interpretation of that is obviously false. OpenPGP has always supported being 
able to process messages where the encryptor does not know the size ahead of 
time. That’s why we have indeterminate lengths and chunking.

I presume you mean that the implementation has to have constraints on its 
resources. This is certainly true; there are no unconstrained systems. It’s 
also true that there are going to be messages that your implementation can’t 
process well. For example, RFC 4880 allows a partial body length (a chunk) to 
be 2^30 octets, and that could be irritating to handle.
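
For reference, the partial body length arithmetic from RFC 4880, 
section 4.2.2.4 can be sketched as follows. The function name is 
illustrative only; the octet range and the shift follow the RFC.

```python
def partial_body_length(octet):
    """Decode a one-octet Partial Body Length header (RFC 4880, 4.2.2.4)."""
    # Partial body length octets are >= 224 and < 255.
    if not 224 <= octet <= 254:
        raise ValueError("not a partial body length octet")
    # The low five bits are the exponent of a power of two.
    return 1 << (octet & 0x1F)


# The largest partial body is 2^30 octets (1 GiB).
largest = partial_body_length(254)
# The first partial body of a packet must be at least 512 (2^9) octets.
smallest_first = partial_body_length(233)
```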

One of the (perhaps unstated) goals of OpenPGP is that it allows for highly 
constrained implementations. This was a huge consideration in both 2440 and 
4880. There are things that are designed the way they are because the working 
group felt strongly that things have to be one-pass. There were many debates 
about the MDC that boil down to it (and this is also the reason why HMAC 
wasn’t used, but there is more history there, including that while HMAC 
existed when the MDC was designed, we did not yet have a proof of security 
for it).


[Now, when I say constant space, implementations could still decide to
use, say, 30 megabytes of buffer space.  Then, most emails will fit
into this buffer, and we can detect truncated messages before we hand
out one byte to the downstream application.  This is what we do in
Sequoia.  Note, however, that the consumer decides how much data to
buffer before releasing the first data, and not the producer.  If we
decide to even allow 128 megabyte chunks, then the producer can
*force* the consumer to allocate 128 megabytes, or either not process
the message or do it unsafely.]


Or the consumer could return an error and say it can’t decode it.
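
A minimal sketch of that option, assuming the consumer learns the 
declared chunk size before it allocates anything. The limit, the 
exception and the function name are hypothetical, not taken from any 
implementation or from the draft.

```python
MAX_CHUNK_BYTES = 30 * 1024 * 1024  # hypothetical local limit, chosen by the consumer


class ChunkTooLarge(Exception):
    """Raised when a message declares a chunk larger than the local limit."""


def check_chunk_size(declared_size, limit=MAX_CHUNK_BYTES):
    """Refuse a chunk before allocating a buffer for it."""
    if declared_size > limit:
        raise ChunkTooLarge(
            f"chunk of {declared_size} octets exceeds local limit of {limit}"
        )
    return declared_size
```

A 128 megabyte chunk would be rejected by this consumer, which is the 
interoperability cost of leaving the upper bound to each implementation.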


Now, as efail demonstrated, we need to protect against ciphertext
modifications, and we need to do it in a way that does not bring back
the problems with requiring unbounded space that we're trying to
address with streaming in the first place.


Efail is primarily a problem with MIME encoding and layering violations. It 
works just as much with S/MIME as OpenPGP. Perhaps I’m missing something, but 
I don’t see how Efail is relevant to resource bounds.


Therefore, we need to use chunking and authenticate message prefixes.
We need to use chunks that are reasonably small, and this size should
preferably not be configurable.  We should consider performance
constraints and pick one suitable size.  Configurable chunk sizes
bring complexity and increase the attack surface, as was pointed out
in this thread.


I’m with you on a lot of this, but I don’t know what you mean by 
“configurable”? Do you mean that there should be one chunk size only? If so, 
what size do you propose? 32 Meg or thereabouts (2^25 is in that ball park)? 
If so, would that mean that all messages smaller than your chunk size would 
be a single chunk?

The only argument for a configurable chunk size that came out of this
thread is to be able to fit the entire message into one chunk.


That’s not the way I understand the discussion. The way I understand it, 
there are people who desire to have single-chunk messages of a rather large 
size. At present, the non-AEAD chunks can be any power of 2 up to 2^30 (but 
the first one has to be at least 2^9). I don’t see the request for variable 
(is that the same thing as configurable?) chunk sizes to be anything other 
than the analogue of the present situation.


I appreciate the desire to protect against truncation.  But,
truncation is pretty common when we transmit data, so I'd argue that
application developers are more likely to expect and gracefully deal
with truncated data than with ciphertext being manipulated or the PGP
implementation consuming unbounded amounts of memory.


Does this mean that you think that message truncation is an error that 
OpenPGP doesn’t need to guard against?

That’s the way I interpret the first line in the paragraph above (“I 
appreciate … But,…”). If so, that’s counter to the long-standing consensus of 
the working group. It’s the whole reason we have MDCs and the reason why they 
were aggressively pushed in the implementations and non-MDC packets 
browbeaten into doing MDCs. See the non-normative discussion in section 5.13 
of 4880.


Now, you may say that even if the PGP implementation doesn't buffer
the plaintext, the downstream consumer must buffer it in order to
detect truncation.  But that is not always true.  As pointed out in
this thread, you can use some kind of transaction scheme to only
commit data once it has been confirmed to be not truncated.
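
One possible shape of such a transaction scheme, as a sketch only: the
plaintext goes to a temporary file and is renamed into place only if the
chunk source finishes without raising.  The function name is
hypothetical, and the sketch assumes the source raises an exception on
truncation or authentication failure.

```python
import os
import tempfile


def write_transactionally(path, chunks):
    """Write chunks to path, committing only if the source never raises."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
        # Commit point: the data becomes visible under its final name.
        os.replace(tmp_path, path)
    except BaseException:
        # Roll back: discard whatever was written so far.
        os.unlink(tmp_path)
        raise
```

The downstream consumer then never sees a partially written file, at the
cost of disk space instead of memory.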


I think I understand. Are you noting that because of the one-pass nature of 
OpenPGP, it’s possible to process arbitrary amounts of data and not know that 
there’s an error until the end? This is certainly true of MDCs, because of 
the one-pass desire. If you make an implementation that has AEAD chunks, it’s 
possible that you could be processing correct chunks for an indefinite amount 
of time, and then get an AEAD failure that calls into question the integrity 
of the whole stream that led up to that.

Is that what you’re pointing out?



I have implemented AEAD in Sequoia, and I have evaluated the
implementations in GnuPG and RNP.  Every implementation is either
unsafe, not robust, or does not implement the proposal.


Tell us more. What problems did you find?


What is proposed in RFC4880-bis06 can not be implemented safely.  If the
working group produces a standard that cannot be implemented safely, I
consider that a grave failure of the standardization effort.


Okay, you’ve lost me.

What can’t be implemented safely and why?

In my reading of this, I think I have identified two points you’re making.

(1) It’s possible for a chunk to be larger than reasonable processing 
resources.
(2) It’s possible for a long stream to have an error in the last chunk that 
signals an error wayyyyyy in the past.

Handling (1) is reasonably easy. Return an error. This situation exists 
today. It’s possible to make partial bodies of a gigabyte each, and an 
implementation may not be able to handle that. Return an error.

Handling (2) is also easy: you return an error. This might be unsatisfying, 
because the error might be in the past, and lots of stuff already handled. Is 
this your objection?

      Jon

_______________________________________________
openpgp mailing list
openpgp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/openpgp


-- 
Dipl.-Math. Marcus Brinkmann

Lehrstuhl für Netz- und Datensicherheit
Ruhr Universität Bochum
Universitätsstr. 150, Geb. ID 2/461
D-44780 Bochum

Telefon: +49 (0) 234 / 32-25030
http://www.nds.rub.de/chair/people/mbrinkmann
