Re: [openpgp] AEAD Chunk Size

Hi Neal & others,
I tried to follow the discussion as closely as possible, but I probably
missed some of it.

Independent of chunk size, we get to the issue of large message transfer,
which in practice is widely used (large corporations, banks, media
companies, and so on).
The case of a small chunk size with a damaged last chunk is (almost) the
same as the case of a single large chunk: you output, say, 20 GB of data
and only then find out that it is broken. The same would happen with a
streamed transfer of large signed objects.

If the data is stored in a file, in memory, or in a database, you can
check it first and then decide whether it is OK, at the cost of two passes.
But when streaming large amounts of data that don't fit in memory there is
no choice: you still need to process it in smaller blocks, with no
guarantee that an error will not show up later. An underlying compression
layer or other known data structure comes in handy: it serves as an
additional layer of protection (not 100%, of course) against transmission
errors or tampering in transit.
In any case you first output a load of data and then must discard it.
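
A minimal sketch of the "output first, discard on failure" pattern,
assuming the output goes to a file: the plaintext is staged in a temporary
file and only renamed into place once the final check passes. The names
write_verified and verify are hypothetical, not taken from any
implementation.

```python
import os
import tempfile


def write_verified(dest_path, chunks, verify):
    """Stage chunks in a temp file; publish to dest_path only if verify() passes.

    `chunks` is any iterable of bytes; `verify` is a hypothetical callable
    that returns True once the whole stream has been checked.
    """
    dir_name = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)  # data hits the disk before the final check
        if not verify():
            raise ValueError("final integrity check failed")
        os.replace(tmp_path, dest_path)  # atomic publish on the same filesystem
    except BaseException:
        # Discard everything that was written so far.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This does not save the work of writing 20 GB and throwing it away; it only
keeps a consumer from seeing the data before the check has passed.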

So, my opinion is that we may leave things as they are, or just extend
them with recommendations based on performance, given that not all
implementations support AEAD yet, even partially.

On Mar 19, 2019, at 21:10, Neal H. Walfield <neal(_at_)walfield(_dot_)org> 
wrote:

Hi Derek,

Thanks for your analysis.  I think the AEAD chunking algorithm is
sound, if it is used correctly.

My issue is that I don't think it is possible to use the chunking
algorithm correctly for large chunk sizes.  For instance, what should
an implementation do if it encounters a chunk size of 16 TB (and there
really can be >16 TB of data using a small decompression bomb)?
Should it be allowed to emit unauthenticated plaintext?

Let's assume that we allow implementations to emit unauthenticated
plaintext for large chunk sizes.  An attacker who intercepts a message
can change the chunk size to be above this limit.  Now, clearly this
will fail the authentication check, but we just allowed
implementations to release the plaintext for large chunk sizes.  So,
if my analysis and experiments are correct, then using AEAD will never
provide ciphertext integrity for the first chunk against an active
attacker.
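
A minimal sketch of the two receiver policies discussed here. It assumes
the chunk size is encoded as a single octet c with size 2^(c + 6) bytes,
which is how I read the draft; the 1 GiB buffer limit and the function
names are hypothetical.

```python
# Sketch only: the size formula and both policies are assumptions made
# for illustration, not the behaviour of any particular implementation.

def chunk_size_bytes(c: int) -> int:
    """Chunk size for chunk-size octet c, assuming size = 2 ** (c + 6)."""
    return 1 << (c + 6)


def receiver_action(c: int, max_buffer: int, allow_unauthenticated: bool) -> str:
    """What a hypothetical receiver does with a chunk of the advertised size."""
    if chunk_size_bytes(c) <= max_buffer:
        return "buffer, verify tag, then emit"
    if allow_unauthenticated:
        # The chunk cannot be buffered, so plaintext leaves the
        # implementation before the tag has been checked.
        return "stream plaintext before tag is verified"
    return "fail"


MAX_BUFFER = 1 << 30  # hypothetical 1 GiB buffering limit

sender_octet = 10  # 2**16 bytes = 64 KiB, chosen by the sender
forged_octet = 38  # 2**44 bytes = 16 TiB, written into the header in transit
```

With allow_unauthenticated set, receiver_action(sender_octet, ...) takes
the buffering path but receiver_action(forged_octet, ...) takes the
streaming path, so the sender's choice of a small chunk size does not
protect the receiver.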

Let's assume that we don't allow implementations to ever emit
unauthenticated plaintext.  Well, if they encounter a large chunk that
they can't buffer, they MUST fail.  Ouch!  Users will work around
that and implementations will help.  (As I've pointed out, one already
has.  I don't know if it was an oversight.)

So, my conclusion is, we must prohibit implementations from emitting
unauthenticated plaintext *and* remove any incentives to do so.  For
me, this means a small, fixed chunk size.


Does this clarify my concerns?

Thanks!

:) Neal

At Tue, 19 Mar 2019 12:21:36 -0400,
Derek Atkins wrote:


Neal,

"Neal H. Walfield" <neal(_at_)walfield(_dot_)org> writes:

Well, whatever :).  As I understand Werner, he agrees that ciphertext
integrity while streaming is desirable, and, I guess, thinks that is
achievable with the current draft.  I don't think it is possible, and
I've raised a few concerns in my prior mails.  I hope Werner can
address those concerns.


I admit that I have not studied the current draft closely, so my
comments might be out of whack, but my expectation is that the AEAD and
Chunking are done hand-in-hand.  The chunk metadata (number/size/etc)
should be protected by the AEAD.

So let's say you have a message that gets processed as:

[Chunk 1] [Chunk 2] [Chunk 3/final]

In the normal case, the receiver will process Chunk 1, then Chunk 2, and
then Chunk 3, and depending on the implementation MAY emit the Chunk 1
plaintext (which HAS BEEN AUTHENTICATED) before it processes Chunk 2
and/or Chunk 3.

What can an attacker do?

* If they modify the chunk size, the AEAD on that chunk will fail and
 therefore the data should not be emitted.  Note that this COULD cause
 the receiver to emit Chunk 1 and then "stop".

* If they cause Chunk 3 to disappear, they could get the receiver to emit
 Chunks 1 and 2 and then stop.  But again, Chunk 1 and Chunk 2 are
 still authenticated.

* If they cause Chunk 2 to disappear, the numbering will be out of order
 and the receiver can emit an error.

* They can insert fake chunks.  Let's say they cause the following:
 [Chunk 1] [Chunk 2'] [Chunk 2] [Chunk 3'] [Chunk 3/final]

 In this case, the attacker is inserting Chunk 2' and Chunk 3',
 claiming to be chunks 2 and 3 respectively.  So the question is, what
 happens here?  Well, the chunking here is expecting encryption (at
 least I think it should be expecting encryption).  So they would
 authenticate Chunk 1 (and possibly emit it), but when processing Chunk
 2' they would fail on the AEAD because the attacker doesn't know the
 key.

 The chunking should not allow any other type of data here.
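
The cases above can be sketched as follows. This is a toy model, not the
draft's construction: it uses HMAC-SHA256 from the Python standard library
as a stand-in for the AEAD tag, does no encryption at all, and marks the
last chunk with a flag in the authenticated data. The names seal and
open_stream are hypothetical.

```python
import hashlib
import hmac


def _tag(key: bytes, index: int, final: bool, data: bytes) -> bytes:
    # The chunk index and a "final" flag are bound into the tag, so a chunk
    # only verifies at the position it was created for.
    ad = index.to_bytes(8, "big") + (b"\x01" if final else b"\x00")
    return hmac.new(key, ad + data, hashlib.sha256).digest()


def seal(key: bytes, chunks: list) -> list:
    """Return a list of (data, tag) pairs; the last chunk is marked final."""
    last = len(chunks) - 1
    return [(c, _tag(key, i, i == last, c)) for i, c in enumerate(chunks)]


def open_stream(key: bytes, sealed: list):
    """Return (emitted_chunks, status); a chunk is emitted only after it verifies."""
    emitted = []
    for i, (data, tag) in enumerate(sealed):
        if hmac.compare_digest(tag, _tag(key, i, True, data)):
            emitted.append(data)
            if i != len(sealed) - 1:
                return emitted, "error: data after final chunk"
            return emitted, "ok"
        if not hmac.compare_digest(tag, _tag(key, i, False, data)):
            return emitted, f"error: chunk {i} failed authentication"
        emitted.append(data)
    return emitted, "error: truncated"


key = b"k" * 32
msg = seal(key, [b"one", b"two", b"three"])

dropped_last = msg[:2]                # Chunk 3 disappears
dropped_middle = [msg[0], msg[2]]     # Chunk 2 disappears
forged = [msg[0], (b"evil", b"\x00" * 32), msg[1], msg[2]]  # fake Chunk 2'
```

In this model every chunk that open_stream emits has passed its own check,
and truncation is only noticed after the earlier chunks have already been
emitted. Note that it assumes each chunk can be buffered before its tag is
checked.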

Am I missing another attack?

[snip]

Which implementation?  The sender or the receiver?


I'm assuming that the sender is using the best we have to offer, i.e.,
chunked AEAD.

I'm assuming that the receiver emits unauthenticated plaintext in some
situations.


In what situations would the receiver emit unauthenticated plaintext?

Also, who is the
attacker?  The Sender?  Or a third party?


I'm thinking of something like EFAIL.  So, a third-party attacker who
modifies the ciphertext.


EFAIL was more than that -- it was also leveraging the fact that USERS
of OpenPGP would merge the contents of authenticated and
non-authenticated data when presented in e.g. a MIME context, such that
the processor could not differentiate between the protected and
unprotected content.

The only attack is a DoS if the end is
truncated, and the earlier (valid) chunks have already been released.

In other words, AEAD is used for each chunked block as a single, coherent
piece.  Also I don't see how an attacker (third party) could leverage this
at all.  They couldn't change the chunk size en route.


I hacked Sequoia to emit unauthenticated data, and I was able to
change the chunk size and still get the plaintext of at least the
first chunk of a message.  So, my concern is: if a receiving
implementation releases unauthenticated plaintext for large chunks
rather than fail with ENOMEM, then a third-party attacker can change
the chunk size of an intercepted message, and cause the receiver to
process unauthenticated data, even though a small chunk size was
originally used.


Sure, but the first chunk of the message was authenticated when it was
emitted.  The SECOND chunk failed, and I bet it stopped processing!

This is what SHOULD happen.

Each chunk is either
authenticated, or an error occurs.  On error, it wouldn't be released.


The question for me is: what does an implementation do if it can't
buffer the message?  Does it really throw an error?  There already
exists at least one implementation written by an editor listed on
4880bis that processes unauthenticated plaintext.


I think we have different definitions of unauthenticated plaintext here.

In a chunked message, plaintext is authenticated PER CHUNK.  So once
Chunk 1 is processed, it is considered authenticated, and the
implementation is free to emit it, even if Chunk 2 fails.

If Chunk 2 fails its AEAD check (but Chunk 1 processed successfully), do
you consider the plaintext of Chunk 1 unauthenticated?  If so, why?

-derek

-- 
      Derek Atkins                 617-623-3745
      derek(_at_)ihtfp(_dot_)com             www.ihtfp.com
      Computer and Internet Security Consultant

_______________________________________________
openpgp mailing list
openpgp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/openpgp
