On Mar 29, 2019, at 2:07 AM, Neal H. Walfield <neal(_at_)walfield(_dot_)org>
wrote:
On Fri, 29 Mar 2019 03:19:39 +0100,
Jon Callas wrote:
Like some interim replies, particularly Bart Butler's, I thought we had a
rough consensus and that it was approximately:
* MUST support 16KB chunks.
* SHOULD support 256K chunks, as these are common (Protonmail).
* MAY support larger up to the present very large size.
* MAY reject or error out on chunks larger than 16KB, but repeating
ourselves, SHOULD support 256K.
I still think this is a bad idea.
Noted. It was a position that I threw out because I thought it encompassed a
lot of things that people said and also allows you, the implementer, to say,
“screw it, we’re doing 256K only and to heck with all of you” and it would meet
the standard. You could even implement 64K, thus generating a brinksmanship
debate with you and Protonmail and I don’t have to listen to it.
I think that my proposal above is flawed because there’s this squishy space
between 64K and 256K, and really hoped someone would either say, “Fine. I’ll
give up on 256K” or “Fine. 256K is fine.” And then we’d modify the proposal and
probably be done. It was, however, a position I thought we might reason from.
I could also get behind a hard limit of 2^30 on the grounds that that’s what
we had for partial body lengths, but I understand the comment that there are
things like multi-terabyte S3 buckets and out and out forbidding those to be
single-chunk is a bit churlish, but only a bit.
This is the bit that I don't understand. Clearly you see some
advantage to having large chunk sizes.
Well, actually, what I was doing there was offering a number lower than
2^whatever for the max max max oh-dear-god-why-do-you-want-to-do-this size:
conceding a (to me) very, very generous gigabyte to the other side as the
maximum, which you can then safely ignore.
But, let me try to use your framing:
I propose that we remove the chunk size parameter from the AEAD header
and fix it to a small value (e.g., 64 KB or 256 KB), because 1.) no
one has shown that a large chunk size is useful, 2.) large chunk sizes
encourage implementations to release unauthenticated plaintext, which
is a serious security concern [1], 3.) if an implementation releases
unauthenticated plaintext for large chunks, then an attacker can
always convince it to release unauthenticated plaintext for the first
chunk in a message [2], 4.) making the chunk size variable increases
complexity, which is a security concern.
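(Rationale point 3 can be made concrete with a toy model. This is not OpenPGP wire format; HMAC stands in for the per-chunk AEAD tag, and all names here are illustrative. The point is that a decryptor which hands out each chunk's bytes as soon as that chunk's own tag checks out will leak the first chunk of a message an attacker has truncated, because the whole-message check runs too late.)

```python
# Toy model of the truncation concern: per-chunk tags verify, plaintext is
# released chunk by chunk, and only a final tag detects truncation. HMAC
# stands in for real AEAD; this is not actual OpenPGP format.
import hashlib
import hmac

KEY = b"\x00" * 32  # toy key

def seal_chunks(chunks):
    """Tag each chunk with its index; a final tag binds the chunk count."""
    sealed = [(c, hmac.digest(KEY, bytes([i]) + c, hashlib.sha256))
              for i, c in enumerate(chunks)]
    final = hmac.digest(KEY, b"final" + bytes([len(chunks)]), hashlib.sha256)
    return sealed, final

def streaming_decrypt(sealed, final, sink):
    """Insecure: releases each chunk before the whole message authenticates."""
    for i, (c, tag) in enumerate(sealed):
        want = hmac.digest(KEY, bytes([i]) + c, hashlib.sha256)
        if not hmac.compare_digest(tag, want):
            raise ValueError("bad chunk tag")
        sink.append(c)  # plaintext escapes here...
    want = hmac.digest(KEY, b"final" + bytes([len(sealed)]), hashlib.sha256)
    if not hmac.compare_digest(final, want):
        raise ValueError("message truncated")  # ...before this check runs

sealed, final = seal_chunks([b"attack at dawn", b", just kidding"])
sink = []
try:
    streaming_decrypt(sealed[:1], final, sink)  # attacker drops chunk 2
except ValueError:
    pass
# sink now holds b"attack at dawn" even though verification failed overall.
```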
This isn’t actionable.
There’s a lot of rationale here, but not an actual proposal.
My proposal (left at the top of this missive for easy referral) was a concrete
proposal. The above paragraph has a lot of justification in it and is
interesting polemic but it is not an actual proposal.
If you had said:
* Chunk size is 256K. Always. If you have less than 256K, pad it with zeroes.
Then that would be a proposal.
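(For illustration, that strawman pins down something like the following. A real proposal would also have to encode the unpadded length somewhere; that part is deliberately omitted here, and the names are mine, not anyone's spec text.)

```python
# Sketch of the "256K always, zero-pad the tail" strawman above. A real
# format would also need to carry the unpadded length; omitted here.
CHUNK = 256 * 1024

def fixed_chunks(data: bytes):
    """Split data into fixed 256 KiB chunks, zero-padding the final one."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)] or [b""]
    chunks[-1] = chunks[-1].ljust(CHUNK, b"\x00")
    return chunks
```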
As I understand your position, you want to allow implementations a
broad degree of flexibility in choosing the chunk size. Can you
please explain why this is useful? I've reread your messages, but
beyond what appears to me to be a somewhat hand-wavy performance
argument, and an apparent misunderstanding (chunk size != message size) [3],
neither of which are convincing, I can't find any other arguments.
I'm sorry if you did and I didn't understand.
You don’t understand my position.
My position is that there are a bunch of people (like you, at least I think
you) who want to build a general-purpose implementation that you hope will be
used all over the place. You have a legitimate need for small chunks.
There are other people who observe that there’s also a legitimate need for huge
chunks because they want to do storage encryption on very large things and —
well, they have considerations that I’m not paying a lot of attention to.
My proposal above lets them do their thing and lets you ignore them. That’s my
position. Making as many people happy as possible.
I'm a bit confused. In my original mail, I included a PR:
https://mailarchive.ietf.org/arch/msg/openpgp/XH098WlJe8lkOypIaB1IXTde2sY
Was that not actionable enough? Should I link to the PR more often?
It would help, but even better would be to send what the proposal is. I really
did read it and it’s a patch file. Patch files are hard for humans to read.
It’s really nice to have a redline, but it’s also nice to know what the full
thing is without having to mentally emulate git.
I've spent some time thinking about use cases for different chunk
sizes, and I can't come up with any modulo some, IMHO, insignificant
performance tweaks. Can you please give some examples of use cases
that would profit from different chunk sizes?
I’m not the one proposing the large chunk sizes. And as I understand the people
who are doing it, performance isn’t their issue, data integrity is.
There’s a dilemma in here.
One is that there’s a belief that
What should / would you recommend an implementation do if it
encounters a chunk that it can't buffer? I see two choices: report
an error, or release unauthenticated plaintext.
Report an error. I’ve said that many times.
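(In code, "report an error" amounts to refusing up front, before touching any ciphertext, any chunk the implementation is unwilling to buffer. The limit and names below are illustrative, not from any implementation.)

```python
# Sketch of "report an error": a decoder with a hard buffering limit
# refuses oversized chunks rather than streaming unauthenticated plaintext.
# The limit value and names are illustrative.
MAX_BUFFER = 256 * 1024  # what this implementation is willing to buffer

class ChunkTooLarge(Exception):
    """Raised instead of releasing plaintext we cannot fully verify."""

def read_chunk(declared_size: int, read):
    """Refuse up front, before reading any ciphertext, if we cannot buffer it."""
    if declared_size > MAX_BUFFER:
        raise ChunkTooLarge(f"{declared_size}-byte chunk exceeds "
                            f"{MAX_BUFFER}-byte buffer limit")
    return read(declared_size)
```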
Regarding our implementation: it doesn't actually have tight resource
constraints. Our primary goal is to write a secure implementation.
Okay, so you’re arguing resources when you don’t really mean it. Gotcha. I
thought that was the case; the things you were saying didn’t jibe together.
From that goal, we derived the constraint that we must always work in
a bounded amount of space. The current version of AEAD doesn't allow
this without potentially rejecting messages that other implementations
can process by being insecure (releasing unauthenticated plaintext),
which we don't want to do.
That constraint is a long-standing meta-requirement in OpenPGP, so cool.
However, I detect that there is an impossible-to-solve problem here, and I’ll
outline it in another note later.
But, my concerns are not only about my implementation. I'm concerned
about the ecosystem. And, the current proposal encourages insecure
implementations as demonstrated by GnuPG and RNP processing
unauthenticated plaintext. I think the standard should not
proactively make it harder to write a secure implementation. And that
is what I see the AEAD chunk size doing.
I think I understand where you’re coming from, but this is a standards
organization. Standards are concerned with interoperability between
implementors. I learned that from Jeff Schiller ages ago. Think of a standard
as being like a dictionary and grammar. When you send me a message, I use that
grammar/dictionary to determine what it means. Similarly, if I want to encode
a message that I want you to understand, then I encode it according to that
grammar as well.
My proposal above says to an encoder that the safest thing to do is to use a
chunk size of 64K. Everyone can decode that, and you’re fine. Many, many, many
of them are going to be okay with 256K, and odds are that anyone of any import
will do fine with it. (That’s what the SHOULD means). It also says there are
people out in Lala-land who might be playing with chunks that are out to
2^whatever, and that’s kinda cool, but ignore them.
That’s why I made it. It gives a sandbox for people who want to go out on the
edge that the mainstream can ignore.
Thanks,
Neal
P.S. I hope it is clear that I'm trying to engage in a constructive
manner. I sincerely haven't understood your position, and I really
want to.
I think that’s because I’m trying to support a community of people who have
reasonable, differing needs, and you want to mandate your implementation ideas
on the universe. I apologize if that’s harsh, but it’s what I perceive. If I’m
mistaken, let me know.
Jon
_______________________________________________
openpgp mailing list
openpgp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/openpgp