Hi Jon,
------- Original Message -------
On Monday, April 15, 2019 5:00 PM, Jon Callas <joncallas@icloud.com> wrote:
On Mar 30, 2019, at 9:11 PM, Bart Butler
<bartbutler=40protonmail.com@dmarc.ietf.org> wrote:
[...]
OpenPGP is in general the latter case rather than the former. I believe
it’s less important to have strict semantics on failures because it’s
usually storage.
I agree. I would say my point is that with sufficiently small chunks, the
user/decrypter can choose what kind of failure behavior is appropriate.
Large chunks rob the decrypter of that choice.
We are mostly in violent agreement, I do believe. I feel like I'm saying
something like "a quarter is a coin with George Washington on one side and an
eagle on the other" and you're saying "a quarter is a coin with an eagle on
one side and George Washington on the other." We're talking about the same
coin, with a slightly different point of view.
I wouldn't use a term like "rob" because that assigns value to the condition.
I think there are places where rejection matters and is a Good Thing. I
think there are places where it is not a good thing and is even a Bad Thing.
That's why I was using terms like "strict semantics" and a lot of
conditionals.
I said 'rob' because I think fundamentally that the release semantics should be
something that is decided by the decrypter, not the encrypter, as only the
decrypter knows what kind of release semantics are safe or not. For example, I
have a 32 MB PGP/MIME message. I want to show a preview in my email client. If
we use 8K chunks, I can read the first chunk, know that it hasn't been messed
with, and display it safely. If the spec allows a 32 MB chunk, as an
application developer I have some choices:
1. I can load the entire 32 MB and be really slow/bandwidth intensive
2. I can not show a preview for this message
3. I can ignore release semantics and do it anyway, risking the Problem That
Shall Not Be Named
All of the options are terrible from a UX perspective. Meanwhile, if the chunk
size is capped, this makes it easy, and if I, as an application developer, need
strict release semantics for the entire file/message, I can do that too.
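For concreteness, the preview scenario above can be sketched like this. This is a toy illustration only: HMAC-SHA256 stands in for a real AEAD tag, and the chunk framing, offset binding, and key handling are all invented for the example — this is not the OpenPGP AEAD wire format.

```python
import hmac
import hashlib

TAG_LEN = 32  # HMAC-SHA256 output length; stands in for a real AEAD tag


def seal_chunks(key: bytes, plaintext: bytes, chunk_size: int = 8192):
    """Toy sealer: split plaintext into chunks, each carrying a MAC
    bound to its byte offset so chunks can't be reordered."""
    out = []
    for i in range(0, len(plaintext), chunk_size):
        chunk = plaintext[i:i + chunk_size]
        tag = hmac.new(key, i.to_bytes(8, "big") + chunk,
                       hashlib.sha256).digest()
        out.append(chunk + tag)
    return out


def stream_open(key: bytes, sealed_chunks):
    """Yield each chunk's plaintext as soon as its tag verifies.

    This is the small-chunk streaming behavior: a mail client can
    render a preview after the first verified chunk instead of
    buffering the whole 32 MB message."""
    offset = 0
    for sealed in sealed_chunks:
        chunk, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
        expect = hmac.new(key, offset.to_bytes(8, "big") + chunk,
                          hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expect):
            raise ValueError("chunk failed authentication; stop releasing")
        offset += len(chunk)
        yield chunk
```

A client that only wants a preview can consume just the first yielded chunk and stop, knowing that chunk was authenticated.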
Now, with your proposal, the other implementations and I can come to some
agreement that hey, we just aren't going to allow chunks meaningfully higher
than the cap, what you call "normative" agreement. That's fine, but I'm worried
that these norms don't tend to be well-documented (I'm not sure the MAY in the
RFC will be sufficient), and someone somewhere is going to write an
implementation at some point which exclusively uses big chunks. When they do,
our implementations will reject them, and then their users will complain to the
app developers, who will in turn complain to the implementers.
I'm certainly not so arrogant as to assume I can anticipate all future needs here.
But I think it's telling that we can come up with several negative consequences
of allowing the large chunks and the only benefit is something that can be
achieved at the application layer (or as an option at the implementation layer
even) if desired anyway.
I also think that forcing no-release semantics via packet structure is
misguided, because app developers/implementers are likely to ignore it if it
becomes too annoying. That is, I anticipate some implementers just allowing
some kind of unsafe mode that releases plaintext early with no integrity checks
if this comes up (essentially streaming not along AEAD chunk boundaries), and
I'm in general uncomfortable with choosing to build a feature whose failure
case is "massive security hole", not to mention one that we've seen before with
the Problem That Shall Not Be Named. Do we want to allow people to create
messages which *cannot* be safely streamed when we have the choice not to do
this with zero functional downside? Strict release semantics can always be
enforced at the implementation or application level.
I don't want to bury my lede any deeper than this. What I'm saying is:
- The more you want strict AEAD semantics of no-release, the fewer chunks
you want.
- It seems to me that the people who most believe in strict AEAD release
are also the ones who are arguing for smaller packets. These seem to be in
opposition to each other. I've been confused through this discussion because
the rationales seem in opposition and confused. I don't get it, and I want to
understand; you all are smart people whom I respect, so if I'm confused,
maybe I'm not getting something.
This feeling is completely mutual. I respect everyone in this discussion and
know that all of you are smart people. I will try to rephrase what I think is
the fundamental question here, and it's not what release semantics should
be--those can be enforced in lots of places, and as you said, vary by use case,
which is very compatible with my views. I think the fundamental question here
is this:
*Should we allow creation of valid messages which cannot be streamed and
attempt to force strict no-release at the protocol layer?*
I think, in the absence of a compelling reason to, the answer to this is a
pretty clear no.
We might differ in that I have a nuanced opinion about AEAD rejection. I
think that there are places where it matters, and places where you don't. For
example, in networking, particularly the parts of the network stack where you
can easily get a forged packet. You want to reject that packet as early as
possible. Moreover, these places are always using very small packets. (I'm
going to wave my hand and say that under a megabyte is "very small" for these
purposes.)
But in archival storage, you don't want to reject something because
there's a media error, you want to recover as much as possible. You might
even be required to do so by law. I have real-world anecdotes if you want to
hear them.
On a network, rejection is a good thing. You reply a NAK to the sender
and they retransmit. In archival storage, there's no retransmitting on a
media error. That's the case where it's a Bad Thing, and in fact, it might
even be better to use CFB mode and an MDC than AEAD. It also might not, and
much depends on which AEAD mode one used.
Nonetheless, if you believe in strict semantics, you also likely want the
fewest number of chunks. If there is more than one chunk, you have to stage
the output, you have to process everything (unless you're going to say that
the timing side-channel is not important).
Why do you have to stage the output in the multi-chunk case? The only
difference in the multi-chunk case is that I'd check AEAD tags multiple times
instead of just at the end. There's no reason why I'd have to do anything with
the output differently than a single chunk if I embrace strict no-release. I
could buffer it the exact same way I was buffering the single chunk and the
application/consumer doesn't have to know there is any difference.
Fundamentally, multi-chunk just gives you options. There is nothing stopping an
implementation from doing strict no-release.
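As a sketch of that point: the decrypter can impose all-or-nothing release on top of any chunking scheme simply by buffering until every tag has verified. Same caveats as any toy here — HMAC-SHA256 stands in for a real AEAD tag, and the sealing format is invented for the example, not taken from the spec.

```python
import hmac
import hashlib

TAG_LEN = 32  # HMAC-SHA256 output length; stands in for a real AEAD tag


def seal(key: bytes, plaintext: bytes, chunk_size: int = 8192):
    """Toy sealer: each chunk carries a MAC bound to its byte offset."""
    return [plaintext[i:i + chunk_size]
            + hmac.new(key, i.to_bytes(8, "big") + plaintext[i:i + chunk_size],
                       hashlib.sha256).digest()
            for i in range(0, len(plaintext), chunk_size)]


def open_all_or_nothing(key: bytes, sealed_chunks):
    """Strict no-release, enforced by the decrypter: plaintext is
    buffered internally and returned only after every chunk has
    authenticated. A single bad tag means the caller sees nothing,
    regardless of how many chunks the encrypter chose."""
    buffered, offset = [], 0
    for sealed in sealed_chunks:
        chunk, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
        expect = hmac.new(key, offset.to_bytes(8, "big") + chunk,
                          hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expect):
            raise ValueError("authentication failed; no plaintext released")
        buffered.append(chunk)
        offset += len(chunk)
    return b"".join(buffered)
```

The application sees the same all-or-nothing behavior whether the message was one chunk or a thousand; multi-chunk only adds the *option* of earlier release.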
Sometimes this is not possible. Ironically, the place where it's most
possible is in storage, where it's the least needed. In online protocols,
[...]
OK, I think this is the part that I don't understand. Why does it matter
what chunking scheme is used here? If my app requires all-or-nothing
semantics, I would program my app to enforce that all chunks must pass and
not release plaintext unless that happened, with no truncation, etc. So why
would every joint be a vulnerability?
What value does large-chunk AEAD actually provide? What I'm getting
from the AEAD Conundrum message is that it's a way for the message
encrypter to leverage the "don't release unauthenticated chunks"
prohibition to force the decrypter to decrypt the whole message before
releasing anything. Why do we want to give the message creator this
kind of power? Why should the message creator be given the choice to
force her recipient to either decrypt the entire message before release
or be less safe than she would have been with smaller chunks?
Let me summarize the conundrum: If you want strict AEAD no-release
semantics, you want fewer chunks.
I guess this is my fundamental question. You can force no-release semantics
at the application level for any chunk size scheme, right?
Yes, you can, provided that there's a way to report that back, and your
caller checks the return value.
You (as an implementation) could just not return the plaintext until the entire
message was read. There's nothing stopping implementations from having a strict
no-release mode.
I suppose this really means no, you can't force it, because the library
writer can't force the application code to check the error return.
Well, the library can always just not return the plaintext if we don't think
it's safe. I just don't think it's the encrypter's business to be deciding what
is safe or not for the decrypter.
I have heard that among the issues we're Not Going To Talk About, improper
checking of GnuPG's report of an MDC failure was a problem in at least one
place.
Sure, but this could have been configured as a hard failure. The apps didn't
configure it as a hard failure because that would have collided with
UX/application concerns, and I fear that that collision will occur again if we
allow it to, with likely the same result.
If you respond to a security request with a performance answer, you
literally don’t know what you’re talking about. So let’s toss that aside.
I apologize, I was not trying to create a strawman here, but I am
completely at a loss for what the benefit of large chunks is.
From a standpoint of debate technique, coming up with a strawman makes your
whole side of it weaker because attacking a strawman is attacking a strawman.
It makes it look like you don't understand, when you actually have a
different issue. I think it has added to the confusion I have been suffering
from. The chunk size question is about adjusting security parameters, and
thus when you say, "it won't help performance" I can't help but think that
we're not discussing the same thing at all, as I'm talking security, and
you're talking performance.
Good to put that to bed. Back to the chunk size debate.
I don't know the specific benefits, either. I heard people asking for it, and
I'm defending the idea for them.
I believe that an underlying difference between your thinking and mine is
that you're looking at this as an application writer, and I'm looking at it
like a protocol / API that has many clients, some of whom (and the largest
ones) aren't written yet.
Moreover, there are a lot of people who use OpenPGP for a lot of things that
we don't know about. As Peter Gutmann pointed out there are a lot of EDI
systems, back ends of financial systems, and so on that internally use
OpenPGP implementations. They're not here. I'm trying to watch out for them.
There are also people around who want to do something and for a lot of
reasons find it difficult to speak up. I'm not editor any more, Werner is and
I have every faith in him. Sometimes, though, old habits die hard.
I'm sympathetic to all of this, and I don't want to put anyone on the spot. It
would be really great if anyone who has a use case for large chunks speaks up
though, either through this thread or privately to me, Jon, or anyone else they
feel comfortable speaking with, because I do not want anyone's voice to not be
heard, and if there is a use case for large chunks I do want to hear about it
before this decision is finalized.
I tend to see the AEAD packet format as a successor to the existing
streaming, indefinite-length things. Those allow chunking up to 2^30, and
while that's absurdly large, it has never been an issue.
Well, except that streaming this old stuff is unsafe if ciphertext modification
is a threat.
In my head, I think why not allow up to that, since it would preserve
anyone's weird thing?
On the other side, implementers need guidance. Today, the guidance is
folklore with all the issues that go with it. It's better not to have
folklore. But, if we basically said, "do what you're doing today" then we'd
be looking at 8K chunks, as that's what GnuPG does today.
The clauses I suggested about MAY support larger / MAY give larger the finger
seemed to be a compromise that would work because it gives you the guidance
you need; it lets whoever these people are the ability to do what they want;
and lastly should there be a consensus that it needs to be larger in the real
world, a consensus of implementers can change it without a new document. It
seemed to me that everyone wins.
For the record, I'm pretty much OK with this, I just think it's opening us up
to future problems that it would be best to avoid.
Yet I thought I perceived that you not only wanted to win, but you wanted to
salt the earth in the other people's territory. Fixing an upper bound on
memory has a long history of Famous Last Words going back to the old clichéd
"640K is more than enough for anyone." The gods punish hubris.
I'm sorry I gave that impression or was overly strident. I consider this a rare
opportunity to fix something before it becomes a problem rather than afterward
with a bunch of legacy baggage in tow. I have no interest in "winning" this
argument for its own sake--I would be happy to get a counter-argument for
large chunks that made me think "yes, there is a use case and that's why we
want to risk having these future problems".
Okay -- let's sort all this out. I really think we are ALMOST done here.
Here's what I stated before.
(1) MUST support up to <small-chunk-size> chunks.
(2) SHOULD support up to <larger-chunk-size> chunks, as these are common.
(3) MAY support larger up to the present very large size.
(4) MAY reject or error out on chunks larger than <small-chunk-size>, but
repeating ourselves, SHOULD support <larger-chunk-size>.
Clauses (3) and (4) set up a sandbox for the people who want very large
chunks. They can do whatever they want, and the rest of us can ignore
them. Why get rid of that? It doesn’t add any complexity to the code. It
lets the people who want the huge ones do them in their own environment
and not bother other people.
My concern is over (1) and (2) and specifically that there’s both <small>
and <large> sizes.
I think that’s an issue. If there are two numbers, we are apt to end up
with skew before settling on one, so it’s better to agree on just one.
That’s the real wart in my proposal.
I'm OK with eliminating (2) and just using the MAY part to take care of any
legacy 256K messages OpenPGP.js users might have. As I said, we don't have
any of these messages in production yet and I'd err on the side of a
cleaner spec.
Me too. I think saying 256K is fine. I have an intuition it ought to be at
least as large as the largest Jumbo Frame, and that's 9K so round to 16K. Let
me restate the proposal.
(1) MUST support up to <chunk-size> chunks.
(3) MAY support larger up to the present very large size.
(4) MAY reject or error out on chunks larger than <chunk-size>
And it seems that 256K is the proposal for <chunk-size>. Are we agreed on all
that?
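Restated as decoder policy, the clauses above could be sketched like this. The constant values and the opt-in flag are purely illustrative — the thread hadn't settled on a number yet, and this is not spec text:

```python
# Sketch of the proposed chunk-size policy; names and limits are
# illustrative stand-ins, not values from any draft.
MUST_SUPPORT = 1 << 18   # e.g. 256 KiB: clause (1), every decoder accepts this
ABSOLUTE_MAX = 1 << 30   # the "present very large size" ceiling in clause (3)


def accept_chunk_size(size: int, allow_large: bool = False) -> bool:
    """(1) MUST accept chunks up to MUST_SUPPORT.
       (3) MAY accept larger, up to ABSOLUTE_MAX, if the decoder opts in.
       (4) MAY reject anything above MUST_SUPPORT otherwise."""
    if size <= MUST_SUPPORT:
        return True
    if allow_large and size <= ABSOLUTE_MAX:
        return True
    return False
```

The sandbox effect falls out directly: implementations that never set the opt-in flag reject the huge chunks, while the people who want them can interoperate among themselves.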
As some respondents would like 8K or 16K, I'm fine with doing that instead of
256K. I would like to check with the maintainers of our libraries to find out
if there's any reason I'm ignoring that would favor one or the other before
committing though.
I just really want to understand the benefit of large chunks for security
and right now I clearly do not.
If you believe that no-release is a Good Thing, then you want fewer chunks,
ideally only 1 chunk. That's it. That's the ONLY reason.
I think I discussed this to death above so I won't add to the word count here.
-Bart
I believe that no-release can be a Good Thing, but rarely is for OpenPGP's
primary use case. As I said in my other missive, I don't think that it's even
possible in the general case. Networking packets, yes -- both possible and
desirable. Files, no -- neither possible nor desirable.
Jon
_______________________________________________
openpgp mailing list
openpgp@ietf.org
https://www.ietf.org/mailman/listinfo/openpgp