Re: [openpgp] [Cfrg] streamable AEAD construct for stored data?

On 7/11/2015 01:46 am, Bryan Ford wrote:

To be clear, there are two separate use-cases, each of which makes sense
without the other and requires different technical solutions (but could
also make sense together):

1. Streaming-mode integrity protection:  We want to make sure OpenPGP
can be used Unix filter-style on both encryption and decryption sides,
to process arbitrarily large files (e.g., huge backup tarballs), while
satisfying the following joint requirements:

(a) Ensure that neither the encryptor nor decryptor ever has to buffer
the entire stream in memory or any other intermediate storage.


Yes.

(b) Ensure that the decryptor integrity-checks everything it decrypts
BEFORE passing it onto the next pipeline stage (e.g., un-tar).

OK. So this is where a program-level option comes in. In streaming mode, the streamer can keep decrypting and passing it across to the reader, and then break when an integrity check fails.

In streaming mode, this is how we would expect it to operate. A user program can however offer some options in this case. E.g., do an integrity check pass beforehand as a separate option; or turn the integrity checks into warnings and keep decrypting the data, knowing that there is garble in there, and keep streaming. Both are useful options a program could offer.

So I'd say NO - streaming is streaming, and there isn't a requirement in the spec to be sure about the entire file beforehand. That's just a quirk of the streaming mode that users will have to accept.
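
To make the two program-level behaviours concrete, here is a rough sketch of a chunked stream where each chunk is checked before it is handed on. It is not a proposal for the format: it uses HMAC-SHA256 as a stand-in for an AEAD tag and leaves out encryption entirely, and the names (Record, protect, release_verified, the strict flag) are made up for illustration.

```python
import hashlib
import hmac
import struct
import warnings
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    data: bytes
    final: bool
    tag: bytes


def chunk_tag(key: bytes, index: int, final: bool, data: bytes) -> bytes:
    # Bind the chunk's position and a last-chunk flag into the tag so that
    # reordering and truncation become detectable.
    header = struct.pack(">QB", index, int(final))
    return hmac.new(key, header + data, hashlib.sha256).digest()


def protect(key: bytes, chunks: Iterable[bytes]) -> Iterator[Record]:
    # Hold back one chunk so the last one can be flagged as final.
    index, held = 0, None
    for data in chunks:
        if held is not None:
            yield Record(held, False, chunk_tag(key, index, False, held))
            index += 1
        held = data
    last = held if held is not None else b""
    yield Record(last, True, chunk_tag(key, index, True, last))


def release_verified(key: bytes, records: Iterable[Record],
                     strict: bool = True) -> Iterator[bytes]:
    # Pass each chunk downstream only after its own tag has been checked.
    # strict=True stops at the first failure; strict=False downgrades the
    # failure to a warning and keeps streaming.
    saw_final = False
    for index, rec in enumerate(records):
        expected = chunk_tag(key, index, rec.final, rec.data)
        ok = hmac.compare_digest(expected, rec.tag) and not saw_final
        if not ok:
            if strict:
                raise ValueError(f"integrity check failed at chunk {index}")
            warnings.warn(f"integrity check failed at chunk {index}")
        saw_final = saw_final or rec.final
        yield rec.data
    if not saw_final:
        if strict:
            raise ValueError("stream truncated: no final chunk seen")
        warnings.warn("stream truncated: no final chunk seen")
```

Note that even in strict mode the reader has already consumed every chunk before the bad one, and a truncated stream is only noticed at the end. That is the quirk of streaming described above; the "check pass beforehand" option amounts to draining release_verified once and discarding the output before running it a second time for real.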

2. Random-access: Once a potentially-huge OpenPGP-encrypted file has
been written to some random-access-capable medium, allow a reader to
decrypt and integrity-check parts of that encrypted file without
(re-)processing the whole thing: i.e., support integrity-protected
random-access reads.

Let’s call these goals #1 and #2, respectively.
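
For goal #2, a minimal sketch of an integrity-checked random-access read might look like the following. The assumptions are all mine: fixed-size chunks, a per-chunk HMAC-SHA256 tag bound to the chunk index (standing in for an AEAD tag, no encryption shown), and hypothetical names write_store and read_chunk.

```python
import hashlib
import hmac
import io
import struct
from typing import BinaryIO

CHUNK = 4096              # hypothetical fixed chunk size in bytes
TAG_LEN = 32              # HMAC-SHA256 output size
RECORD = CHUNK + TAG_LEN  # on-disk size of every record except the last


def _tag(key: bytes, index: int, data: bytes) -> bytes:
    # Binding the index stops a valid chunk being replayed at another offset.
    return hmac.new(key, struct.pack(">Q", index) + data,
                    hashlib.sha256).digest()


def write_store(key: bytes, payload: bytes, out: BinaryIO) -> int:
    """Write payload as data+tag records and return the chunk count."""
    count = 0
    for offset in range(0, len(payload), CHUNK):
        data = payload[offset:offset + CHUNK]
        out.write(data + _tag(key, count, data))
        count += 1
    return count


def read_chunk(key: bytes, store: BinaryIO, index: int) -> bytes:
    """Seek straight to one chunk and verify it without touching the rest."""
    store.seek(index * RECORD)
    record = store.read(RECORD)
    if len(record) <= TAG_LEN:
        raise IndexError(f"no chunk {index} in store")
    data, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    if not hmac.compare_digest(_tag(key, index, data), tag):
        raise ValueError(f"integrity check failed at chunk {index}")
    return data
```

A per-chunk tag like this only vouches for the chunk that was read. It says nothing about whether chunks were dropped from the end of the file, which is where something like an authenticated chunk count or a Merkle tree would have to come in.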

...

We could very well design an OpenPGP format that addresses both goals
together, if we decide both goals are valuable. ...

There are some obvious tradeoffs here, both in storage and complexity
costs.  I’m not that worried about the storage efficiency costs,...
  And the implementation-complexity is certainly an issue regardless.

Nod. Let's see how the requirements go first, and whether there is a reasonable design possible second.

So some questions about this:

1. How important is the ability to achieve goal #1 above in the OpenPGP
format (streaming-mode integrity-checking)?

It's certainly important. If we want to bring everyone across to a new format, and start ditching the old (from the standard), then we have to provide an equivalent to common use cases.

I'm inclined to say that stream-mode must be integrity checked. We want to achieve the same standard across the board; we don't want to say "if X, then Y, but if Z, then not Y and maybe W..." and complicate the user's understanding.

2. How important is the ability to achieve goal #2 above in the OpenPGP
format (random-access integrity-checking)?

Random access is a new feature. It's certainly an *attractive* feature for the inner geek, just because. But I am not seeing a clear use case as yet, at the user level. If I think about the command line, I can't see a way a user would say "decrypt from blocks 1234 to 8960" without getting into some arcane geeky construction like doing dd(1) or some such... which no sane end-user does.

What I am seeing is that this would be an API call to other systems which do know what they want. This would be quite useful for a backup for example, or an rsync-like tool. Being able to re-start the backup is incredibly useful; being able to set off a backup to do a sort of "rsync" phased copy from "state N" without phase errors would be fantastic.
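
As a rough illustration of what "restart from state N" could mean at the API level, the sketch below scans an interrupted store, finds how many leading chunks still verify, trims whatever follows, and carries on from there. Everything here is hypothetical (the record layout, the HMAC-SHA256 stand-in for an AEAD tag, and the names resume_point and append_chunks); it only shows the shape of the call a backup tool might want.

```python
import hashlib
import hmac
import io
import struct
from typing import BinaryIO

CHUNK = 1024              # hypothetical fixed chunk size in bytes
TAG_LEN = 32              # HMAC-SHA256 output size
RECORD = CHUNK + TAG_LEN  # size of a full data+tag record


def _tag(key: bytes, index: int, data: bytes) -> bytes:
    return hmac.new(key, struct.pack(">Q", index) + data,
                    hashlib.sha256).digest()


def append_chunks(key: bytes, source: bytes, store: BinaryIO,
                  start: int = 0) -> int:
    """Write chunks of source from index start onward; return the total."""
    total = (len(source) + CHUNK - 1) // CHUNK
    store.seek(start * RECORD)
    for index in range(start, total):
        data = source[index * CHUNK:(index + 1) * CHUNK]
        store.write(data + _tag(key, index, data))
    return total


def resume_point(key: bytes, store: BinaryIO) -> int:
    """Return "state N": the number of leading full records that verify.

    Anything after that point (a partial or corrupt record) is trimmed so
    the writer can continue from chunk N.
    """
    store.seek(0)
    index = 0
    while True:
        record = store.read(RECORD)
        if len(record) < RECORD:
            break  # partial record: rewrite it rather than trust it
        data, tag = record[:CHUNK], record[CHUNK:]
        if not hmac.compare_digest(_tag(key, index, data), tag):
            break
        index += 1
    store.seek(index * RECORD)
    store.truncate()
    return index
```

A caller would do something like `n = resume_point(key, store)` followed by `append_chunks(key, source, store, start=n)`. A short final chunk is simply rewritten on resume, which is wasteful but harmless in this sketch.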

We would then be entering into the library space rather than the end-user interface space. This might actually be a good thing: it might tear our childlike grip from the command line and drag us into the new millennium in time for the next decade. It might finally kill off our obsession with email :)

Or it could be mission creep, scope enlargement, or the sinking of the project if we become all things to all other projects building GUIs on top?

3. For whichever goal(s) we wish to be able to achieve, should those be
*mandatory* or *optional* in the format?

I'd really like to see one format. The boolean logic that goes with different formats just ripples through the users' minds and creates confusions. Every confusion creates loss of users. Every user we lose to confusion is a breach of security because they go on to do it in cleartext or with some other inadequate tool. If we have 10 such confusions scattered across the code, we'll probably halve the number of users.

That's without even talking about bugs, and security snafus and the potential for choosing the wrong mode and breaking the lot... E.g., it took me 2 years to find out that the reason why SVN would break every month was that the client side was mounted on a Mac OSX drive that had an *option* to select case insensitivity... dozens of man-days lost in rectification/recovery/rebuilding client repos because of an obscure option.

There is a reason the MiB run around and insert multiple-mode madness into people's minds in groups. It makes security brittle. It makes it easy for them to futz.

That is, should *every*
OpenPGPv5-encrypted file satisfy either or both of these goals, or
should they be configurable or user-selectable (such that some encrypted
files might contain per-chunk signatures and/or Merkle trees while
others do not)?  Making either of these goals “supported but optional”
might help mitigate any performance/storage cost concerns with either of
them, but would only further increase the complexity of the overall
OpenPGP spec and increase the “usability risk” of a user accidentally
failing to enable a relevant option when he really should have (e.g.,
streaming-mode protection for backups).

Yup. And then he goes off and uses another tool. Coz the sales force have realised that taking options away makes the sale easier, and the user can't see the schlock under the hood anyway.

4. What are reasonable upper- and lower-bounds for chunk sizes, and what
are the considerations behind them?



Defer to later.



iang
