Re: [openpgp] Disabling compression in OpenPGP


On Mar 18, 2014, at 9:00 AM, Alfredo Pironti 
<alfredo(_dot_)pironti(_at_)inria(_dot_)fr> wrote:

Dear list,

It is well known that compressing data before encrypting them leaks much 
about the plaintext [1]. Recently, this has been exploited against the TLS 
protocol in the so-called CRIME attack [2].

Looking at RFC 4880, section 2.3, I read
“OpenPGP implementations SHOULD compress the message after applying the 
signature but before encryption.”
And indeed, gpg faithfully follows the spec by enabling compression by 
default.

I have done some preliminary work on password managers that rely on OpenPGP 
(gpg, in fact) to encrypt the passwords. Unsurprisingly, it turns out that 
compressing the password before encrypting it leaks much of the password 
entropy, making dictionary attacks significantly easier to mount. (In my 
preliminary experiments I used a password dictionary containing about 4 
million passwords. If the attacker knows the original password length and its 
compressed length, then for some combinations of the two the candidate 
dictionary entries can reduce to as few as some hundreds.)

I believe similar attacks can be mounted in different contexts where OpenPGP 
is used. Hence, I propose to start discussion to amend RFC 4880 to at least 
discourage (if not forbid) the use of compression.

I welcome comments and suggestions.
Alfredo Pironti


I think you're confusing a number of things, as well as cherry picking your 
data.

On the other side, there's the Katz-Schneier attack (which is more of a CFB 
attack than OpenPGP per se) that is essentially thwarted by compression. There 
are other attacks where compressing helps because it obscures plaintext. I'm 
eliding much, so go look at it.

OpenPGP is in general not an interactive or on-line protocol, and there are all 
sorts of attacks that are easy against interactive or on-line attacks and not 
against off-line. CRIME is one of them -- it exploits the interactive nature of 
TLS and makes repeated connections to learn something about the plaintext. It 
isn't the *compression* that does it, it's deltas between similar requests. You 
don't get to do that with OpenPGP. Note also that there are two easy 
mitigations against CRIME -- one is to disable compression and other other is 
to restrict the interactivity of the attack.

Similarly, most side-channel attacks aren't possible when all you have is 
ciphertext and you're attacking it yourself. TLS is an on-line protocol and 
there are many, many attacks against it that simply don't apply to blob 
encryption, which is what OpenPGP is.

There are a number of compression oracles that exist because of carelessness. 
Kelsey's paper that you quote is a fantastic paper because it flew in the face 
of the naive belief that compression was always good (on the grounds that it 
makes the plaintext less predictable). Compression is *often* good for that 
very reason. Compression is sometimes bad. Asserting that because compression 
is sometimes bad, it must always be bad is stretching the point, in my opinion.

If you believe that there are attacks that can be mounted OpenPGP because of, 
find one. You'll have a great paper.

Even better, come up with a generalized attack. Here's a framework that you can 
use:

OpenPGP is mostly bulk CFB encryption of a data blob. There are headers and 
stuff (which could make it easier or harder depending on data leaks), but at 
the core of the protocol, it's just CFB. OpenPGP compression is mostly ZIP or 
GZIP compression. Again, there are details with headers. In this case, the 
header issue might be significant, as OpenPGP often deletes some normal ZIP 
headers which could be used as known plaintext.

Nonetheless, here's where you should put your efforts:

* Take a file or other data blob. Call it P (for plaintext). 

* Compress P with ZIP or GZIP or even BZIP. Call that Z (for zip).

* Encrypt that with a simple CFB encryptor using some key K and some suitable 
cipher; I'd pick AES. Call that C (for ciphertext).

* Use comparisons between P, Z, and C to mount an attack on K -- giving a 
key-recovery attack. Or bypass the key and cipher and use knowledge about 
fragments of P and Z to undo pieces of C -- which would be some form of known 
plaintext attack. etc.

This gives a simplified model of what's going on inside OpenPGP, and might be 
easier to work with. Note, however, that OpenPGP typically throws away Z -- 
it's computed as an intermediate value and discarded. So it's conceivable that 
there's an attack that's possible but hard to mount.

Alternatively, you could use compression options in the gpg command line to get 
ciphertext of compressed and uncompressed outputs and do the same thing.

You describe an interesting case where there's a structured file of passwords. 
You could look at cases where the attacker would have knowledge of the 
plaintext or parts of it in varying ways. For example, it could be a sorted 
file, or it could contain all N-character passwords in a group (that might be 
itself sorted) and so on. 

You could also devise interactive and non-interactive versions of your attack. 
Note that the non-interactive ones are more interesting, of course. 

I hope this helps.

        Jon


_______________________________________________
openpgp mailing list
openpgp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/openpgp