Re: [openpgp] respecting key flags for decryption

On Nov 8, 2018, at 3:24 AM, Vincent Breitmoser <look@my.amazin.horse> wrote:


Hey Jon,

thanks for your thoughtful reply!


And thank you for appreciating it.

Moreover, gnupg is not merely a tool, it’s a reference implementation and as
a reference implementation it needs to have meta-features for the purpose of
debugging.


I agree with the premise here, but not the conclusion. In the concrete bug
report I have, a bank(!) implemented OpenPGP in a way that included two major
blunders: 1) they encrypt to a subkey that doesn't have the flag. 2) they
include a one-pass signature packet, but then no actual signature (which isn't
a valid OpenPGP message, by spec).

The reason this happened, I strongly suspect, is exactly because they treated
GnuPG as a reference implementation: they tested that it worked against GnuPG
(or some frontend), found it worked in practice (without even a warning), and
then left it at that.


I think you’re mischaracterizing my conclusion.

My point and conclusion is that you shouldn’t let people who do stupid stuff 
contaminate the ecosystem.

We are starting from people who have screwed up. They screwed up because they 
didn’t understand or they didn’t care. You’re not going to make them understand 
by typing harder and making your keys click louder. Most importantly of all, 
you’re not going to make a stupid person smart by modifying the standard. The 
whole premise that we are starting from is a botched implementation.

My conclusion is that we shouldn’t make working implementations brittle in the 
belief that this will make the botched implementations suddenly correct.

Now I did go off into a rant and a jeremiad, and that may not have been clear, 
because good rants and jeremiads tend to wander. So let me try to be clearer 
while still being entertainingly ranty.

The purpose of a standard is for interoperability. The standard describes a 
language that my software and your software can use so that together we are 
better than either one separately. However, the standard is a map, it is not 
the territory. The territory is the software in part and the people most of all.

If the software is inferior or brain-damaged by following the standard, then 
this is very bad. Sometimes this happens accidentally. It should never, ever 
happen that the standard *intentionally* forgets its place in the world and 
mandates bad experiences for the users. A good software developer makes good 
experiences for the users. Always. If that means breaking the standard, you 
break the standard. Especially when the precipitating action is that someone 
*else* broke the standard.

I never invoked Postel’s principle. If you read what I wrote again, I very 
carefully, *intentionally* did not invoke it. Trust me, I know what it is, and 
I have also read Thomson’s I-D. He has a point, but I think that his point is 
basically just the flip side of the problem. I believe that you’re bringing it 
up as a straw man to beat. Nonetheless, let’s examine it for a moment.

All virtues when taken too far end up being vices. Nearly every nourishing food 
is bad for you if you eat too much. Nearly every medicine is a poison if you 
use it improperly. Even moderation (which is what I am preaching here) can be 
taken too far, and be mealy-mouthed wishy-washiness. Sometimes one has to take 
a stand. The questions of course include which stand to take, how firmly one 
takes it, and so on.

There are a lot of people who use (or used) Postel’s principle as an excuse and 
justification for being a lazy programmer, for being a bad programmer, for 
writing misfeatures, for being unwilling to fix their bugs. People use it as a 
way to say their shit doesn’t stink, if you’ll forgive my language. When 
someone uses it as an excuse for their bad behavior, that is bad. But it is not 
Postel’s principle that is bad, it is the people using it to justify bad 
behavior that is bad.

Let me digress into an example. This digression is very relevant as it is 
directly analogous to this situation as you will see.

For better or worse, the PGP software did not implement Blowfish. Personally, I 
think it was for worse, but it doesn’t matter. The point is that we didn’t, and 
while the reasons make for a fun anecdote, they genuinely are a digression. We 
had a problem, though, and that was that there was an implementation that 
encrypted using Blowfish even when someone’s key didn’t have Blowfish in their 
symmetric cipher preferences.

This is a violation of the protocol. It’s a bug. The protocol states that you 
are not to encrypt using a cipher that is not on the recipient’s preference 
list. Here’s where you should notice the similarity to the sign-only public 
key. In each case, the recipient has made statements to the sender and the 
sender botched them.

Our problem was that our users were getting messages that they couldn’t decrypt 
and they were justifiably irked. Our initial answer was the same as what this 
discussion advocated — passing the blame on to the other software. In each 
case, as well, the implementation that provides the bad experience is in the 
right. The other side is broken. However, it was *our* users who were getting 
the bad experience and the other people didn’t want to fix it.

What we did was to get off our high horse just a little and do the right 
thing for our users. We put Blowfish into PGP, but only on the decrypt path. It 
was never in the UI, we never encrypted to Blowfish. (For all of the wacky, 
downright stupid reasons that are orthogonal to this story.) In the case where 
someone else’s implementation was broken, we did the right thing for our users. 
We decrypted the message, and showed it to them as if everything was done 
correctly, despite the fact that someone else had a botched implementation, and 
arguably we were botching our own.
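
A minimal sketch of that decrypt-only arrangement, in Python. This is an 
illustration under stated assumptions, not the actual PGP code: the names 
ENCRYPT_CIPHERS, DECRYPT_CIPHERS, choose_cipher, and can_decrypt are 
hypothetical, and the set of ciphers offered for encryption is made up. Only 
the algorithm IDs and the implicit TripleDES fallback come from RFC 4880.

```python
# Hypothetical sketch: names and cipher sets are illustrative only.
# Symmetric algorithm IDs as assigned in RFC 4880, section 9.2.
TRIPLE_DES, CAST5, BLOWFISH, AES128, AES256 = 2, 3, 4, 7, 9

# Ciphers this implementation is willing to encrypt with (assumed set).
ENCRYPT_CIPHERS = {TRIPLE_DES, CAST5, AES128, AES256}

# Ciphers it can decrypt: the same set plus Blowfish, decrypt path only.
DECRYPT_CIPHERS = ENCRYPT_CIPHERS | {BLOWFISH}


def choose_cipher(recipient_prefs):
    """Sender side: pick the first recipient-preferred cipher we encrypt with.

    RFC 4880 treats TripleDES as implicitly last in every preference list,
    so the loop always finds a match.
    """
    for algo in list(recipient_prefs) + [TRIPLE_DES]:
        if algo in ENCRYPT_CIPHERS:
            return algo


def can_decrypt(message_cipher):
    """Receiver side: accept anything we are able to decrypt."""
    return message_cipher in DECRYPT_CIPHERS
```

With these definitions, choose_cipher([BLOWFISH, AES256]) skips Blowfish and 
picks AES256, while can_decrypt(BLOWFISH) is still true for an incoming 
message from the broken sender.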

Our allegiance was to our users. That is good software engineering, in my 
opinion. We also went behind the scenes and got the other people to do the 
right thing, but it took a while, and even after the other implementation was 
fixed, the latent issue still hung around for a long, long time until all the 
other users upgraded.

I know what some people are thinking: they want to come up with some contrived 
example in which if this had happened or that had happened, it would be bad. 
They’re right that if it was bad, it would have been bad. It’s always true that 
bad things are bad. But sometimes, ugly things are less bad than the pretty 
ones. In the case of our Blowfish example, or this one with sign-only keys, it 
doesn’t *hurt* anything to make a good experience. It doesn’t ruin security. It 
doesn’t give wrong answers. (To the contrary, it gives *right* answers.) Yeah, 
it’s ugly. It’s inelegant. If you’d prefer that I characterize it as the “least 
bad” answer as opposed to a right answer, sure. I’ll concede that. Ultimately, 
I care about good experiences for the user over purity.

However, I also have a huge live-and-let-live streak. If you disagree, 
and you want to do *your* implementation so that when your users say, “WTF” the 
answer is, “That’s what the standard says,” then more power to you! It’s your 
gun and your users’ foot, to my mind.

What I object to, and the whole reason I started this, is changing the 
standard so that it says I should shoot my users in the foot. 
That’s the real point. I’m all in favor of the standard saying, “Look here, 
doofus, don’t encrypt to sign-only keys!” I am not in favor of it saying, “When 
some doofus has encrypted to your sign-only key, OMG, whatever you do don’t 
decrypt it!”
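
The asymmetry can be sketched in a few lines of Python. This is a sketch 
under assumptions, not any real implementation: may_encrypt_to, 
check_decryption_key, and the strict switch are hypothetical names, and 
emitting a warning on the lenient path is one possible design choice. The 
flag bit values are the ones defined in RFC 4880, section 5.2.3.21.

```python
import warnings

# Key flag bits from RFC 4880, section 5.2.3.21.
FLAG_SIGN = 0x02
FLAG_ENCRYPT_COMMS = 0x04
FLAG_ENCRYPT_STORAGE = 0x08
ENCRYPT_FLAGS = FLAG_ENCRYPT_COMMS | FLAG_ENCRYPT_STORAGE


def may_encrypt_to(key_flags):
    """Sender side, always strict: never encrypt to a key that lacks
    an encryption flag."""
    return bool(key_flags & ENCRYPT_FLAGS)


def check_decryption_key(key_flags, strict=False):
    """Receiver side, left to the implementation (hypothetical policy).

    Returns True if decryption should go ahead. In lenient mode a message
    encrypted to a sign-only key is still decrypted, with a warning.
    """
    if key_flags & ENCRYPT_FLAGS:
        return True
    if strict:
        return False
    warnings.warn("message was encrypted to a key without an encryption flag")
    return True
```

A strict implementation calls check_decryption_key(flags, strict=True) and 
refuses; a lenient one uses the default and decrypts. Both satisfy a standard 
that only constrains the sender.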

You talk below about the health of the ecosystem, and this is where we agree. 
What we apparently disagree on, to my mind, is how this case is handled. 
Remember once again that this starts with a broken implementation that was 
done by 
someone else. It isn’t, in my opinion, about Postel’s principle. My 
interpretation of his principle is about how you handle gray areas, or the 
inevitable case where reasonable people might disagree about an interpretation. 
This is a case where the other implementation is flat-out wrong. My principle 
is not Postel’s. My principle is to do what’s right for the user. In these two 
cases, the right thing for the user is to decrypt the message. In other cases, 
it’s going to be the opposite. (If this were, for example, about someone who 
botched an MDC, you’d be hearing a very different thing from me, as that can 
hurt the user.) It sounds to me like yours is to follow the spec. I also think 
that you believe that there is some greater good in following the spec. I 
confess that with me, it’s far more whether I as a user would rage-quit the 
software and just stop using it.

A few more inline comments below.

OpenPGP oughta say that an implementation MUST NOT encrypt to a sign-only key,
but it should leave it at that.


I believe that this all goes back to Postel's law of "conservative in what you
emit, liberal in what you accept". You're advocating for being "resilient" in
the face of bugs in implementations, and that sounds like a good idea on paper.


Again, no, this isn’t Postel’s principle. This is about what’s right for the 
user. It’s also about freedom to implement.

I think it’s fine for you to be strict. What I object to is the meta-point 
that there are no other reasonable interpretations.


It is at this point terribly hard and time consuming to write an OpenPGP
implementation that works interoperably, because of a general expectation that
everyone be 100% GnuPG bug compatible.


Then those people are wrong.

As I mentioned above, PGP was not 100% compatible with GnuPG. It didn’t 
implement Blowfish. I’ve been involved in other implementations where we 
implemented a subset of all the possibilities of OpenPGP.

The most problematic part of subsetting OpenPGP is dealing with compression, 
but even that, if you make your keys correctly will work just fine. You can 
make an OpenPGP implementation that just does a very few things.

It's just a blip on the radar, but the case described above happened five
years into working on OpenKeychain. And it's not a fluke, I have more similar
incidents (will post about them soon).


Sure. People screw up implementations all the time. So what?

Repeating myself, if your software gives your users a bad experience, they’ll 
find other software. That software might be OpenPGP compliant, but likely 
they’ll say something like, “Why are you using OpenPGP when you could be using 
Signal?”

The standard should not forbid resiliency in the face of a bug.


This is a very reasonable sentiment for specs like HTML, we certainly wouldn't
want to outright reject a website just because of a missing </i>. But in the
context of a cryptographic protocol, this is super dangerous.


Why? You assert that, but give no evidence. There is no cryptographic problem 
here. It’s a policy problem.

Being overly relaxed in what we accept means giving attackers a large amount
of wiggling room. This is exactly what brought us EFAIL. We should learn from
that, and I hope we can do better in the future.


You used the weasel word “overly” and I think that’s the crux. The debate we’re 
having is over what “overly” means. And frankly, this has nothing to do with 
EFAIL. But that’s another discussion. I would be happy to talk about EFAIL, but 
it’s irrelevant to this discussion.

        Jon
