On Nov 8, 2018, at 3:24 AM, Vincent Breitmoser <look@my.amazin.horse> wrote:
Hey Jon,
thanks for your thoughtful reply!
And thank you for appreciating it.
Moreover, gnupg is not merely a tool, it’s a reference implementation and as
a reference implementation it needs to have meta-features for the purpose of
debugging.
I agree with the premise here, but not the conclusion. In the concrete bug
report I have, a bank(!) implemented OpenPGP in a way that included two major
blunders: 1) they encrypt to a subkey that doesn't have the flag. 2) they
include a one-pass signature packet, but then no actual signature (which isn't
a valid OpenPGP message, by spec).
The reason this happened, I strongly suspect, is exactly because they treated
GnuPG as a reference implementation: they tested that it worked against GnuPG
(or some frontend), found it worked in practice (without even a warning), and
then left it at that.
I think you’re mischaracterizing my conclusion.
My point and conclusion is that you shouldn’t let people who do stupid stuff
contaminate the ecosystem.
We are starting from people who have screwed up. They screwed up because they
didn’t understand or they didn’t care. You’re not going to make them understand
by typing harder and making your keys click louder. Most importantly of all,
you’re not going to make a stupid person smart by modifying the standard. The
whole premise that we are starting from is a botched implementation.
My conclusion is that we shouldn’t make working implementations brittle in the
belief that this will make the botched implementations suddenly correct.
Now I did go off into a rant and a jeremiad, and that may not have been clear,
because good rants and jeremiads tend to wander. So let me try to be clearer
while still being entertainingly ranty.
The purpose of a standard is for interoperability. The standard describes a
language that my software and your software can use so that together we are
better than either one separately. However, the standard is a map, it is not
the territory. The territory is the software in part and the people most of all.
If the software is inferior or brain-damaged by following the standard, then
this is very bad. Sometimes this happens accidentally. It should never, ever
happen that the standard *intentionally* forgets its place in the world and
mandates bad experiences for the users. A good software developer makes good
experiences for the users. Always. If that means breaking the standard, you
break the standard. Especially when the precipitating action is that someone
*else* broke the standard.
I never invoked Postel’s principle. If you read what I wrote again, I very
carefully, *intentionally* did not invoke it. Trust me, I know what it is, and
I have also read Thomson’s I-D. He has a point, but I think that his point is
basically just the flip side of the problem. I believe that you’re bringing it
up as a straw man to beat. Nonetheless, let’s examine it for a moment.
All virtues when taken too far end up being vices. Nearly every nourishing food
is bad for you if you eat too much. Nearly every medicine is a poison if you
use it improperly. Even moderation (which is what I am preaching here) can be
taken too far, and be mealy-mouthed wishy-washiness. Sometimes one has to take
a stand. The questions of course include which stand to take, how firmly one
takes it, and so on.
There are a lot of people who use (or used) Postel’s principle as an excuse and
justification for being a lazy programmer, for being a bad programmer, for
writing misfeatures, for being unwilling to fix their bugs. People use it as a
way to say their shit doesn’t stink, if you’ll forgive my language. When
someone uses it as an excuse for their bad behavior, that is bad. But it is not
Postel’s principle that is bad, it is the people using it to justify bad
behavior that is bad.
Let me digress into an example. This digression is very relevant as it is
directly analogous to this situation as you will see.
For better or worse, the PGP software did not implement Blowfish. Personally, I
think it was for worse, but it doesn’t matter. The point is that we didn’t, and
while the reasons make for a fun anecdote, they genuinely are a digression. We
had a problem, though, and that was that there was an implementation that
encrypted using Blowfish even when someone’s key didn’t have Blowfish in their
symmetric cipher preferences.
This is a violation of the protocol. It’s a bug. The protocol states that you
are not to encrypt using a cipher that is not on the recipient’s preference
list. Here’s where you should notice the similarity to the sign-only public
key. In each case, the recipient has made statements to the sender and the
sender botched them.
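
The preference rule described above can be sketched roughly as follows. This is a minimal illustration, not code from PGP or any other implementation: the function name `choose_cipher` is hypothetical, the algorithm IDs are the ones assigned in RFC 4880, and the implicit TripleDES fallback follows that RFC's rule that the must-implement cipher is tacitly at the end of every preference list.

```python
# Symmetric algorithm IDs as assigned in RFC 4880, section 9.2.
TRIPLE_DES = 2
CAST5 = 3
BLOWFISH = 4
AES128 = 7
AES256 = 9


def choose_cipher(recipient_prefs, sender_supported):
    """Pick the first cipher the recipient lists that the sender also supports.

    Hypothetical helper: a correct sender never picks a cipher that is
    missing from the recipient's preference list.
    """
    prefs = list(recipient_prefs)
    # TripleDES is tacitly at the end of every list if not stated explicitly.
    if TRIPLE_DES not in prefs:
        prefs.append(TRIPLE_DES)
    for algo in prefs:
        if algo in sender_supported:
            return algo
    raise ValueError("no mutually supported cipher")
```

Under these assumptions, a sender that supports Blowfish still falls back to TripleDES when the recipient lists only AES-256, because Blowfish never appears among the candidates.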
Our problem was that our users were getting messages that they couldn’t decrypt
and they were justifiably irked. Our initial answer was the same as what this
discussion advocated — passing the blame on to the other software. In each
case, as well, the implementation that provides the bad experience is in the
right. The other side is broken. However, it was *our* users who were getting
the bad experience and the other people didn’t want to fix it.
What we did was to get off our high horse just a little and do the right
thing for our users. We put Blowfish into PGP, but only on the decrypt path. It
was never in the UI, we never encrypted to Blowfish. (For all of the wacky,
downright stupid reasons that are orthogonal to this story.) In the case where
someone else’s implementation was broken, we did the right thing for our users.
We decrypted the message, and showed it to them as if everything was done
correctly, despite the fact that someone else had a botched implementation, and
arguably we were botching our own.
Our allegiance was to our users. That is good software engineering, in my
opinion. We also went behind the scenes and got the other people to do the
right thing, but it took a while, and even after the other implementation was
fixed, the latent issue still hung around for a long, long time until all the
other users upgraded.
I know what some people are thinking, they want to come up with some contrived
example in which if this had happened or that had happened, it would be bad.
They’re right that if it was bad, it would have been bad. It’s always true that
bad things are bad. But sometimes, ugly things are less bad than the pretty
ones. In the case of our Blowfish example, or this one with sign-only keys, it
doesn’t *hurt* anything to make a good experience. It doesn’t ruin security. It
doesn’t give wrong answers. (To the contrary, it gives *right* answers.) Yeah,
it’s ugly. It’s inelegant. If you’d prefer that I characterize it as the “least
bad” answer as opposed to a right answer, sure. I’ll concede that. Ultimately,
I care about good experiences for the user over purity.
However, I also have a huge live-and-let-live streak. If you disagree,
and you want to do *your* implementation so that when your users say, “WTF” the
answer is, “That’s what the standard says,” then more power to you! It’s your
gun and your users’ foot, to my mind.
What I object to, and the whole reason I started this, is changing the
standard so that it says I should shoot my users in the foot.
That’s the real point. I’m all in favor of the standard saying, “Look here,
doofus, don’t encrypt to sign-only keys!” I am not in favor of it saying, “When
some doofus has encrypted to your sign-only key, OMG, whatever you do don’t
decrypt it!”
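
One way to picture that split between the sending and receiving side is the sketch below. It is an illustration under stated assumptions, not any implementation's actual behavior: the `Key` class and both function names are hypothetical, the flag bits are those defined in RFC 4880, and returning a diagnostic string on the receive path is just one possible design choice.

```python
from dataclasses import dataclass

# Key flag bits as defined in RFC 4880, section 5.2.3.21.
FLAG_SIGN = 0x02
FLAG_ENCRYPT_COMMS = 0x04
FLAG_ENCRYPT_STORAGE = 0x08
ENCRYPT_FLAGS = FLAG_ENCRYPT_COMMS | FLAG_ENCRYPT_STORAGE


@dataclass
class Key:
    """Hypothetical stand-in for a parsed (sub)key; not a real library type."""
    key_id: str
    flags: int


def check_encrypt_target(key: Key) -> None:
    """Sender side: refuse to encrypt to a key with no encryption flag."""
    if not key.flags & ENCRYPT_FLAGS:
        raise ValueError(f"key {key.key_id} is not flagged for encryption")


def decrypt_note(key: Key):
    """Receiver side: decryption always proceeds.

    Returns None for a properly flagged key, otherwise a diagnostic string
    that the caller may log or ignore.
    """
    if key.flags & ENCRYPT_FLAGS:
        return None
    return f"sender encrypted to key {key.key_id}, which lacks an encryption flag"
```

The sender-side check is where the MUST NOT lives. The receiver-side function never blocks anything, so a strict implementation could choose to act on the note while a lenient one ignores it.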
You talk below about the health of the ecosystem, and this is where we agree.
What we apparently disagree on to my mind is how this case is handled. Remember
once again that this starts with a broken implementation that was done by
someone else. It isn’t, in my opinion, about Postel’s principle. My
interpretation of his principle is about how you handle gray areas, or the
inevitable case where reasonable people might disagree about an interpretation.
This is a case where the other implementation is flat-out wrong. My principle
is not Postel’s. My principle is to do what’s right for the user. In these two
cases, the right thing for the user is to decrypt the message. In other cases,
it’s going to be the opposite. (If this were, for example, about someone who
botched an MDC, you’d be hearing a very different thing from me, as that can
hurt the user.) It sounds to me like yours is to follow the spec. I also think
that you believe that there is some greater good in following the spec. I
confess that with me, it’s far more whether I as a user would rage-quit the
software and just stop using it.
A few more inline comments below.
OpenPGP oughta say that an implementation MUST NOT encrypt to a sign-only key,
but it should leave it at that.
I believe that this all goes back to Postel's law of "conservative in what you
emit, liberal in what you accept". You're advocating for being "resilient" in
the face of bugs in implementations, and that sounds like a good idea on paper.
Again, no, this isn’t Postel’s principle. This is about what’s right for the
user. It’s also about freedom to implement.
I think it’s fine for you to be strict, what I object to is the meta-point,
that there are no other reasonable interpretations.
It is at this point terribly hard and time consuming to write an OpenPGP
implementation that works interoperably, because of a general expectation that
everyone be 100% GnuPG bug compatible.
Then those people are wrong.
As I mentioned above, PGP was not 100% compatible with GnuPG. It didn’t
implement Blowfish. I’ve been involved in other implementations where we
implemented a subset of all the possibilities of OpenPGP.
The most problematic part of subsetting OpenPGP is dealing with compression,
but even that, if you make your keys correctly will work just fine. You can
make an OpenPGP implementation that just does a very few things.
It's just a blip on the radar, but the case described above happened five years
into working on OpenKeychain. And it's not a fluke, I have more similar
incidents (will post about them soon).
Sure. People screw up implementations all the time. So what?
Repeating myself, if your software gives your users a bad experience, they’ll
find other software. That software might be OpenPGP compliant, but likely
they’ll say something like, “Why are you using OpenPGP when you could be using
Signal?”
The standard should not forbid resiliency in the face of a bug.
This is a very reasonable sentiment for specs like HTML, we certainly wouldn't
want to outright reject a website just because of a missing </i>. But in the
context of a cryptographic protocol, this is super dangerous.
Why? You assert that, but give no evidence. There is no cryptographic problem
here. It’s a policy problem.
Being overly relaxed in what we accept means giving attackers a large amount of
wiggling room. This is exactly what brought us EFAIL. We should learn from that,
and I hope we can do better in the future.
You used the weasel word “overly” and I think that’s the crux. The debate we’re
having is over what “overly” means. And frankly, this has nothing to do with
EFAIL. But that’s another discussion. I would be happy to talk about EFAIL, but
it’s irrelevant to this discussion.
Jon
_______________________________________________
openpgp mailing list
openpgp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/openpgp