Re: [openpgp] review of the SOP draft

I've incorporated some of your suggestions from this e-mail directly in
the draft now.  Thanks a lot for them!  more discussion below…

On Tue 2019-11-12 12:26:00 -0500, Antoine Beaupré wrote:

I guess what I'm wondering is how I would make this work with my yubikey
at all. Or maybe I got this backwards and the yubikey interface is what
should implement sop directly?


I have no idea about how to make it work with hardware tokens.  I remain
frankly unconvinced about the security tradeoffs for most uses of
hardware tokens (something i guess i need to actually write down more
formally), and i'm unlikely to spend a lot of time developing how to
integrate them with `sop`.

I would only assume that some `sop` implementation would choose to carve
out `@MYHSM:xxx` as an input space for `KEY`-style inputs, where `xxx`
is some form of addressing scheme that indicates which secret key on
which device should be used to interact with the device.

But sorting out the addressing scheme alone is problematic enough that i
don't plan to incorporate any of that detail in this draft.  A well-formed,
thorough yet compact merge request might convince me otherwise, but
color me skeptical at the moment.

I find those examples confusing. Multiple arguments, in particular,
seems ambiguous. Is it "CERT DATA"? or "CERT DATA"?


???  i think those are the same thing, but i'll just assume you meant
"DATA CERTS" at the end.  The answer is that there must be exactly one
SIGNATURE object to verify and there may be multiple certs, so the only
possible way to do it is SIGNATURE first, then CERTS.
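
A minimal sketch of that ordering from the caller's side, in Python.  The
helper name `build_verify_argv` is hypothetical, and rejecting an empty
certificate list is an assumption on my part rather than something stated
above:

```python
from typing import List, Sequence


def build_verify_argv(signature: str, certs: Sequence[str]) -> List[str]:
    """Build the argument vector for `sop verify SIGNATURE CERTS...`.

    Exactly one SIGNATURE comes first, followed by one or more CERTS.
    The signed data itself is not an argument: it is fed on stdin.
    """
    if not certs:
        # Assumption: verifying against zero certificates is treated as a
        # caller error in this sketch.
        raise ValueError("at least one certificate is required")
    return ["sop", "verify", signature, *certs]
```

With `build_verify_argv("announcement.txt.asc", ["alice.pgp"])` this yields
the same command line as the `sop verify` example in this thread, with
`announcement.txt` still supplied on standard input.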


Ah, yes, sorry about this. I was specifically referring to:

    sop verify announcement.txt.asc alice.pgp < announcement.txt

And `"CERT DATA" or "DATA CERT"`?

And I guess where we differ is I am not sure it's that clear that the
first argument of a series can be different from the rest...


well, the middle arguments of a series are *definitely* hard to
distinguish, so the only plausible distinctions when you've got
one-vs-many positional arguments is whether the "one" goes first or
last.  But let's follow up on that over at

    https://gitlab.com/dkg/openpgp-stateless-cli/issues/7

and in particular, on your ongoing merge request at:

    https://gitlab.com/dkg/openpgp-stateless-cli/merge_requests/13

If anyone else has strong feelings about this choice, please take a look
over there and follow up, either here on list, or on those tickets.

How do we generate purpose-specific subkeys?


With `sop`, you do not ;)


Sad.


I be

If you want to do fancy OpenPGP certificate generation, you do that with
your toolkit's own fancy features.

I've opened https://gitlab.com/dkg/openpgp-stateless-cli/issues/2 to
track that maybe we do want some rough guidance about what kinds of
secret key capabilities we want any `sop` to be able to generate here
though.


Commented on that. Would still love to see a more decent way to handle
subkeys because that's a really hard thing to do in existing
implementations.

At least creating split subkeys by default would be a great start, IMHO.


This is exactly the sort of decision that i want to see implementers
make, so we can document their choices.  `sop` is not about fancy key
management, nor should it be.  As i wrote in
https://gitlab.com/dkg/openpgp-stateless-cli/issues/2, i do not want
`sop` to place any detailed constraints here, i just want the generated
key to be functional for use with `sop`.

We don't mandate UTF-8 unless the signer claims that the thing being
signed is text.  If so, it really does need to be UTF-8.  I have no
patience for non-UTF-8-encoded text in 2019.

OpenPGP embeds UTF-8 explicitly in its User ID formatting.  Any OpenPGP
implementation must already handle UTF-8.

if anyone thinks that dealing with different character encodings is a
good idea, please consider that the character encoding is not recorded
in the signature itself, leading to charset-switching attacks like those in
https://dkg.fifthhorseman.net/notes/inline-pgp-harmful/

Do you think this information belongs in this document?


Absolutely, otherwise it looks like an arbitrary decision.


I've just added the following subsection with "Guidance for
Implementers":

    Text is always UTF-8 {#utf8}
    --------------------

    Various places in this specification require UTF-8 {{RFC3629}} when
    encoding text. `sop` implementations SHOULD NOT consider textual data in any
    other character encoding.

    OpenPGP Implementations MUST already handle UTF-8, because various parts of
    {{RFC4880}} require it, including:

     - User ID
     - Notation name
     - Reason for revocation
     - ASCII-armor Comment: header

    Dealing with messages in other charsets leads to weird security failures
    like {{Charset-Switching}}, especially when the charset indication is not
    covered by any sort of cryptographic integrity check.
    Restricting textual data to `UTF-8` universally across the OpenPGP
    ecosystem eliminates any such risk without losing functionality, since `UTF-8`
    can encode all known characters.


If anyone thinks that's either wrong or insufficient, please send
corrections/improvements!
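
For illustration, here is one way a caller (or an implementation) might
check that data claimed to be text really is UTF-8.  This is only a sketch:
the function name `is_utf8_text` is hypothetical, and a successful strict
decode is a necessary condition for "this is UTF-8 text", not proof that the
bytes were ever meant as text.

```python
def is_utf8_text(data: bytes) -> bool:
    """Return True if `data` decodes as strict UTF-8, False otherwise."""
    try:
        # errors="strict" makes the decoder reject any malformed sequence
        # instead of silently replacing it.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

Text encoded in some other charset (for example Latin-1 with accented
characters) will usually fail this check, which is the point: it gets
rejected rather than guessed at.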

I wish we didn't have to deal with this distinction, but if so, maybe we
should clarify the source of it here. Otherwise it comes as a surprise
to me, an experienced OpenPGP user.


As a user, you shouldn't ever need to see it.  As an implementer, you
do need to think about it.

I've added the following text to the discussion of `sop sign`:

    `--as=binary` SHOULD result in an OpenPGP signature of type 0x00
    ("Signature of a binary document").
    `--as=text` SHOULD result in an OpenPGP signature of type 0x01 ("Signature
    of a canonical text document").
    See section 5.2.1 of {{RFC4880}} for more details.

And i've added a new subsection in "Guidance for Consumers":

    Choosing between `--as=text`  and `--as=binary`
    ------------------------------------------------------

    A program that invokes `sop` to generate an OpenPGP signature typically
    needs to decide whether it is making a text or binary signature.

    By default, `sop` will make a binary signature.
    The caller of `sop sign` should choose `--as=text` only when it knows that:
     - the data being signed is in fact textual, and encoded in `UTF-8`, and
     - the signed data might be transmitted to the recipient (the verifier of
       the signature) over a channel that has the propensity to transform
       line-endings.

    Examples of such channels include FTP ({{RFC959}}) and SMTP ({{RFC5321}}).
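
The decision rule in that guidance can be sketched roughly as follows.  The
function and parameter names are hypothetical; the caller is assumed to
already know whether the data is meant as text and whether the transport
might rewrite line endings, since neither can be reliably inferred from the
bytes alone.

```python
def choose_as(data: bytes,
              known_textual: bool,
              channel_may_rewrite_line_endings: bool) -> str:
    """Pick the value for `--as=` when calling `sop sign`.

    Returns "text" only when both conditions from the guidance hold and
    the data decodes as UTF-8; otherwise falls back to the default,
    "binary".
    """
    if not (known_textual and channel_may_rewrite_line_endings):
        return "binary"
    try:
        data.decode("utf-8")  # strict decoding is the default
    except UnicodeDecodeError:
        # Claimed to be text, but not UTF-8: do not make a text signature.
        return "binary"
    return "text"
```

The result would then be passed along as `--as=text` or `--as=binary`.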

What I'm saying is the `sop sign` example is error prone. Forget the `<`
and the mandated order and you might reverse the signing key and the
message.


sure. if you screw up any API, you can screw up any API :)

If `sop decrypt` fails for any reason and the identified `--session-key-out`
file already exists in the filesystem, the file will be unlinked.

 
This seems dangerous! Why do we delete a file we haven't created?
Explain.


We don't want the user to run `sop`, and then inspect a file that was
already in the filesystem thinking that it is `sop`'s output.  If you
think that's a bad decision, please suggest what we should do
differently.


Maybe we should not overwrite existing files at all and fail earlier?


I think you're proposing that if the `--session-key-out` file already
exists in the filesystem, that should be an error in the first place.
I'd be happy to entertain that idea, if anyone wants to provide text for
it.

If you decide to try to write it up, please think about how it works for
the other scenarios where `sop` can produce output on more than stdout.
it would be nice if these mechanisms all had the same behavior.
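
To make the "fail earlier" idea concrete, here is a sketch of how an
implementation might refuse to touch a pre-existing output file.  This
illustrates the proposal only, not what the draft currently says, and
`open_new_output` is a hypothetical helper name.

```python
def open_new_output(path: str):
    """Open `path` for binary writing, but only if it does not exist yet.

    Mode "x" requests exclusive creation: open() raises FileExistsError
    if the path already exists, so a stale file is never overwritten,
    truncated, or later mistaken for fresh output.
    """
    return open(path, "xb")
```

An implementation taking this route could call the helper for every output
destination before doing any cryptographic work, so that all such outputs
behave the same way.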

[`--with-session-key`] enables decryption of the `CIPHERTEXT` using the
session key directly against the `SEIPD` packet.
This option can be used multiple times if several possible session keys
should be tried.


What happens if both "in" and "out" are provided? I can venture a guess,
but it would be important to make that explicit as there can be horrible
bugs there.


Please do venture a guess, in the form of proposed text! I'd also love
to hear what the horrible bugs are.  I don't see them.


I would argue that both options should not be provided at once. One
implementation that could come up would be that the program attempts to
read the file as it's writing it, truncating the precious key before it
has time to read it.


Ah, you're not talking about providing both options -- you're talking
about providing both options pointing *at the same file*.  i agree, that
sounds like a bad idea, but it's a bad idea for *any* pair of input and
output fields.
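
One possible guard against that failure mode, sketched in Python.  The
helper name `same_file` is hypothetical, and this only covers inputs and
outputs that are named as filesystem paths:

```python
import os


def same_file(input_path: str, output_path: str) -> bool:
    """Return True if both paths currently refer to the same file.

    os.path.samefile() compares the underlying device and inode, so it
    also catches symlinks and hard links, not just identical strings.
    A path that does not exist yet cannot collide with anything.
    """
    if not (os.path.exists(input_path) and os.path.exists(output_path)):
        return False
    return os.path.samefile(input_path, output_path)
```

A caller could run this check over every input/output pair and bail out
before opening anything for writing.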

We can continue the discussion in issue #13, but the TL;DR: is that I
agree that stripping trailing control characters is a good idea, but
disagree about whitespace in general.


I hope other folks will weigh in on #13.  There's interesting discussion
going on there about what properties it's reasonable to expect from a
"well-formed" password.

I don't know how OpenPGP packets are built. Can't we show the signature
on the output of decrypt?


Absolutely not.  Mixing the cleartext output with the signature
verification stream is a classic cause of failures.  What if the
cleartext data happens to "look like" a signature verification?  how is
the consumer supposed to distinguish between them?

It is critical to keep them separate.

But if the primary operation is decryption, i don't think we should fail
on signature validity for reasons outlined above.


But that assumes decryption is the primary operation.


The subcommand is "sop decrypt".  By definition, "decrypt" is the
primary operation.

In the context where all my email traffic is encrypted with OpenPGP,
for example, decryption is not the primary operation anymore. I *do*
want to fail properly on signature validity, it becomes a primary
operation when encryption is "default"...


You want a *failure* in the sense that you think that an MUA shouldn't
show the user the cleartext of the message if no valid signature can be
found?

This is surprising to me, and i know of no MUA that does this.

File descriptors could be passable as distinct options, like
--sign-with-fd for --sign-with.


This is an interesting proposal, though i don't see how --sign-with=@FD:3
is much different from --sign-with-fd=3  -- i guess it lets you use
files that are literally named @FD:3 ?  Is that important?


It's less magic, more explicit, and correlates better with other
commandline APIs I have encountered.


it looks to me like it would make the description of the command line
significantly more verbose, but i'm willing to consider it if someone
wants to propose a specific textual change.

Say you think you are in a trusted directory with "CERTS" that you want
to encrypt to. You call:

  sop encrypt * < /tmp/file > /tmp/file.pgp

Except you made a mistake and the attacker has control of the current
directory, and injects a file named (say) @ENV:SOMETHING. Assuming they
have control over the SOMETHING environment, they can now add an
encryption key to the message.


if the attacker has control of the directory, they can inject an
encryption key in the first place, right?  i don't think @ENV makes this
any worse...

Control of the environment is kind of a stretch, I must admit, but in
certain environments (most notably web servers), a *lot* of stuff can
end up there and it shouldn't be completely trusted this way.


perhaps when an `@`-prefixed argument is supplied, if a file with a
matching literal name exists, `sop` should fail with an error because of
the ambiguity?  This seems like an unlikely and unusual situation, but i
can see how it might be worth thinking about it.
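
Roughly, the check i have in mind would look like the sketch below.  The
names `classify_argument` and `AmbiguousArgument` are hypothetical, only
the `@FD:` and `@ENV:` forms mentioned in this thread are handled, and
rejecting unknown `@` prefixes is an extra assumption of the sketch.

```python
import os
from typing import Tuple


class AmbiguousArgument(Exception):
    """An `@`-prefixed argument also names an existing file."""


def classify_argument(arg: str) -> Tuple[str, str]:
    """Classify an argument as ("fd", n), ("env", name) or ("file", path)."""
    if not arg.startswith("@"):
        return ("file", arg)
    if os.path.exists(arg):
        # A literal file with this exact name exists, so we cannot tell
        # whether the caller meant the file or the special designator.
        raise AmbiguousArgument(arg)
    for prefix, kind in (("@FD:", "fd"), ("@ENV:", "env")):
        if arg.startswith(prefix):
            return (kind, arg[len(prefix):])
    raise ValueError("unknown special designator: " + arg)
```

In the `sop encrypt *` scenario above, a planted file literally named
`@ENV:SOMETHING` would then cause an error instead of silently pulling a
certificate from the environment.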

Perhaps it would be interesting to contrast a MR that contains this
guidance with a MR that switches over to the --with-$foo-fd= approach
you've suggested.

patches welcome, particularly for this kind of editorial cleanup :)


https://gitlab.com/dkg/openpgp-stateless-cli/merge_requests/12


thanks, merged ;)

It would also be great if we could explain where those magic numbers
come from in the first place. I suspect they were chosen to not overlap
with existing error codes, but that's just a guess.


Justus picked 69 in his OpenPGP Interoperability Test Suite.  I chose
the others as "reasonable-sized primes" just for fun.  I don't think
this information belongs in this document, as it doesn't matter.


I love this kind of information in text, it makes it less dull. :p


i think we have a difference of opinion here.  maybe this is ok in an
acknowledgements section, but i definitely don't want to read discursive
stories when i'm trying to extract technical information.

If you want to supply a patch for the acknowledgements section, i'd be
willing to consider merging it.

`sop probe` would do the minimal amount of work required to determine
which keys ("signers") to consider when decrypting, then call `decrypt`
properly.


I don't think this is "stateless", and i don't think it's
well-specified.

I also don't think it's particularly useful to know *that* a thing was
signed (by some arbitrary certificates) if you haven't already made some
determination that the certificate is meaningful in the current context.

`sop probe` could also do the general task of parsing OpenPGP messages
into packets and stuff like that.


this doesn't sound like a subcommand at the same level of abstraction
that `sop` is aiming for.

So i'm still not convinced.  But if you (or anyone) wants to make a
merge request that proposes new subcommands, i'm definitely up for
reviewing them.

Compression {#compression}

[…]

How about decryption? Do we attempt decompression during decrypt?


It will be interesting to see what implementers do!  I've left `sop`
deliberately agnostic there, and i would like to learn from test suites
what the answer is.


Should we make that decision clearer in the document?


it's not really a decision :) Hopefully my description of how it would
work as part of an interop suite will give a hint of this kind of
approach.

Thanks for the discussion!

        --dkg
