Re: PGP Message Exchange Formats draft-ietf-pgp-formats00.txt

On 7/31/97 02:02, jude shabry <jude(_at_)pgp(_dot_)com> wrote:

INTERNET-DRAFT                                               Gail Haspert
Security                                                           PGP, Inc.
Expires in six months                                  Editor
                                                            July 30 1997

PGP Message Exchange Formats
<draft-ietf-pgp-formats00.txt>


Some comments on the contents, syntax and semantics of the aforementioned
document.

Regards
-- Gerald

-BEGIN-

2.4 Radix-64 conversion

 ...converting
 the raw 8-bit binary octet stream to a stream of printable ASCII
 characters, called Radix-64 encoding or ASCII Armor.


Isn't the term "octet" a synonym for 8-bit data entities, defined as such in 
RFC ????

"ASCII characters" are really "7-bit US-ASCII octets"

 Similar to MIME encoding, the process of Radix-64 encoding an


MIME defines several encodings, use an explicit reference to "base64" encoding
(RFC ???? Section ?.?)

 ...8-bit binary stream of data...


"binary octet stream"

 is as follows. Every three subsequent 8 bit binary octets...


simply "octets" instead of "8 bit binary octets" (8-bit binary octets).

 are mapped into four subsequent 6-bit ASCII characters.


what are 6-bit ASCII characters? use "US-ASCII". "6-bit coding subset of 
US-ASCII"

In general the process is already defined and phrased in some MIME documents, so
that wording SHOULD be used for consistency IMHO.

 Radix-64 encoding also appends a CRC to detect transmission errors.


This really is a difference from base64.

How is padding handled? This is explicitly stated in the base64 specs.

 This radix-64 conversion, is a wrapper around the binary PGP
 messages, and is used to protect the binary messages during
 transmission over non-binary channels, such as Internet Email.


How does this interoperate with 8bitMIME capable ESMTP mailers? Add an
explicit reference.

 The following table defines the mapping.  The characters used are the
 upper- and lower-case letters, the digits 0 through 9, and the
 characters + and /.  The carriage-return and linefeed characters
 aren't used in the conversion, nor is the tab or any other character
 that might be altered by the mail system. The result is a text file
 that is "immune" to the modifications inflicted by mail systems.


Is this in any way different from MIME base64 encoding?

 To encode an arbitrary 3 octets, separate the bit stream logically


how is an octet transformed into a bit stream, which bit is first...?

 into four 6-bit groups, calculate the decimal value of the 6-bit


"decimal value" ??? decimal is because of the table below, but not
required by the encoding process? skip "decimal", use "use as an index".

 group, and replace it with the corresponding character from the table
 below.

 Decoding is the exact opposite operation.


Really? Nothing more :) This is a to-be standard, defining things not merely
stating them to happen...

Add one or more paragraphs on decoding...

                 6-bit Value Mapped to Character Encoding


See? no "decimal" mentioned in the table heading

          Hex octet           0x17      0x3A      0xA3
Example:   Binary stream      00010111  00111010  10100011
          Translate to 6bit  000101  110011  101010  100011
          Decimal              5      51      42      35
         Final Character       F       z       q       j


OK, this serves as example and definition on bit-streaming...
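
A minimal Python sketch of the mapping shown in the example, assuming the
most significant bit of each octet comes first and that the table uses the
same 64-character order as MIME base64 (the draft's table is not quoted here,
so the alphabet order is an assumption). The function name is made up for
illustration; padding of incomplete groups is deliberately left out.

```python
import string

# Assumed alphabet order: A-Z, a-z, 0-9, '+', '/' (as in MIME base64).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_three_octets(octets: bytes) -> str:
    """Map exactly three octets to four radix-64 characters."""
    if len(octets) != 3:
        raise ValueError("this sketch handles complete 3-octet groups only")
    # Concatenate the octets into one 24-bit value, first octet most significant.
    group = (octets[0] << 16) | (octets[1] << 8) | octets[2]
    # Cut the 24 bits into four 6-bit table indexes, highest bits first.
    indexes = [(group >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indexes)
```

With the octets 0x17 0x3A 0xA3 from the example, the indexes are 5, 51, 42, 35
and the result is "Fzqj", which matches the draft.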

But, what to do if there are not enough octets as input, how to pad?
And how to detect the end of the to-be-decoded stream, as there is no
length field anywhere...

 It is possible to use PGP to convert any arbitrary file to ASCII
 Armor.  When this is done, PGP tries to compress the data before it
 is converted to Radix-64.


The terms "ASCII Armor" and "Radix-64" are used interchangeably. Is that
intentional?

What about MIME base64 content encoding of "binary" PGP message data?

binary data --> ASCII Armor (as per 2.4.1 below) --> compress ? -->
Radix-64 ---> MIME...

2.4.1 ASCII Armor Formats

 When PGP encodes data into ASCII Armor, it puts specific headers
 around the data, so PGP can reconstruct the data at a future time.
 PGP tries to inform the user what kind of data is encoded in the
 ASCII armor through the use of the headers.

 ASCII Armor is created by concatenating the following data:


ok, we are now creating ASCII Armor...


      - An Armor Headerline, appropriate for the type of data
      - Armor Headers
      - A blank line


What is a line, BTW? Obviously now we need an RFC822 type definition
of "line" (sequence of octets terminated by a <CR><LF> sequence?)

      - The ASCII-Armored data


Oops. This is a recursive definition, as we are about to specify how
ASCII-Armored data is created. Obviously this should read
"The Radix-64 encoded data"

      - An Armor Checksum


Same as above, should read "A Radix-64 Checksum of the encoded data".

      - The Armor Tail, which depends on the Armor Headerline.

 An Armor Headerline is composed by taking the appropriate headerline
 text surrounded by five (5) dashes (-) on either side of the


(-) --> (US-ASCII 0x?? "-")

 headerline text.  The headerline text is chosen based upon the type
 of data that is being encoded in Armor, and how it is being encoded.


Is it "Armor" or "ASCII Armor" or "ASCII-Armor" or what?

 Headerline texts include the following strings:


Is an enumeration to follow or an example?

  BEGIN PGP MESSAGE -- used for signed, encrypted, or compressed files
  BEGIN PGP PUBLIC KEY BLOCK -- used for transferring public keys
  BEGIN PGP MESSAGE, PART X/Y -- used for multi-part messages, where
                                  the armor is split amongst Y files,
                                  and this is the Xth file out of Y.


Use a tabular approach here, like:

    "BEGIN PGP MESSAGE"            used for signed, encrypted, or
                                   compressed files

    "BEGIN PGP PUBLIC KEY BLOCK"   used for transferring public keys

    "BEGIN PGP MESSAGE, PART X/Y"  used for multi-part messages, where
                                   the armor is split amongst Y files,
                                   and this is the Xth file out of Y.

Make clear whether this is an exhaustive list ("definition") or only
examples.

How are new value strings introduced, and what to do with
undefined/unknown strings?

 The Armor Headers are pairs of strings that can give the user or the
 receiving PGP message block some information about how to decode or
 use the message.


Strange concept, however :)

 ...The Armor Headers are a part of the armor, not a
 part of the message, and hence should not be used to convey any
 important information, since they can be changed in transport.


How could this happen? They are to contain base64/radix-64 characters
considered to be mail safe. Any other encoding of the "ASCII Armor" is
to be transparent and is removed/reversed before the ASCII Armor is
passed to the PGP layer.

What about BINARY Armor? Maybe introduce something streamable like GIF or PNG.

 The format of an Armor Header is that of a key-value pair.  PGP
 should consider improperly formatted Armor Headers to be corruption
 of the ASCII Armor.  Unknown keys should be reported to the user, but
 PGP should continue to process the message. Currently defined Armor
 Header Keys include "Version" and "Comment," which define the PGP
 Version used to encode the message and a user-defined comment.


This paragraph obviously deals with the above mentioned "Armor Headerlines",
not "Armor Headers".

Armor Headerlines are to be built after RFC822 lines, thus possibly adopting
the folding and encoding (Q-encoding) rules defined there and elsewhere.

Thus Headerlines are just like "Subject:" or "Comment:" header lines of email
messages.

The Armor Checksum is a 24-bit CRC converted to four characters of
radix-64 encoding, prepending an equal-sign (=) to the four character code.


"... to the resulting four characters."

The CRC is computed by using the generator 0x864CFB and an initialization
of 0xB704CE.  The accumulation is done on the data before it is converted
to radix-64, rather than on the converted data.  For more information on
CRC functions, the reader is asked to look at chapter 19 of the book
"C Programmer's Guide to Serial Communications," by Joe Campbell.


This complete para above is not indented properly.

Got the ISBN of that book. A References section is missing from the draft.
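
A short Python sketch of the checksum as described in the quoted paragraph,
assuming a plain bit-by-bit, most-significant-bit-first CRC with no final
XOR (the draft does not say this explicitly) and assuming the four
characters come from MIME base64. The names crc24 and armor_checksum are
made up for illustration.

```python
import base64

CRC24_INIT = 0xB704CE       # initialization value quoted from the draft
CRC24_GENERATOR = 0x864CFB  # generator quoted from the draft (bit 24 implicit)


def crc24(data: bytes) -> int:
    """Accumulate the 24-bit CRC over the data before radix-64 conversion."""
    crc = CRC24_INIT
    for octet in data:
        crc ^= octet << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                # Drop the bit shifted out of the 24-bit register, then reduce.
                crc = (crc & 0xFFFFFF) ^ CRC24_GENERATOR
    return crc & 0xFFFFFF


def armor_checksum(data: bytes) -> str:
    """Return '=' followed by the four characters encoding the 3-octet CRC."""
    crc_octets = crc24(data).to_bytes(3, "big")
    return "=" + base64.b64encode(crc_octets).decode("ascii")
```

Three octets always encode to exactly four characters without padding, which
is why a 24-bit CRC fits the "four character code" wording.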

 The Armor Tail is composed in the same manner as the Armor
 Headerline, except the string "BEGIN" is replaced by the string
 "END."


It's "Headerline", but "Tail"; maybe use "Tail line" and "Header line".

An example of a short message in ASCII Armor is missing here now.
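
As a starting point for such an example, a small Python sketch of just the
first and last line, following the "five dashes on either side" and
"BEGIN replaced by END" wording quoted above. Both function names are
hypothetical, and the headers, body and checksum lines are not covered.

```python
def armor_headerline(text: str) -> str:
    """Surround the headerline text with five dashes on either side."""
    return "-----" + text + "-----"


def armor_tail(headerline: str) -> str:
    """Build the tail from a headerline by replacing BEGIN with END."""
    return headerline.replace("BEGIN", "END", 1)


first_line = armor_headerline("BEGIN PGP MESSAGE")
last_line = armor_tail(first_line)
```

This gives "-----BEGIN PGP MESSAGE-----" and "-----END PGP MESSAGE-----".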

3. Data Element Formats
This section describes the data elements used by PGP.


An empty line is missing as separator.

3.1 Scalar numbers

Scalar numbers are unsigned, and are always stored in big-endian format.
Thus, the value of a two-octet scalar is ((n[0] << 8) + n[1]). The value
of a four-octet scalar is ((n[0] << 24) + (n[1] << 16) + (n[2] << 8) + n[3]).


Add example.
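
Something like the following Python sketch could serve, assuming n is simply
the sequence of octets in the order they are stored. The function name is
made up for illustration.

```python
def scalar_value(octets: bytes) -> int:
    """Interpret a sequence of octets as an unsigned big-endian scalar."""
    value = 0
    for octet in octets:
        # Each step shifts the accumulated value up by one octet.
        value = (value << 8) + octet
    return value


two_octet = scalar_value(bytes([0x01, 0x02]))                # (0x01 << 8) + 0x02
four_octet = scalar_value(bytes([0x12, 0x34, 0x56, 0x78]))
```

The two-octet scalar [01 02] has the value 258, and the four-octet scalar
[12 34 56 78] has the value 0x12345678.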

The string of octets [00 01 01] forms an MPI that has a value of 1. The string
[00 09 01 ff] forms an MPI with the value of 511.


reformat into separate paragraphs and use uppercase characters for hex digits
A..F.

--> The string of octets [00 01 01] forms a MPI that has a value of 1.
-->
--> The string of octets [00 09 01 FF] forms a MPI with the value of 511.

Hmm, thus MPIs are quite different from Radix-64 bit streams, as the bit stream
is aligned differently. But:

Additional rules:

The size of an MPI is ((MPI.length + 7) / 8) + 2.


--> The size of a MPI in octets is ((MPI.length + 7) / 8) + 2.
                ==    =========

The length field of an MPI describes the length starting from its most
significant non-zero bit. Thus, the MPI [00 02 01] is not formed correctly.
It should be [00 01 01].


This clears things up a bit, but this should be placed before the Example 
anyway.
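
To make the rules concrete, a Python sketch under the assumption that an MPI
is a two-octet big-endian bit count followed by the value octets (this is
what the two examples suggest; the draft's definition is not quoted above).
The function names are hypothetical.

```python
def encode_mpi(value: int) -> bytes:
    """Prefix the value with its length in bits as a two-octet scalar."""
    bit_length = value.bit_length()
    return bit_length.to_bytes(2, "big") + value.to_bytes((bit_length + 7) // 8, "big")


def decode_mpi(octets: bytes) -> int:
    """Decode one MPI and reject the malformed cases mentioned in the draft."""
    bit_length = (octets[0] << 8) + octets[1]
    size = (bit_length + 7) // 8 + 2   # total size in octets, per the rule above
    if len(octets) != size:
        raise ValueError("MPI size does not match its length field")
    value = int.from_bytes(octets[2:], "big")
    # The length must count from the most significant non-zero bit.
    if value.bit_length() != bit_length:
        raise ValueError("MPI length field is not minimal")
    return value
```

Here decode_mpi accepts [00 01 01] and [00 09 01 FF] and rejects [00 02 01],
in line with the draft's statement that the latter is not formed correctly.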

Is it "an MPI" or "a MPI" ?!

3.3 Strings

A string consists of a one-octet length and then N octets of string data.


"an (!) one-octet", better "an one-octet scalar number specifying the length,
followed by that number of octets of string data."

Add an example, say "PGP".
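
A possible example as a Python sketch, assuming the string data is carried
as raw octets (here the US-ASCII codes of "PGP"). Function names are made up
for illustration.

```python
def encode_string(data: bytes) -> bytes:
    """One-octet length followed by that many octets of string data."""
    if len(data) > 255:
        raise ValueError("a one-octet length cannot describe more than 255 octets")
    return bytes([len(data)]) + data


def decode_string(octets: bytes) -> bytes:
    """Read the length octet, then return exactly that many data octets."""
    length = octets[0]
    data = octets[1:1 + length]
    if len(data) != length:
        raise ValueError("string is shorter than its length octet claims")
    return data


encoded = encode_string(b"PGP")
```

The string "PGP" is then stored as the four octets [03 50 47 50].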

3.4 Time fields

A time field is a four-octet number containing the number of seconds
elapsed since midnight, 1 January 1970 GMT.


"four-octet scalar number". Is this the UNIX epoch...

Give one example (maybe 1 January 2000 :) ) and give the maximum date/time
possible (not taking leap seconds and such into account...:) )
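
Both values can be worked out with a few lines of Python, assuming the field
is an unsigned big-endian scalar as in section 3.1 and that GMT can be
treated as UTC. Leap seconds are ignored, and the function names are
hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_time(moment: datetime) -> bytes:
    """Seconds elapsed since the epoch as a four-octet big-endian scalar."""
    seconds = int((moment - EPOCH).total_seconds())
    return seconds.to_bytes(4, "big")  # raises OverflowError if out of range


def decode_time(octets: bytes) -> datetime:
    """Turn a four-octet time field back into a timezone-aware datetime."""
    return EPOCH + timedelta(seconds=int.from_bytes(octets, "big"))


year_2000 = encode_time(datetime(2000, 1, 1, tzinfo=timezone.utc))
latest = decode_time(b"\xff\xff\xff\xff")
```

Midnight, 1 January 2000 GMT is 946684800 seconds, stored as [38 6D 43 80].
The largest value [FF FF FF FF] corresponds to 7 February 2106, 06:28:15 GMT.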

4.2 Packet Headers

The first octet of the packet header is called the "Cipher Type Byte."


"Byte" is ambiguous?! Define "ctb" or "CTB" used below.

This first octet is also the first octet of the entire packet, or else?

And, packet headers are obviously not covered by section 3, as they are not
defined there (a design flaw of early times, I guess).

Are "Tags" (as mentioned in 4.1 Overview) always single octets, then they can
be defined in section 3.

In an old format packet, the packet tag resides in the middle four bits
((ctb & 0x3c) >> 2). The two low order bits (ctb & 0x3) denote the number


use "0x3C"

of octets following the CTB that hold the length of the packet body.


"length" --> "length-type"

Thus with ((ctb & 0xFC) >> 2) --> 64..127 for old formats...

The meaning of the length-type is:

0 - The packet has a one-octet length. The header is 2 octets long.
1 - The packet has a two-octet length. The header is 3 octets long.
2 - The packet has a four-octet length. The header is 5 octets long.
3 - The packet is of indeterminate length. The header is 1 byte long,
and the application must determine how long the packet is. If the packet
is in a file, this means that the packet extends until the end of the file.
In general, an application should not use indeterminate length packets.


Use table format:

  0   The packet has a one-octet length. The header is 2 octets long.

  1   The packet has a two-octet length. The header is 3 octets long.

  2   The packet has a four-octet length. The header is 5 octets long.

  3   The packet is of indeterminate length. The header is 1 byte long,
      and the application must determine how long the packet is. If the packet
      is in a file, this means that the packet extends until the end of the
      file.

  In general, an application should not use indeterminate length packets.
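
The old format header rules can be sketched in Python as follows, assuming
the length octets form a big-endian scalar as in section 3.1 (the quoted text
does not restate this). The function name and the returned tuple layout are
made up for illustration.

```python
# Number of length octets that follow the CTB for length-types 0, 1 and 2.
LENGTH_OCTETS = {0: 1, 1: 2, 2: 4}


def parse_old_header(packet: bytes):
    """Return (packet tag, header length in octets, body length or None)."""
    ctb = packet[0]
    tag = (ctb & 0x3C) >> 2     # middle four bits
    length_type = ctb & 0x03    # two low order bits
    if length_type == 3:
        # Indeterminate length: the header is the CTB alone.
        return tag, 1, None
    count = LENGTH_OCTETS[length_type]
    body_length = int.from_bytes(packet[1:1 + count], "big")
    return tag, 1 + count, body_length
```

For instance, the octets [99 01 00] give tag 6, a 3-octet header and a body
length of 256.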

In a new format packet, the low-order six bits (cth & 0x3f) form the packet
tag.


The above text on old format packets used "ctb"?! 0x3F

If the next octet is 191 or less, then that is the length of the packet


Use "second octet".

body. If that octet is between 192 and 223, then the length of the packet
body is held in the two octets following the CTB with the following formula:


Another design flaw with v5.0: why not use MPIs here too.

Start a new para for the 192 to 223 case and maybe use also "second octet".

Rephrase to "high-order bit clear" type of description. Maybe use hex numbers
here too. "CTB" used here again, should be "ctb"?

  bodyLen = ((p[1] - 192) * 256) + p[2] + 192;


what is p obviously pointing to?

Note that this yields a maximum packet body length of 8383 octets for this
type of packet.

If the second octet is 224 or greater (note that 224 is E0 in hexadecimal, or


"If that octet is between 224 and 255...", "0xE0" (you already used the
uppercase here)

11100000 in binary), then the low-order five bits form the length of the
packet body with the formula:

  bodyLen = 1 << (p[1] & 0x1f);

In English, the packet body length is a power of two. Its maximum length
is 2**31 octets long. Note that there is no way to form a new-format packet
that is longer than 8383 octets and not a power of two in length.

Please note that in all of these explanations, the total length of the
packet is the length of the header plus the length of the body.
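
Putting the three cases together as a Python sketch, assuming p is the
packet starting at the CTB, so p[1] is the second octet (which is how the
formulas read). The function name is hypothetical.

```python
def new_format_body_length(p: bytes) -> int:
    """Body length of a new format packet, where p[0] is the CTB."""
    second = p[1]
    if second <= 191:
        # One-octet length: the second octet is the body length itself.
        return second
    if second <= 223:
        # Two-octet length, using the formula quoted from the draft.
        return ((second - 192) * 256) + p[2] + 192
    # 224 or greater: the low-order five bits give a power of two.
    return 1 << (second & 0x1F)
```

The two-octet form covers 192 up to (31 * 256) + 255 + 192 = 8383 octets, and
the power-of-two form covers 1 up to 2**31 octets.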

4.3 Packet Tags

The packet tag denotes what type of packet the body holds. Note that old
format packets can only have tags less than 16, whereas new format packets
can have tags as great as 63. The defined tags are:

0  -- Reserved. A packet must not have a tag with this value.
1  -- Encrypted Session Key Packet


This table uses yet another format... unify.
Use Obsolete and Reserved and New as First descriptive element, maybe form
another column.

What to do with undefined packet tags encountered?

Packet Tag 16 is a new (v5.0) packet type?!

5.0 Packet Types

This section describes some of the more interesting packet types.


Rephrase. This is the definition document, should describe all packet types?!

5.1 Encrypted Session Key Packets


Add packet tag value to the heading: "5.1 Encrypted Session Key Packets (Packet
Tag 1)"

The body of this packet consists of:

      - A one-octet number giving the version number of the packet type,
       either 2 or 3. All modern versions of PGP generate type 3
       packets.


Give a complete description (is 0 valid, what about 1, how to decode...)
"All modern versions of PGP MUST generate type 3 packets and SHOULD decode type
2 packets."

- An eight-octet number that gives the ID of the public key that
 the session key is encrypted to.
      - A one-octet number giving the public-key algorithm used.


Definition of values is where?

- A string of octets that is the encrypted session key. This
 string takes up the remainder of the packet, but is not
 explicitly counted.


"...and is thus not explicitely counted."

Indentation is off...
use "scalar" instead of "number" to keep term consistent in this document.
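
For illustration, the body layout listed above can be read as in this Python
sketch. The class and function names are made up, and rejecting versions
other than 2 or 3 is an assumption, since the draft does not say what a
decoder should do with them.

```python
from typing import NamedTuple


class EncryptedSessionKey(NamedTuple):
    version: int                  # one-octet version, 2 or 3
    key_id: bytes                 # eight-octet ID of the public key
    public_key_algorithm: int     # one-octet algorithm identifier
    encrypted_session_key: bytes  # remainder of the packet body, not counted


def parse_encrypted_session_key(body: bytes) -> EncryptedSessionKey:
    """Split the packet body into the four fields in the order listed."""
    if len(body) < 10:
        raise ValueError("body is too short for version, key ID and algorithm")
    version = body[0]
    if version not in (2, 3):
        raise ValueError("unsupported packet version")  # assumed behaviour
    return EncryptedSessionKey(version, body[1:9], body[9], body[10:])
```

Because the last field is not explicitly counted, its length follows only
from the body length given in the packet header.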

5.2 Conventional Encrypted Session-Key Packets


"5.2 Conventional Encrypted Session-Key Packets (Packet Tag 3)"
What about tag 2?

This packet (tag 3) describes a session key that has been encrypted to a


omit "(tag 3)"

secondary symmetric cipher. This allows a message to be encrypted to a
number of public keys, and also to one or more pass phrases. This packet
type is new, and is not generated by PGP 2.x or PGP 5.0.


What about PGP 1.x or 0.x?

The body of this packet consists of:
      - A one-octet version number. The current version is 4.


Other versions, MUST, SHOULD... as above.

      - A one-octet number describing the symmetric algorithm used.


Definitions...where?

      - A "string-to-key" object. This is described below. It is 2, 10,
       or 11 octets long.
      - The encrypted session key itself, which is decrypted with the
       string-to-key object.


Table indentation is off...

The string-to-key object has this format:

<To be written in a future draft.>


Sure :)


For the other packet tags, the formatting needs to be checked, headings should
add the tag value, and missing descriptions and value definitions need to be
given.

5.9 User-Name Packet

A user-name packet is a counted string with a one-octet length. By
convention, it is an RFC 822 mail name, but there are no restrictions on its
content.


add tag value; what is "a (!) RFC mail name"? Give an example. Possibly the
mailbox name is meant (as in a To:, CC:, BCC: spec), or "e-mail address"?

6. Constants

This section describes the constants used in PGP Version 5.x.


What about compatibility with previous (and future) versions?!

Following tables need to be adapted to document standard.

7. Transferable Public Keys

  Public keys may transferred between PGP users. The essential elements
  of a transferable public key are:


Indentation is off..

8.1 Key Structures

Each signature certifies the RSA public key and the preceding UserID.
The RSA public key can have many UserIDs and each UserID can have
many signatures.


Use either "User ID" or "UserID" for consistency.

8.2 Public Key Certificate Packet Structure

To accommodate the addition of the subkey construction, a new
version of the public-key-certificate packet was created.  Packets
with a version of '3' are still created for PGP keys that use RSA


use either "version 3" or "version '3'" consistently.

public keys.  For specifics on version 3 packets, see RFC 1991.
The terms and formatting used here are consistent with that RFC.


"with that RFC"--> "with that document"

RSA Public Key Certificate Packet ("old format")

   a) packet structure field with CTB bits 5-2 = 0110 (2 or 3 bytes);
   b) version number = 3 (1 byte);
   c) time stamp of key creation (4 bytes);
   d) validity period in days (0 means forever) (2 bytes);


This table should be formatted according to previous table samples.

And use octets instead of bytes, and two-octets and so on...

8.7 Future Formats

Additional information on PGP formats is being provided to the public in
order for others to build compatible or complimentary systems that
support PGP.


Eh?

And:

Add reference/bibliography and how to reach the author/working group....

-END-