ietf-openpgp

Re: PGP/MIME implementors: text mode vs. binary mode?

2001-02-13 12:52:53
On 2001-02-13 11:07:30 -0800, hal(_at_)finney(_dot_)org wrote:

The signature block at the bottom of the PGP/MIME signed message
has a signature type byte in it, text vs binary.  Is it your
assumption that the value in this type byte should control the
hashing of the textual data which is earlier in the message?

Yes, of course.

I don't know if this has to be true, but it may be helpful for
implementations if it is true.

If I sign a document in binary mode, I want to be informed if
someone changes white space, or if line endings change. That is, in
my understanding, the type byte in the signature must mandate the
kind of hashing to be done; otherwise, it wouldn't be terribly useful.

In that case there are two possibilities.  One is to mandate the
value in this type byte, and thereby mandate the hashing which is
done in the message.  Receivers would hash the message data
according to the spec, and then when they came to the type byte,
they could either ignore it or check it and complain if it is the
wrong value.

Right.

The second possibility is to allow either value in this type
byte, and thereby require that the receiver read the signature
before it goes back and hashes the data (or else hash the data
both ways).

Are these the alternatives as you see them?

Basically, yes.  Hashing the data in different ways doesn't look
very elegant, and going back to the data is not an option, either
(at least in theory).

In order to avoid these two approaches, we either have to mandate
the kind of hash to be taken, or we have to make sure that both ways
of taking hashes give identical results on the data in question.


Symbolically: Take the message m, and let || denote concatenation.
In this case, a hash used in a binary signature is just this:

(b)     hash-function (m || 
                some-properties (signature packet))

A hash used in a text signature, however, is:

(t)     hash-function (strip-whitespace-and-canonicalize (m) ||
                some-properties (signature packet))
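
To make (b) and (t) concrete, here is a minimal Python sketch. It
assumes that strip-whitespace-and-canonicalize means "remove trailing
spaces and tabs from every line and use CRLF line endings", treats
some-properties (signature packet) as an opaque byte string, and uses
SHA-256 purely for illustration. The function names are hypothetical.

```python
import hashlib
import re


def strip_whitespace_and_canonicalize(m: bytes) -> bytes:
    """Assumed rule: drop trailing spaces/tabs per line, end lines with CRLF."""
    lines = re.split(rb"\r\n|\r|\n", m)
    return b"\r\n".join(line.rstrip(b" \t") for line in lines)


def binary_hash(m: bytes, sig_properties: bytes) -> bytes:
    """(b): hash the message bytes exactly as they are."""
    return hashlib.sha256(m + sig_properties).digest()


def text_hash(m: bytes, sig_properties: bytes) -> bytes:
    """(t): hash the canonicalized message."""
    return hashlib.sha256(
        strip_whitespace_and_canonicalize(m) + sig_properties
    ).digest()
```

With a message such as b"hello \nworld\n" the two functions disagree,
which is exactly why a receiver has to know which one the signer used.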

My approach was to mandate use of either (b) or (t). Bodo's
(better!) approach is to require that m be invariant under
strip-whitespace-and-canonicalize, i.e.,

        m = strip-whitespace-and-canonicalize (m),

by choosing m appropriately; this is the same basic approach which
is also used in RFC 2015.

(However, there it's not strip-whitespace-and-canonicalize, but just
canonicalize.  Also, the same canonicalization as everywhere else in
the MIME world can be applied, which makes RFC 2015 more elegant
than its successor.)
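
The invariance condition can be checked mechanically.  The sketch
below again assumes that the canonicalization strips trailing spaces
and tabs and uses CRLF line endings; is_invariant is a hypothetical
helper, not something taken from either RFC.

```python
import re


def strip_whitespace_and_canonicalize(m: bytes) -> bytes:
    """Assumed rule: drop trailing spaces/tabs per line, end lines with CRLF."""
    lines = re.split(rb"\r\n|\r|\n", m)
    return b"\r\n".join(line.rstrip(b" \t") for line in lines)


def is_invariant(m: bytes) -> bool:
    """True if m is already a fixed point of the canonicalization."""
    return m == strip_whitespace_and_canonicalize(m)


# A body with CRLF line ends and no trailing white space is invariant,
# so a binary-mode hash and a text-mode hash see the same bytes.
ok_body = b"Content-Type: text/plain\r\n\r\nhello\r\n"
bad_body = b"hello \nworld\n"  # trailing space, bare LF
```

For ok_body the check succeeds and the type byte no longer affects the
hash input; for bad_body it fails, and the sender would have to
re-encode the part before signing.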

Finally, since hash-function (a || b) can also be written in the
form

        hash-function' (hash-function'' (a), b),

where hash-function'' (a) is the intermediate state after hashing a,
implementations can just calculate hash-function'' (m), and feed in
some-properties (signature packet) when they encounter the
signature.
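
In practice this is just incremental hashing.  A minimal sketch with
Python's hashlib, where SHA-256 and the helper names start_hash and
finish_hash are illustrative assumptions:

```python
import hashlib


def start_hash(m: bytes):
    """hash-function'': hash the message while streaming, keep the state."""
    state = hashlib.sha256()
    state.update(m)
    return state


def finish_hash(state, sig_properties: bytes) -> bytes:
    """hash-function': add the signature packet data, then finalize."""
    final = state.copy()  # leave the caller's state reusable
    final.update(sig_properties)
    return final.digest()


# Same digest as hashing the concatenation in one go.
state = start_hash(b"message body\r\n")
digest = finish_hash(state, b"sig-packet-properties")
```

The receiver never has to go back to the message data: the state is
kept until the signature packet has been read.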

-- 
Thomas Roessler                     <roessler(_at_)does-not-exist(_dot_)org>