On Tue, 11 May 1999 hal@rain.org wrote:
These statements are only valid in the context of your implementation.
I have implemented it in a different way which did not require adding any
new layers. I simply added calls to the hash function in the decryption
module, as well as the 20 byte holdback buffer. As it happened I
already had a buffer in the decryption module so modifying it to hold
back 20 bytes was easy. There were then two extra functions added to
the decryption module, one to finalize and retrieve the hash value,
and one to retrieve the value from the holdback buffer (which holds the
final 20 bytes at the end of processing). These are then compared at
the end of the packet. It was not a difficult change.
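The holdback scheme described here can be sketched in a few lines. This is a sketch only, assuming SHA-1 and a 20-byte trailer as in the discussion; the class and method names are mine, not from any implementation in this thread:

```python
import hashlib

HASH_LEN = 20  # size of the trailing SHA-1 MDC field under discussion

class HoldbackChecker:
    """Stream plaintext through, always holding back the last HASH_LEN bytes.

    Everything released downstream is fed to a running SHA-1; at the end,
    the held-back tail is compared against that hash.
    """

    def __init__(self):
        self._hash = hashlib.sha1()
        self._tail = b""

    def feed(self, chunk: bytes) -> bytes:
        buf = self._tail + chunk
        released = buf[:-HASH_LEN]      # empty while the stream is still short
        self._tail = buf[-HASH_LEN:]
        self._hash.update(released)
        return released

    def finish(self) -> bool:
        # The stream must be at least HASH_LEN bytes long, and the tail
        # must equal the hash of everything released before it.
        return len(self._tail) == HASH_LEN and self._hash.digest() == self._tail
```

A real circular buffer, as mentioned below, would avoid the per-chunk reallocation; the slice form above just shows the logic.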
I would still call this a layer, even if it is inline.
Which is the most architecturally attractive way to do it? Which method
is the most logical and coherent? Which method would you use if you
were designing from scratch, rather than trying to shoehorn it into an
existing implementation so as to change it as little as possible?
Architectural attractiveness means different things when adding on to a
building vs. building a completely new one.
It is not more logical or coherent to avoid using an existing layer that
already does hashing and returns those hash bytes (at the cost of about a
dozen extra bytes hashed).
If you are asking which method should be used if redesigning EVERYTHING
from scratch, I would give a different answer, but then why bother calling
it OpenPGP, or otherwise associating it with OpenPGP? I could list about a
half dozen other things that should be done differently. Will you be open
to those suggestions, as long as the standard for OpenPGP 2.0 will be a
clean sheet of paper?
And I am not saying it must be the signature packet or nothing. My
problem is with the 20 unencapsulated or unindicated hash bytes at the
end. Encapsulating them addresses all my major complaints. But that
would mean another new packet type or subtype.
Altering one byte to add a definition to the signature packet is less of
a hack than holding back 20 bytes. Don't assume everyone is running on
big machines with an OS or API calls to handle this and CPU cycles to
spare.
Holding back 20 bytes does not take a big machine, or any OS or API calls.
It is a simple circular buffer and about half a page of code.
Neither does it take much to reserve 20 bytes at the beginning. I would
rather burden the single encryption operation than the many decryption
operations. You then don't need any of this extra stuff in the decryption
layer to find the EOF. And most things work better on block boundaries.
Nor does it take anything more to put a CTB terminal packet at the end of
the encryption stream and a 20 byte packet for the hash (oh, maybe in YOUR
implementation it would be hard). And at least a byte for the algorithm
number, even if it is fixed, simply to be consistent.
Let's not go down the path of putting bytes outside a packet. That would
violate all of our conventions.
An implicit convention that has held UNTIL NOW is that no unknown-length
variable data element is FOLLOWED by a fixed-length data element, except
perhaps when you do a signed compression packet with an old style CTB, but
even there you get the end-of-packet indication at the end of the
compression stream (i.e. the compressor does know where the compressed
data ends).
Another implicit convention is that everywhere things like algorithms are
used, there is no implicit value, but an identifier. Maybe we only want
to use SHA1, but it would follow the tradition if the 20 bytes were
prefixed by a version number and the algorithm identifier.
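Following that tradition, the trailer could carry a version byte and the algorithm identifier ahead of the digest. A hypothetical layout follows; the SHA-1 identifier of 2 matches OpenPGP's hash-algorithm registry, but everything else here is my own assumption:

```python
import hashlib
import struct

SHA1_ID = 2  # OpenPGP hash-algorithm identifier for SHA-1

def encapsulated_mdc(plaintext: bytes) -> bytes:
    """Hypothetical encapsulated MDC body: a version byte and the hash
    algorithm identifier, then the digest, rather than 20 naked bytes."""
    return struct.pack("BB", 1, SHA1_ID) + hashlib.sha1(plaintext).digest()
```

Two bytes of overhead buy the consistency argued for here: a reader can recognize the version and algorithm instead of assuming a fixed 20-byte tail.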
Defining two new packet types, a new encryption format and a new MDC
packet valid only when it follows, and is encrypted by, the former, isn't
that much harder.
I have pointed out that we can do it without bloating the code. But it
seems everyone wants to bloat it. Assuming PRZ isn't doing another
"hit-and-run", what does HE think of the alternatives?
Phil is out of the country now, but I have spoken at great length to him
about this over the past month. I assure you that Phil is THE strongest
supporter of integrating the MDC feature with a new encryption packet.
I cannot overemphasize how strongly he opposes doing it as a hack of
the signature packet.
I take it he is in a country without email? I don't care how
strongly one favors or opposes something if it is not on technical
grounds. And he has been capable of speaking for himself in the past.
Is he also completely and irrevocably opposed to having the hash as
anything except a naked 20 byte field at the end of the plaintext stream?
I showed earlier how inappropriate the signature packet is for this
purpose. I listed all the fields and showed how few of them made any
sense in this context. I can understand that this is the path of least
resistance for kludging this feature into existing code, but we would
prefer an approach which makes sense architecturally. Signature packets
are not appropriate for this purpose.
I will repeat that I am only calling it a signature packet because that is
the title of the packet I want to use, and I would move to change it to
"validity packet" or "verification packet" in the spec. The fact that it
has been given one specific name doesn't prevent both the function and the
name from being generalized.
Many of the fields make no sense in other contexts as well. But having
implemented it this is what we have inside:
4 - the version (which is needed)
X - the signature type (not needed)
A - sig algorithm, 0 for MDCs
a - hash algorithm, SHA1 (needed unless we never want anything else)
four zero bytes (the lengths of the empty hashed and unhashed subpacket areas)
Also, 0x04 0xff and four length bytes are hashed as a trailer. That
trailer is either redundant in the signature packet or useful, and
whichever it is would extend to this use as well.
I haven't checked the packet types to see if there is any potentially
useful information that someone might want to add (e.g. a timestamp or
comment for a file that wouldn't be part of the image if it were something
like an executable).
Then there are the two redundant bytes of the hash that still provide a
quick check, which is less useful if you don't have to do a signature
calculation. And the hash value itself, but encoded as an MPI, so the MPI
length word would be redundant.
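Put together, the fields listed here would serialize to something like the following. This is a sketch of the proposal under my own assumptions, not a published format, and the fixed 160-bit MPI length glosses over the leading-zero trimming a real MPI would do:

```python
import hashlib
import struct

def validity_packet_body(plaintext: bytes) -> bytes:
    """Sketch of the proposed 'validity packet': a v4 signature body with
    signature algorithm 0 meaning MDC.  Mirrors the field listing above."""
    digest = hashlib.sha1(plaintext).digest()
    body = struct.pack("BBBB", 4, 0x00, 0, 2)  # version 4, sig type (immaterial),
                                               # sig algorithm 0 (MDC), hash alg 2 (SHA-1)
    body += struct.pack(">HH", 0, 0)           # empty hashed/unhashed subpacket areas
    body += digest[:2]                         # the two quick-check bytes
    # Hash value as an MPI: 2-byte bit length, then the digest.  160 is
    # fixed here for the sketch rather than computed from leading zeros.
    body += struct.pack(">H", 160) + digest
    return body
```

That is 12 bytes of framing around the 20-byte digest, which is the redundancy being weighed here against a naked trailing field.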
All this vs. having a special case for the new encryption type that also
does the MDC: the special casing won't work with or apply to existing
encryption algorithms, and the hashing that is already there won't apply
(or will have to be done while generating a signature as well).
And, if you do create a signed message (where having a signature is
significant, for example a timestamp), why wouldn't the entire MDC
infrastructure be redundant? You are validating the decryption and then
revalidating it again when checking the signature. Shouldn't it be turned
off to avoid having this function that doesn't make sense in this context?
What added integrity does the MDC give to a signed message (and since we
are talking about eliminating redundancy and extra fields, I am assuming
the key is available, etc.)? Why is the MDC not redundant there, since
redundancy is the reason you are giving for not encapsulating it in the
existing validation structure? I think there are fewer total redundant
bytes.
And I have asked about existing algorithms. Whatever extra security an
MDC would provide for TwoFish should also apply to 3DES, IDEA, and CAST.
Is there a reason for NOT doing MDCs with the existing algorithms? It
might be useful to have an implementation that implements only the MUST
algorithms except to have the extra MDC function (insert all the reasons
we need an MDC and not merely an "anybody" signature key here). How does
your method address this?
Even having an MDC "packet" (those horrid extra header bytes and the
restart of the CTB packet length engine) appended to the plaintext packet
instead of being part of it would allow for this.
I am not arguing against an MDC packet, though I don't see as much problem
with the extra field bytes in the existing signature/validation packet.
They don't have to be in the existing signature packet (though I would
want things like version and algorithm bytes and length processing to make
them orthogonal with every other packet type). I am arguing against a
required trailing naked MDC field unique to a new encryption packet.
To clarify, it would be encrypt( literal(data), MDC ), MDC being the new
packet containing the hash of the literal. And encrypt( "mdc follows",
literal(data), MDC) would be even better.
Note this would also have use as a simple checksum, detecting only whether
the file was corrupted in transmission, for simple non-crypto validation.
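As a sketch, here is the proposed ordering and the check a reader would perform. The packet tags and framing are invented labels for illustration, not wire format:

```python
import hashlib

def build_plaintext(data: bytes) -> list:
    """encrypt("mdc follows", literal(data), MDC): the packet sequence
    that would sit inside the encryption layer."""
    return [("mdc-follows", b""),
            ("literal", data),
            ("mdc", hashlib.sha1(data).digest())]

def check_plaintext(packets) -> bool:
    """A reader that sees the "mdc follows" marker knows to expect a
    trailing MDC packet and can verify it after the literal data."""
    if [kind for kind, _ in packets] != ["mdc-follows", "literal", "mdc"]:
        return False
    (_, data), (_, digest) = packets[1], packets[2]
    return hashlib.sha1(data).digest() == digest
```

Dropping the encryption around it, the same check doubles as the simple corruption detection mentioned above.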
Conversely, extending the existing structure is easy enough to warrant
being a SHOULD.
As far as I know, there are three implementations that are affected:
your reference implementation, the Gnu Privacy Guard, and the commercial
PGP from Network Associates. If we can come to agreement then we can get
these changes into all of these implementations, and make it a SHOULD as
Werner suggested.
What about Geiger's OS/2 (mainly) implementation?
Does GPG have any plans on allowing MDCs with the other algorithms? Other
hash algorithms besides SHA1?
To restate my position, the MDC should be encapsulated in some kind of
packet consistent with the existing syntax and grammar. My preference is
for expanding the existing signature packet by adding a "zero" algorithm.
But I don't have any objections to a new MDC-only packet, especially if it
has an "MDC follows" prefix packet.
All I don't want is an unencapsulated fixed number of bytes implicitly at
the end of a packet.