ietf-822

RE: gzip-8bit

2003-02-26 20:47:25

All true and useful background.  But do you have a view on the subject?
;-)

          - dan
--
Dan Kohn <mailto:dan(_at_)dankohn(_dot_)com>
<http://www.dankohn.com/>  <tel:+1-650-327-2600>

-----Original Message-----
From: Adam M. Costello
[mailto:ietf-822(_dot_)amc+0(_at_)nicemice(_dot_)net(_dot_)RemoveThisWord] 
Sent: Wednesday, February 26, 2003 19:28
To: IETF RFC-822 list
Subject: Re: gzip-8bit



Dan Kohn <dan(_at_)dankohn(_dot_)com> wrote:

> Any view on whether it makes more sense to mandate gzip versus deflate
> combined with a Content-MD5?

As far as I can tell, gzip and zlib (both of which wrap deflate) would
interact with Content-MD5 in the same way.  In either case, the gzip or
zlib trailer includes a 32-bit check of the uncompressed data, so about
one in 4e9 (2^32) random corruptions will go undetected.  The Content-MD5
provides an independent 128-bit hash of the very same data.  Having both
is thus equivalent to a 160-bit hash of the uncompressed data, which
would let about one in 1e48 (2^160) random corruptions go undetected.
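
The trailers and the arithmetic above can be checked with a minimal
Python sketch.  It assumes the standard gzip and zlib modules and the
trailer layouts given in RFC 1952 (gzip) and RFC 1950 (zlib); the helper
names gzip_trailer and zlib_trailer are made up for illustration.

```python
import gzip
import struct
import zlib


def gzip_trailer(data: bytes):
    """Return (CRC32, ISIZE) stored in the last 8 bytes of a gzip stream."""
    stream = gzip.compress(data)
    # Both fields are little-endian 32-bit integers (RFC 1952).
    return struct.unpack("<II", stream[-8:])


def zlib_trailer(data: bytes) -> int:
    """Return the Adler-32 stored in the last 4 bytes of a zlib stream."""
    stream = zlib.compress(data)
    # The checksum is a big-endian 32-bit integer (RFC 1950).
    return struct.unpack(">I", stream[-4:])[0]


data = b"example message body\r\n" * 10

crc_stored, isize = gzip_trailer(data)
print(crc_stored == zlib.crc32(data))     # checksum covers the uncompressed data
print(isize == len(data) % 2**32)         # ISIZE is the uncompressed length
print(zlib_trailer(data) == zlib.adler32(data))

# Number of distinct check values: 32 bits alone, and 32 + 128 bits combined.
print(f"{2**32:.1e}")          # about 4e9
print(f"{2**(32 + 128):.1e}")  # about 1e48
```

The combined figure holds only to the extent that the two checks fail
independently, which is the assumption made in the paragraph above.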

The 32-bit hashes in gzip and zlib (CRC32 for gzip, Adler32 for zlib)
are not cryptographic; they are designed to catch random corruption,
not malicious corruption.  MD5 is a cryptographic hash, designed to
catch malicious corruption, but that only works if you can protect the
MD5 hash itself from malicious corruption, which is not true for the
Content-MD5 header field.  A cryptographic hash is no better for this
purpose than a non-cryptographic hash.  I assume MD5 was chosen not
because it is a cryptographic hash, but because it was the most widely
implemented greater-than-32-bit hash at the time.
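
A small sketch of why an unprotected Content-MD5 does not stop an
attacker.  It assumes the header value is the base64 encoding of the MD5
digest of the body, as in RFC 1864; the function names content_md5 and
verify are hypothetical.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64 of the 128-bit MD5 digest, the form used in Content-MD5."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def verify(body: bytes, header_value: str) -> bool:
    """Receiver-side check: recompute the digest and compare."""
    return content_md5(body) == header_value


body = b"Pay 10 dollars to Alice.\r\n"
header = content_md5(body)

# Random corruption touches the body but not the header: detected.
corrupted = b"Pay 10 dollars to Alyce.\r\n"
print(verify(corrupted, header))           # False

# A malicious party rewrites the body AND recomputes the header: accepted.
forged_body = b"Pay 9999 dollars to Mallory.\r\n"
forged_header = content_md5(forged_body)
print(verify(forged_body, forged_header))  # True
```

Nothing in the forged case depends on any weakness of MD5 itself; any
hash carried in an unprotected header field would behave the same way.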

CRC was designed to be fast and simple in hardware.  It can be made
reasonably fast in software using lookup tables.  Adler32 was designed
to be fast and simple in software (it's about 4 times faster).
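
For comparison, here is a rough pure-Python sketch of both checksums: a
table-driven CRC32 and a straightforward Adler32.  It is meant to show
the structure of each algorithm, not to measure speed, and it is not the
code used inside gzip or zlib.  The function names are made up for
illustration; the results are compared against the standard zlib module.

```python
import zlib


def make_crc_table():
    """Precompute the 256-entry table for the reflected CRC-32 polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_crc_table()


def crc32_table(data: bytes) -> int:
    """One table lookup, one shift and two XORs per input byte."""
    c = 0xFFFFFFFF
    for byte in data:
        c = CRC_TABLE[(c ^ byte) & 0xFF] ^ (c >> 8)
    return c ^ 0xFFFFFFFF


def adler32_simple(data: bytes) -> int:
    """Two running sums modulo 65521, with no table at all."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % 65521
        b = (b + a) % 65521
    return (b << 16) | a


msg = b"123456789"
print(crc32_table(msg) == zlib.crc32(msg))
print(adler32_simple(msg) == zlib.adler32(msg))
```

Real implementations of Adler32 typically defer the modulo operation
over many bytes, which is part of what makes it cheap in software.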

AMC

