ietf-822

plain-text checksums considered useless

1991-10-28 17:13:29
...How meaningful are checksums (other than advisory), for plain-text?

Even advisory, they are not particularly meaningful.  We experimented with
a checksum header in C News, but eventually decided to drop it.  There are
formidable problems in making it meaningful in an environment full of mail
systems and gateways.  The trouble is that gateways often make "harmless"
changes, like deleting trailing blanks, that break a checksum.  Although
these changes can and do cause trouble for exchanging sources, binaries,
etc., they are often harmless for plain text.  The result is a fairly high
incidence of bad checksums on more-or-less intact text.  Given this, we
could not think of anything useful to *do* with checksums on receipt.
Warnings to the user would quickly be classed as noise and ignored, because
they *are* noise most of the time.  Dropping the message is unacceptable,
because it would happen too often -- mail doesn't even have the redundant
transmission algorithm that news does.  Devising a checksum algorithm
that is invariant over all the gateway-induced transformations is very
difficult, perhaps impossible.
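
To make the failure mode concrete, here is a minimal sketch in Python.
The two "gateways" are hypothetical stand-ins (one deletes trailing
blanks, one expands tabs), and CRC-32 is used only as a convenient
example checksum; it is not the algorithm that was tried in C News.
The sketch shows a naive checksum breaking on a harmless change, and a
checksum canonicalized against one transformation still breaking on
another.

```python
import zlib

def crc(text: str) -> int:
    """Plain CRC-32 over the exact bytes of the message body."""
    return zlib.crc32(text.encode("ascii"))

def strip_trailing_blanks(text: str) -> str:
    """Hypothetical gateway: delete trailing spaces and tabs on each line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))

def expand_tabs(text: str) -> str:
    """A second hypothetical gateway: expand tabs to 8-column stops."""
    return text.expandtabs(8)

def canonical_crc(text: str) -> int:
    """Checksum made invariant to ONE transformation (trailing blanks)."""
    return crc(strip_trailing_blanks(text))

original = "name\tvalue  \nA line of plain text.   \n"
stripped = strip_trailing_blanks(original)
expanded = expand_tabs(original)

# Does each checksum still match after each gateway has touched the text?
naive_survives_strip = crc(original) == crc(stripped)
canonical_survives_strip = canonical_crc(original) == canonical_crc(stripped)
canonical_survives_tabs = canonical_crc(original) == canonical_crc(expanded)

print(naive_survives_strip, canonical_survives_strip, canonical_survives_tabs)
```

Each new kind of gateway damage needs its own canonicalization rule,
and the list of such transformations is open-ended.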

Checksumming something is useful only if you expect it to be transmitted
bit-for-bit intact, and will get upset if *anything* changes.  In today's
mail environment, this is fine for things that are encoded in some way
(e.g. base-64) to survive mailing, but not for unprotected plain text.
You cannot expect bit-for-bit transmission, and the changes that do get
made usually do not alter the semantic content.  Checksums should be part
of encoding schemes, not a new general-purpose mail header.
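
As a rough illustration of a checksum living inside the encoding, the
Python sketch below frames a payload as a CRC-32 line followed by
base64 text. The framing, the function names, and the gateway are all
assumptions made for the example, not any existing mail standard. The
point is that the checksum covers the decoded bytes, so a gateway that
deletes trailing blanks from the encoded text does no harm, even when
the payload itself ends its lines with blanks.

```python
import base64
import zlib

def encode_with_checksum(payload: bytes) -> str:
    """Hypothetical framing: hex CRC-32 on line one, base64 text after it."""
    body = base64.encodebytes(payload).decode("ascii")
    return f"{zlib.crc32(payload):08x}\n{body}"

def decode_and_verify(message: str) -> bytes:
    """Decode the base64 body and check it against the CRC-32 line."""
    crc_line, _, body = message.partition("\n")
    # b64decode discards whitespace and other non-alphabet characters
    # by default, so line breaks and stray blanks do not matter.
    payload = base64.b64decode(body)
    if zlib.crc32(payload) != int(crc_line.strip(), 16):
        raise ValueError("checksum mismatch")
    return payload

def strip_trailing_blanks(text: str) -> str:
    """Hypothetical gateway: delete trailing spaces and tabs on each line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))

# The payload has trailing blanks that a gateway would destroy in plain text.
payload = b"line with trailing blanks   \nsecond line\t\n"
relayed = strip_trailing_blanks(encode_with_checksum(payload))
recovered = decode_and_verify(relayed)
print(recovered == payload)
```

Here a mismatch really does mean damage, so rejecting the message on a
bad checksum becomes a reasonable thing to do.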

                                         Henry Spencer at U of Toronto Zoology
                                          
henry(_at_)zoo(_dot_)toronto(_dot_)edu   utzoo!henry
