I have been watching the messages the past few days and noting references to
checksums and I would like to try to create a summary of what I believe to
be the important elements. This, of course, represents my own opinion.
In the comments below, I state the service objective and suggest a simple
scheme for the inclusion of a checksum.
1. I mentioned this at the end of one of my earlier notes but it is
important enough that I will mention it first this time. The objective
is to support a data integrity service in 822. I believe this is an
important and useful service, independent of support for a similar
service that may or may not appear in SMTP (see below for more on this).
One mechanism by which one realizes a data integrity service is a
checksum. However, it has also been suggested that MD4 or MD5 could be
used. These algorithms are not checksums; they are hash algorithms.
Thus, we should not be talking about checksums, but about a message
integrity check (or MIC). This is the term that generically refers to
both kinds of algorithms.
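
To illustrate the distinction between a checksum and a hash, here is a
minimal Python sketch. The additive checksum is a hypothetical toy chosen
only for illustration, not an algorithm anyone has proposed here; MD5 comes
from Python's standard hashlib module.

```python
import hashlib

def additive_checksum(data: bytes) -> int:
    """Toy 16-bit checksum (an assumption for illustration): byte sum mod 65536."""
    return sum(data) % 65536

def md5_mic(data: bytes) -> str:
    """MD5 hash of the data, returned as a hex string."""
    return hashlib.md5(data).hexdigest()

original = b"pay 12 dollars"
reordered = b"pay 21 dollars"  # same bytes, two of them swapped

# A plain byte sum is blind to reordering, so both messages share a checksum.
same_checksum = additive_checksum(original) == additive_checksum(reordered)

# The hash depends on byte order as well as byte values, so it differs.
same_hash = md5_mic(original) == md5_mic(reordered)
```

Both kinds of function yield a value that could be carried as a MIC; they
differ in which alterations they can detect.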
2. I would not recommend depending on a MIC service in SMTP.
a. Although the specification may exist in the short term, the deployment
of this service can never compare to the deployment of an 822
extensions user agent.
b. SMTP is point-to-point and limited in its scope. 822 mail extends
beyond the Internet, that is, beyond SMTP. Providing the service in
822 guarantees an end-to-end service, which would not be possible in
SMTP.
c. There is potential for a non-trivial amount of processing to occur
between the receipt of a message by a user agent and its receipt by
the local MTA. While one normally does not expect that a message will
be altered in the local environment, random disk errors are not
unheard of. Supporting the service in 822, more precisely in the user
agent, provides some additional assurance at a relatively low cost.
3. A MIC calculation is completely independent of the content-transfer
encoding and the content-type. The service objective is to ensure that the
content-type received is the content-type that was sent.
In part, the reason the content-transfer-encoding exists is to support
this service. An originator may know in advance to choose a lowest
common denominator representation (e.g., base64) of the message to ensure
the integrity of the message, or a gateway can alter a message's
representation based on its knowledge of the capabilities of the
A MIC can be calculated on any content-type and verified for any
content-type. A *critical* issue is whether or not the content-type can
be represented in the same form in both the originator's and recipient's
environment. This issue exists irrespective of the existence of a data
integrity service. An originator always presumes a recipient can
"receive" the message being sent. Thus, compute the MIC on the message
in its native form and send it to the recipient. The recipient can
verify the MIC and then decide what to do with the message.
I agree that text is a special case, since the conversion from ASCII to
EBCDIC and back is essentially broken in the general case. I sense there
is not much concern about this issue in particular. Rather, folks are
more concerned with a MIC on content-types other than TEXT. With this in
mind, I suggest we do not resolve this issue and note that PEM solves
4. A good question to ask is when the MIC should be calculated. I suggest
that during origination the calculation be done after any processing
associated with the content-type header and just prior to the
content-transfer-encoding processing. Upon receipt the reverse applies:
after the content-transfer-encoding processing but prior to the
content-type processing.
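
The ordering suggested in item 4 can be sketched in Python as follows. This
is only an illustration under assumptions: MD5 and base64 are picked as
examples, and the function names originate and receive are hypothetical.

```python
import base64
import hashlib

def originate(content: bytes) -> tuple:
    """Originator side: MIC first, transfer encoding second."""
    mic = hashlib.md5(content).hexdigest()   # MIC over the native form
    encoded = base64.encodebytes(content)    # content-transfer-encoding
    return mic, encoded

def receive(mic: str, encoded: bytes) -> bytes:
    """Recipient side: undo the transfer encoding, then verify the MIC."""
    content = base64.decodebytes(encoded)    # reverse the transfer encoding
    if hashlib.md5(content).hexdigest() != mic:
        raise ValueError("MIC mismatch: content was altered in transit")
    return content                           # now safe to hand to content-type processing

mic, encoded = originate(b"native message body\n")
body = receive(mic, encoded)
```

Because the MIC covers the native form, a gateway could replace base64 with
another transfer encoding without invalidating it, provided the decoded
bytes are unchanged.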
5. Where should the MIC appear? I believe there are two choices: as an
attribute of the content-type header or as a separate header, e.g.,
Content-MIC-Value. Keep in mind there are two critical pieces of
information: the MIC algorithm used and the MIC value. You will probably
want to recommend one algorithm, for interoperability reasons, but you
need not register algorithm object identifiers. This is already done by
the revision to RFC 1115, which in fact references all the definitive
specifications for the algorithms. There are also other places you can
point to for algorithm IDs.
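
As a sketch of the separate-header choice, the Python below carries both
pieces of information in one line. The "algorithm; value" syntax, the hex
encoding of the value, and all function names are my own assumptions for
illustration; nothing here is a specified format.

```python
import hashlib

HEADER_NAME = "Content-MIC-Value"  # header name taken from the example above

def format_mic_header(algorithm: str, content: bytes) -> str:
    """Build a header line in the hypothetical form 'name: algorithm; hexvalue'."""
    digest = hashlib.new(algorithm, content).hexdigest()
    return f"{HEADER_NAME}: {algorithm}; {digest}"

def parse_mic_header(line: str) -> tuple:
    """Split a header line into (algorithm, value)."""
    name, _, body = line.partition(":")
    if name.strip().lower() != HEADER_NAME.lower():
        raise ValueError("not a Content-MIC-Value header")
    algorithm, _, value = body.partition(";")
    return algorithm.strip().lower(), value.strip()

def verify(line: str, content: bytes) -> bool:
    """Recompute the MIC with the named algorithm and compare with the header."""
    algorithm, value = parse_mic_header(line)
    return hashlib.new(algorithm, content).hexdigest() == value

header = format_mic_header("md5", b"hello")
```

Naming the algorithm next to the value lets a recipient verify without
guessing, while a recommended default still serves interoperability.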