ietf-mailsig

Content-Digest: Digesting raw vs MIME encoded data

2005-07-17 23:32:49


Content-Digest represents the digest of the raw entity data,
before any content-transfer-encoding (CTE) is applied.  For
verification, CTE decoding is done first.

My question: Should the digest be computed on the CTE encoded form
instead?
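
To make the two options concrete, here is a minimal sketch in Python.
It assumes SHA-1 as the digest algorithm and uses a made-up sample
body; neither is taken from the Content-Digest draft.  It shows that
the digest of the raw data is independent of the CTE chosen, while
the digest of the encoded form changes with the encoding.

```python
import base64
import hashlib
import quopri


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical raw entity data (UTF-8 text with one non-ASCII character).
raw = "café\n".encode("utf-8")

# The same data under two different content-transfer-encodings.
as_base64 = base64.encodebytes(raw)
as_qp = quopri.encodestring(raw)

# Option 1: digest the raw data, before any CTE is applied.
digest_of_raw = sha1_hex(raw)

# Option 2: digest the CTE encoded form.
digest_of_base64 = sha1_hex(as_base64)
digest_of_qp = sha1_hex(as_qp)

print("raw:   ", digest_of_raw)
print("base64:", digest_of_base64)
print("qp:    ", digest_of_qp)
```

All three digests differ, so signer and verifier must agree on which
form is being digested.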

Why do I ask?  Multiple reasons:

* DKIM does it over the MIME encoded form.  This has the advantage
  of being more efficient, since digest computation and verification
  do not have to be MIME-aware.  Basically, the message is treated
  in the RFC-2822 domain and not the MIME domain.

* OpenPGP (RFC-3156) does signing over the MIME encoded entity
  (see Section 5), and not the original raw form.

* S/MIME (RFC-2633) does signing over the MIME encoded entity
  (see Section 3), and not the original raw form.

* For EDigest usage, MIME-aware CTE decoding must be done.  Will
  this be an extra burden for MTAs to deal with?
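
The difference in verifier work can be sketched as follows.  This is
an illustration only: the sample message, the function names, and the
choice of SHA-1 are my assumptions, and the draft's actual
canonicalization rules are not modeled.

```python
import base64
import hashlib
from email import message_from_bytes

# Hypothetical single-part message as it might appear on the wire.
RAW = b"Hello, world\n"
ENCODED_BODY = base64.b64encode(RAW) + b"\r\n"
WIRE = (
    b"Content-Type: text/plain\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
) + ENCODED_BODY


def digest_mime_aware(wire: bytes) -> str:
    """Parse the entity, undo the CTE, then digest the raw data."""
    msg = message_from_bytes(wire)
    raw = msg.get_payload(decode=True)  # applies base64/QP decoding
    return hashlib.sha1(raw).hexdigest()


def digest_mime_unaware(wire: bytes) -> str:
    """Digest the body bytes as transmitted; no MIME parsing needed."""
    _headers, _sep, body = wire.partition(b"\r\n\r\n")
    return hashlib.sha1(body).hexdigest()


print("MIME-aware:  ", digest_mime_aware(WIRE))
print("MIME-unaware:", digest_mime_unaware(WIRE))
```

The second function only has to find the blank line that ends the
header, which is the RFC-2822 level of processing mentioned for DKIM.
The first has to understand Content-Transfer-Encoding, and for
multipart messages would also have to walk the MIME structure.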

Therefore, what is the real meaning of Content-Digest?  Is it the
digest of a MIME entity or the digest of data that is subsequently
translated into a MIME entity?

If it is the latter, an extension to Content-Disposition seems more
appropriate.  C-D already provides raw data specific attributes,
like filename and creation date.  Another parameter, "digest", could
be added to represent the digest of the raw format of the data.

  Content-Disposition: inline; digest="sha1:A233sdf..."
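
A rough sketch of how such a parameter could be written and read
with Python's email package.  The "digest" parameter is only the
proposal above and is not a registered Content-Disposition parameter;
the helper names and the "algorithm:hex" value layout are assumptions
made for illustration.

```python
import hashlib
from email.message import Message


def attach_digest(part: Message, raw: bytes) -> None:
    """Add a hypothetical digest parameter covering the raw data."""
    value = "sha1:" + hashlib.sha1(raw).hexdigest()
    part.add_header("Content-Disposition", "inline", digest=value)


def read_digest(part: Message):
    """Return (algorithm, hex digest), or None if no digest is present."""
    param = part.get_param("digest", header="content-disposition")
    if param is None:
        return None
    algorithm, _sep, hexdigest = param.partition(":")
    return algorithm, hexdigest


part = Message()
attach_digest(part, b"Hello, world\n")
print(part["Content-Disposition"])
print(read_digest(part))
```

Because the value describes the raw data, it stays valid if a relay
changes the content-transfer-encoding of the part.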

Since Content-Digest is designed to include header fields, this implies
that it is protecting the email "package" and not what it contains.
I.e., it protects the MIME representation of a piece of data rather
than the raw data itself.

Otherwise, if a body part should be CTE decoded first (for digest
verification), then should non-ASCII encoded words in header fields
be decoded first?  Since we are digesting the raw body data, why not
the raw header field data?
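
To illustrate the header side of the question, here is a sketch that
digests a header field value both as transmitted and after
encoded-word decoding.  The sample subjects, the function names, and
the choice of UTF-8 as the form to digest after decoding are all
assumptions for illustration.

```python
import hashlib
from email.header import decode_header, make_header


def digest_header_as_sent(value: str) -> str:
    """Digest the header field value exactly as transmitted."""
    return hashlib.sha1(value.encode("ascii")).hexdigest()


def digest_header_decoded(value: str) -> str:
    """Decode encoded words first, then digest the text as UTF-8."""
    decoded = str(make_header(decode_header(value)))
    return hashlib.sha1(decoded.encode("utf-8")).hexdigest()


# The same subject text under two different encoded-word forms.
subject_q = "=?iso-8859-1?q?caf=E9?="
subject_b = "=?utf-8?b?Y2Fmw6k=?="

for subject in (subject_q, subject_b):
    print(subject)
    print("  as sent:", digest_header_as_sent(subject))
    print("  decoded:", digest_header_decoded(subject))
```

The decoded digests match while the as-sent digests differ, which is
the same raw versus encoded split as for the body.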

--ewh

