Content-Digest represents the digest of the raw entity data,
before any content-transfer-encoding. For verifications, CTE
decoding is done first.
My question: Should the digest be computed on the CTE encoded form
instead?
Why do I ask? Multiple reasons:
* DKIM computes it over the MIME encoded form. This has the advantage
of being more efficient, since digest computation and verification
do not have to be MIME-aware. Basically, the message is treated
in the RFC-2822 domain and not the MIME domain.
* OpenPGP (RFC-3156) does signing over the MIME encoded entity
(see Section 5), and not the original raw form.
* S/MIME (RFC-2633) does signing over the MIME encoded entity
(see Section 3), and not the original raw form.
* For EDigest usage, MIME-aware CTE decoding must be done. Will
this be an extra burden for MTAs to deal with?
Therefore, what is the real meaning of Content-Digest? Is it the
digest of a MIME entity or the digest of data that is subsequently
translated into a MIME entity?
If it is the latter, an extension to Content-Disposition seems more
appropriate. C-D already provides raw data specific attributes,
like filename and creation date. Another parameter, "digest", could
be added for representing the digest of the raw format of the data.
Content-Disposition: inline; digest="sha1:A233sdf..."
Since Content-Digest is designed to include header fields, this implies
that it is protecting the email "package" and not what it contains.
I.e., it protects the MIME representation of a piece of data rather
than the raw data itself.
Otherwise, if a body part should be CTE decoded first (for digest
verification), then should non-ASCII encoded words in header fields
be decoded first? Since we are digesting the raw body data, why not
the raw header field data?
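The header field case can be sketched the same way. The subject text and
variable names below are made up for illustration. The two encoded words
carry the same text, one in "Q" and one in "B" encoding:

```python
import base64
import hashlib
from email.header import decode_header, make_header

text = "caf\u00e9"  # hypothetical non-ASCII subject text

# The same text as two different RFC-2047 encoded words.
q_form = "=?utf-8?q?caf=C3=A9?="
b_form = "=?utf-8?b?" + base64.b64encode(text.encode("utf-8")).decode("ascii") + "?="

def decoded_value(field_value: str) -> str:
    """Undo the encoded-word encoding of a header field value."""
    return str(make_header(decode_header(field_value)))

def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

# Digest over the field value as transmitted: the two forms differ.
encoded_match = sha1_hex(q_form.encode("ascii")) == sha1_hex(b_form.encode("ascii"))

# Digest over the decoded field value: the two forms agree.
decoded_match = (
    sha1_hex(decoded_value(q_form).encode("utf-8"))
    == sha1_hex(decoded_value(b_form).encode("utf-8"))
)
```

Here encoded_match is false and decoded_match is true, which is the same
situation as with the body. Note that digesting the decoded form also
forces a choice of charset for the bytes being hashed (UTF-8 here, as an
assumption on my part).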
--ewh