Content-Digest & Edigest fields draft

Speaking of drafts submitted within the last few days, I finished the
first version of the draft for the new digest header fields and
submitted it yesterday:

 http://www.metasignatures.org/draft-leibzon-content-digest-edigest-00.html
 http://www.metasignatures.org/draft-leibzon-content-digest-edigest-00.txt

Note: I attempted before to make this post with the draft included as
      an attachment, but I don't think that worked. Sorry if you
      receive another copy of this post later...

There have also been a number of changes from the description I gave
before about EDigest. The biggest is that there are now two header
fields: Content-Digest and EDigest. Content-Digest can be used as a
replacement for MD5-Digest with similar rules, i.e. it should be added
only by the originator of the transmission and not by intermediate
agents, and there cannot be more than one Content-Digest in the MIME or
message header. Its syntax is the same as EDigest but without the "u"
tag/parameter, so it only provides a digest hash for the content in
whose header it appears.


The syntax for Content-Digest & EDigest has changed slightly from the
syntax I last described for EDigest:
 1. The syntax and description are now similar to MIME fields, and the
    same parser can be used to extract parameters and work with the
    header field. That means that ";" is now the separator between
    different tags (which are now called parameters, as per MIME
    convention). The full ABNF syntax is included in the draft.
 2. The canonicalization parameter "c" has been extended to provide
    information separately about header field canonicalization and
    about content body data canonicalization. The full specification
    is now "c=simple,mimeform" (this is the default value). There have
    also been changes in canonicalization method names, and how to do
    canonicalization is now described in quite a bit of detail (too
    much probably; it takes a disproportionate amount of space in the
    draft).
 3. Parameter indicating number of bytes in canonicalized content is now
    called 's' meaning "size".
 4. The cryptographic algorithm 'a' parameter is now optional and by
    default it is assumed to be "sha1". That means there are now
    basically only two required parameters, "v" and "d", making simple
    cases of using Content-Digest and EDigest very easy and compact.
 5. The time of creation and unique id 't' parameter now uses a number
    in ISO 8601 format (instead of Unix seconds), like t=20050704142754.
 6. The URLs in 'u' are now listed in the same way as in the References
    header in email, i.e. they are enclosed in "<..>" and separated by
    FWS. If a URI is not specified, it is taken to be "cid:" by default.
 7. 'e' (encoding) has been dropped from the spec. As with MD5-Digest,
    the data used should be what exists before transfer-encoding is
    applied.

    Note: I'm still thinking about this. It seems more correct
    standards-wise to require decoding the transfer encoding before
    creating the hash of the content, but for creating a digest hash at
    intermediate servers (EDigest) that is not convenient. This
    probably only makes a difference for quoted-printable, though.
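
To make the parameter list above concrete, here is a rough sketch of
how a field body might be split into parameters with the defaults
applied. It is only an illustration under assumptions, not the ABNF
from the draft: the function name parse_digest_field, the sample value
"v=1", the sample digest and the sample cid values are all made up, 't'
is assumed to be exactly the 14-digit form shown in item 5, and a bare
split on ";" ignores quoted strings, which a real MIME parameter parser
would have to handle.

```python
import re
from datetime import datetime

# Defaults taken from the post: 'a' defaults to "sha1", 'c' to "simple,mimeform".
DEFAULTS = {"a": "sha1", "c": "simple,mimeform"}


def parse_digest_field(value):
    """Split a Content-Digest / EDigest field body into a dict (hypothetical helper)."""
    params = dict(DEFAULTS)
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first "=" only, so padding in the digest value survives.
        name, sep, val = part.partition("=")
        if not sep:
            raise ValueError("parameter without '=': %r" % part)
        params[name.strip().lower()] = val.strip()

    # Only "v" and "d" are required according to the post.
    for required in ("v", "d"):
        if required not in params:
            raise ValueError("missing required parameter %r" % required)

    # 'c' names the header method first and the body method second.
    # Falling back to "mimeform" when the second part is absent is an assumption.
    header_c, _, body_c = params["c"].partition(",")
    params["c"] = (header_c.strip(), body_c.strip() or "mimeform")

    # 'u' lists URIs enclosed in <...> and separated by folding whitespace.
    if "u" in params:
        params["u"] = re.findall(r"<([^<>]*)>", params["u"])

    # 't' is assumed here to be YYYYMMDDhhmmss, as in t=20050704142754.
    if "t" in params:
        params["t"] = datetime.strptime(params["t"], "%Y%m%d%H%M%S")
    return params
```

For example, parse_digest_field("v=1; d=2jmj7l5rSw0yVb/vlWAYkK/YBwk=")
returns a dict in which 'a' is "sha1" and 'c' is ("simple", "mimeform")
even though neither parameter was given.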

Now regarding canonicalization: as mentioned, the methods of doing it
have been specified in detail and are now different for header and for
body. Both header and body now support the 'bare' canonicalization
method, which is somewhat similar to "simple" in DK and basically means
take the data as is (i.e. no canonicalization; previously I called this
"all").

The default canonicalization method for header fields is now 'simple',
which requires that all multi-line header fields be wrapped back into a
single line (and properly terminated with CRLF), that repeated
whitespace characters be changed into a single one, and that the header
field name be changed to lowercase.

"Nofws" canonicalization for header fields now means that all
non-printable characters are removed altogether (i.e. only characters
with an ASCII code between 33 and 126 remain).


For body data, the new 'text' canonicalization is somewhat like 'simple'
for the header and requires that all text lines be properly terminated
with CRLF and that multiple whitespace characters be replaced with a
single one. 'nofws' is also available for the body and requires removal
of all special characters and of CRLF or any other line terminators.
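
The header 'simple', header 'nofws' and body 'text' rules described
above could be sketched roughly as follows. This is my reading of the
prose in this post, not the normative text of the draft; the function
names are invented for illustration, "whitespace" is assumed to mean
space and tab, and details the post does not state (such as trailing
whitespace handling) are left alone. Body 'nofws' is not shown because
the post does not say which characters count as "special".

```python
import re


def canon_header_simple(raw_field):
    """Header 'simple': unfold, collapse whitespace runs, lowercase the name."""
    name, sep, body = raw_field.partition(":")
    if not sep:
        raise ValueError("not a header field: %r" % raw_field)
    # Wrap the folded lines back into one line by dropping line breaks.
    body = re.sub(r"[\r\n]+", "", body)
    # Change each run of spaces/tabs into a single space.
    body = re.sub(r"[ \t]+", " ", body)
    # Lowercase the field name and terminate with exactly one CRLF.
    return name.lower() + ":" + body + "\r\n"


def canon_header_nofws(raw_field):
    """Header 'nofws': keep only characters with ASCII codes 33 through 126."""
    return "".join(ch for ch in raw_field if 33 <= ord(ch) <= 126)


def canon_body_text(body):
    """Body 'text': CRLF-terminate every line, collapse whitespace runs."""
    lines = re.split(r"\r\n|\r|\n", body)
    # A body that already ends with a line terminator yields an empty last item.
    if lines and lines[-1] == "":
        lines.pop()
    return "".join(re.sub(r"[ \t]+", " ", line) + "\r\n" for line in lines)
```

With the folded field "Subject:  Hello\r\n\t world\r\n" as input,
canon_header_simple gives "subject: Hello world\r\n" and
canon_header_nofws gives "Subject:Helloworld".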

The default for the body data is now the special canonicalization name
'mimeform', which is not a canonicalization method itself but defaults
to 'text' for "Content-Type: text/*" and to 'bare' for all other
content.
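
The way 'mimeform' resolves to an actual method can be illustrated with
a small sketch. The function name resolve_body_canon is made up, and
the caller is assumed to pass in the Content-Type value of the part
being hashed; what to do when that header is absent is not covered
here.

```python
def resolve_body_canon(name, content_type):
    """Map the body canonicalization name to the method actually applied."""
    name = name.lower()
    if name != "mimeform":
        # Any explicitly named method is used as given.
        return name
    # Ignore Content-Type parameters such as charset when testing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return "text" if media_type.startswith("text/") else "bare"
```

So resolve_body_canon("mimeform", "text/plain; charset=us-ascii") gives
"text", while resolve_body_canon("mimeform", "image/png") gives "bare".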

I forgot to add an Acknowledgments section (hardly the only thing I
forgot; security considerations are missing too, at the minimum), but
several people sent me comments on the 0.18 and 0.21 EDigest and that
is appreciated. In particular I'd like to thank Earl Hood for his
comments, which helped quite a bit in the production of the draft. More
comments are obviously welcome, both in public and in private.


-----
William Leibzon, Elan Networks:
 mailto: william(_at_)elan(_dot_)net
Anti-Spam and Email Security Research Worksite:
 http://www.elan.net/~william/emailsecurity/
