On Mon, 18 Dec 2006 16:02:55 -0000, Tony Hansen <tony(_at_)att(_dot_)com> wrote:
Making the entirely reasonable assumption that "body" means exactly what
RFC 2822 defines it to mean, then here is what gets hashed in all of
those cases:
1) ordinary message with <body> of 1 non-empty line:
---------------------
barbazCRLF
---------------------
2) <body> consisting of 2 empty lines
---------------------
CRLF
---------------------
3) <body> consisting of 1 empty line
---------------------
CRLF
---------------------
4) <body> containing no lines
---------------------
CRLF
---------------------
5) message with absent <body>
---------------------
---------------------
I contend that the current wording in base-07 also requires that example
5 canonicalize into a
---------------------
CRLF
---------------------
Well that's a philosophical issue :-( .
The present text instructs you to do various things with "the body". If
there is no body, then you can't do those things. So there is nothing to
canonicalize and nothing to hash. The only sensible thing to do in
that case is to hash nothing. So I seem to disagree with you :-( .
But what we should actually do when this is sorted out should, I think,
be that which causes the "least astonishment". So one would hardly
expect the canonicalization to finish up with more lines than you started
with.
The easiest rule for everyone to understand is an invariant such as:
"After canonicalization, there will NEVER be an empty line at the end
of what remains to be hashed."
Even when the body doesn't exist, it still must be treated as having 0
lines following, which still canonicalize to a CRLF.
But even with my contention on case #5, I don't disagree with your
conclusions here:
I firmly believe that we *intended* to canonicalize each of these cases
into the empty body
---------------------
---------------------
Which is indeed the "least astonishing".
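To make the proposed invariant concrete (no empty line at the end of what
remains to be hashed), here is a rough sketch applied to cases 1 to 5. The
function name is invented for this illustration, the body is assumed to be
handed over as raw bytes, and none of this is the wording of any draft.

```python
CRLF = b"\r\n"

def strip_trailing_empty_lines(body: bytes) -> bytes:
    """Hypothetical helper: drop every empty line at the end of the body."""
    # Remove one trailing CRLF for as long as the body ends in two of them.
    while body.endswith(CRLF + CRLF):
        body = body[:-len(CRLF)]
    # A body that is nothing but a single empty line becomes empty.
    if body == CRLF:
        return b""
    return body

cases = {
    "1) one non-empty line": b"barbaz\r\n",
    "2) two empty lines": b"\r\n\r\n",
    "3) one empty line": b"\r\n",
    "4/5) no lines, or no body": b"",
}

for name, body in cases.items():
    print(f"{name:28} -> {strip_trailing_empty_lines(body)!r}")
```

Under this reading, cases 2 to 5 all leave nothing to hash, and case 1 is
unchanged.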
PS. For completeness, the only missing cases, after taking into
consideration RFC 3030 and MIME, are as follows. *These* are the reason
that the 0*CRLF rule was added and where it needs to be applied:
6) ordinary message with <body> of >1 non-empty line, not ending in CRLF
Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
somethingCRLF
anything
---------------------
7) ordinary message with <body> of 1 non-empty line, not ending in CRLF
Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
anything
---------------------
I think those cases are covered by the rule that tells you to add a CRLF
if there is no trailing CRLF anyway. But weird things can happen when you
have binary. Here are some more cases for you all to chew over:
8) Binary text with assorted line endings:
Content-Type: binary
Last-Header: foobarCRLF
CRLF
---------------------
anythingLF
somethingCRLF
nothingCR
---------------------
So do I fix that last line by adding CRLF, or by adding just LF, or by
leaving it alone?
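The three possibilities for case 8 can be written out byte for byte. This
is only a sketch: append_crlf_if_missing is a made-up name for a naive
reading of the "add a CRLF if there is none" rule, and it says nothing
about which outcome a specification ought to require.

```python
CRLF = b"\r\n"

# Case 8 body as listed: LF, CRLF and a bare CR used as line endings.
case8 = b"anything\nsomething\r\nnothing\r"

def append_crlf_if_missing(body: bytes) -> bytes:
    """Hypothetical naive rule: add CRLF unless the body already ends in CRLF.

    An empty body is left alone here, since that case is disputed separately.
    """
    if body and not body.endswith(CRLF):
        return body + CRLF
    return body

candidates = {
    "add CRLF": append_crlf_if_missing(case8),
    "add LF only": case8 + b"\n",
    "leave alone": case8,
}

for name, data in candidates.items():
    # Show only the tail of each candidate, where they differ.
    print(f"{name:12} -> {data[-10:]!r}")
```

Note that the naive rule turns the bare CR into CR CR LF, whereas adding
only LF silently completes the CR into an ordinary CRLF. Cases 6 and 7 are
unproblematic under the same rule, since "anything" simply gains a CRLF.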
Moreover, if I specify "l=1" in the DKIM-Signature, exactly which of those
lines gets included in the process? And if I do any sort of
canonicalization, is the count for "l=?" done before or after the
canonicalization? (I presume before.)
And if I specify "l=0" and then treat my case (5) as you proposed to treat
it, do I then have to hash that CRLF that appeared out of nowhere?
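The ordering question shows up most clearly with "l=0" and an absent body.
The sketch below assumes that "l=" is an octet count and that an empty
body canonicalizes to a single CRLF, as proposed for case 5. All function
names are invented for the illustration.

```python
CRLF = b"\r\n"

def canon_empty_as_crlf(body: bytes) -> bytes:
    """Hypothetical canonicalization: strip all trailing CRLFs, then add one.

    With this rule an empty or absent body becomes a lone CRLF.
    """
    while body.endswith(CRLF):
        body = body[:-len(CRLF)]
    return body + CRLF

def count_then_canonicalize(body: bytes, length: int) -> bytes:
    # Apply the l= limit to the raw body, then canonicalize what is left.
    return canon_empty_as_crlf(body[:length])

def canonicalize_then_count(body: bytes, length: int) -> bytes:
    # Canonicalize the whole body, then apply the l= limit to the result.
    return canon_empty_as_crlf(body)[:length]

absent_body = b""

print("count first:       ", count_then_canonicalize(absent_body, 0))
print("canonicalize first:", canonicalize_then_count(absent_body, 0))
```

Counting first leaves two octets to hash even though "l=0" was asked for.
Canonicalizing first leaves none. So the two orderings give different hash
inputs for the same message, which is why the order has to be pinned down.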
Canonicalization, you see, is tricky stuff to specify (as indeed is any
cryptographic process). And agreeing the count for "l=?" is another of the
things that has to be gotten right.
So, all in all, I think what is needed is an Appendix containing
interesting test cases for which everybody needs to get the same results,
and maybe another Appendix containing a model implementation (as the
punycode people did, which is as well because nobody would have understood
their algorithm without it).
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131
Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5