[ietf-dkim] Possible C14N incorporating MIME decoding

Note change of Subject at request of Barry Leiba. The original thread (Introducing Myself) was fine when I first joined this list with a long list of issues I was concerned about, but it is well past its sell-by date now.

In due course, this needs an I-D draft with a definite proposal, but the issues could use a little more informal discussion first.

On Thu, 07 Dec 2006 09:56:41 -0000, Charles Lindsey <chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk> wrote:

On Wed, 06 Dec 2006 14:58:55 -0000, <Bill(_dot_)Oxley(_at_)cox(_dot_)com> wrote:
Nice code, now during your testing how many messages (average message size today 3k) per second were you able to process, and on what machine? I need something that can do about 1200 messages per second.
Thanks,
I haven't done any speed testing yet, but will try some now.

I did some experimentation last night, but the outcome was that, whilst Perl is a fine tool for setting out clearly the essential features of an algorithm, it is of no help in estimating how fast it might run.

Being an interpreted language, which calls library subroutines to do the interesting stuff, it can run terribly slowly, but then do something blindingly fast when it hits a subroutine written in C.

So yes, I could just about tell that decoding Base64 was faster than generating a SHA-256 hash, but not reliably by how much.

Essentially, however, the inner loop of what I am suggesting would look like this:


Go through the input stream looking for CRLF.
   When you find one
      Look for whitespace before it (to delete it as per the 'relaxed' c14n)
      Look for '--' after it to check whether you have possibly reached
          the end of the current part (of some multipart)
      Copy what you have got to the output stream with or without decoding
          of Q-P or Base64 according to the CTE in force
Put the output stream through SHA-256.

Of all that, everything but looking for the '--' and the possible decoding of Q-P/Base64 has to be done for the present Relaxed c14n. My belief is that the SHA-256 will consume most of the machine cycles in that, with the search for the CRLF using quite a lot. Hence the addition of the decoding should not make a huge addition percentagewise. But it would be necessary to rewrite the whole thing in C to get exact figures, and it would be premature to do that just yet.
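For illustration only, the inner loop outlined above might be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the proposed algorithm itself: the function name hash_decoded_part and its parameters are hypothetical, the CTE and boundary are taken as already known from the part headers, every line is assumed to end in CRLF, and only trailing whitespace is removed (the other 'relaxed' body rules are left out).

```python
import base64
import binascii
import hashlib
from typing import Optional


def hash_decoded_part(stream: bytes, cte: str,
                      boundary: Optional[bytes] = None) -> str:
    """Return the SHA-256 hex digest of one part with its CTE undone.

    Hypothetical helper: assumes every line, including the last, ends in CRLF.
    """
    digest = hashlib.sha256()
    pending = b""  # Base64 characters not yet forming a complete 4-char group
    pos = 0
    while True:
        end = stream.find(b"\r\n", pos)        # look for the next CRLF
        if end == -1:
            break                              # no more complete lines
        line = stream[pos:end].rstrip(b" \t")  # delete whitespace before it
        pos = end + 2
        # '--' after the CRLF: possibly the end of the current part
        if boundary is not None and line.startswith(b"--" + boundary):
            break
        if cte == "base64":
            # Decode only whole 4-character groups; carry the rest forward.
            pending += line
            usable = len(pending) - len(pending) % 4
            digest.update(base64.b64decode(pending[:usable]))
            pending = pending[usable:]
        elif cte == "quoted-printable":
            soft = line.endswith(b"=")         # soft line break: no CRLF out
            if soft:
                line = line[:-1]
            decoded = binascii.a2b_qp(line)
            digest.update(decoded if soft else decoded + b"\r\n")
        else:
            # 7bit, 8bit or binary: copy the line through unchanged
            digest.update(line + b"\r\n")
    return digest.hexdigest()
```

With this sketch the same content should produce the same digest whether it arrives as 7bit, Base64 or Q-P, which is the property the decoding step is meant to provide. It says nothing about speed, for the reasons already given.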


--
Charles H. Lindsey ---------At Home, doing my own thing------------------------

Tel: +44 161 436 6131 Web: http://www.cs.man.ac.uk/~chl

Email: chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
_______________________________________________

NOTE WELL: This list operates according to http://mipassoc.org/dkim/ietf-list-rules.html