Re: Draft for signed headers

In article
<199903161516(_dot_)PAA19750(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk>,
Charles Lindsey <chl(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk> writes:

Nit-picking on the canonicalization algorithm.

  1.   The field-name (header-name) at the start of the header is
       converted to lowercase.

  2.   If the header is unstructured, all instances of FWS are
       replaced by a single SPACE; otherwise (the header is
       structured and) all instances of FWS are omitted, except
       within comments where they are replaced by a single SPACE (the
       header has now been unfolded into a single line). Any
       whitespace at the end of the header is removed, and it is
       ensured that the header ends with a single CRLF.

        RFC 2234 names the space character SP, not SPACE.
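
        To make Steps 1 and 2 concrete for the unstructured case, here
        is a minimal Python sketch.  The function name and the
        simplified FWS pattern are my own assumptions for illustration;
        the draft defines neither, and structured headers (where FWS is
        omitted outside comments) are not handled.

```python
import re

# Simplified stand-in for FWS (an assumption, not the draft's grammar):
# an optional fold (trailing blanks + CRLF) followed by at least one SP/HTAB.
FWS = re.compile(r"(?:[ \t]*\r\n)?[ \t]+")


def canonicalize_unstructured(header: str) -> str:
    """Apply Steps 1 and 2 to one unstructured header (hypothetical helper)."""
    name, sep, body = header.partition(":")
    if not sep:
        raise ValueError("not a header: missing ':'")
    name = name.lower()               # Step 1: lowercase the field-name
    body = FWS.sub(" ", body)         # Step 2: each FWS becomes a single SP
    body = body.rstrip(" \t\r\n")     # drop whitespace at the end of the header
    return f"{name}:{body}\r\n"       # ensure exactly one trailing CRLF


print(repr(canonicalize_unstructured("Subject: Hello\r\n   world  \r\n")))
# -> 'subject: Hello world\r\n'
```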

  3.   All instances of DQUOTE (ASCII '"') are removed, except when
       they occur between properly matched pairs of "<" and ">"
       (thus, in particular, they are not removed within msg-ids).

                "<\">"@domain

        Are these < > properly matched?  Should this be canonicalized to

                <\">@domain

        or

                <\>@domain


        I can see what you are trying to do, but I think there may be
        several pitfalls here.
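
        The ambiguity can be reproduced with two small scanners.  This
        is only a sketch: both function names are hypothetical, and
        neither reading is claimed to be what the draft intends.  The
        first treats every "<" and ">" as a bracket; the second ignores
        brackets that sit inside a quoted-string or follow a backslash.

```python
def strip_dquotes_naive(value: str) -> str:
    """Reading A: every '<' and '>' counts, regardless of quoting."""
    depth = 0
    out = []
    for ch in value:
        if ch == "<":
            depth += 1
        elif ch == ">" and depth > 0:
            depth -= 1
        if ch == '"' and depth == 0:
            continue                  # DQUOTE outside brackets: removed
        out.append(ch)
    return "".join(out)


def strip_dquotes_quote_aware(value: str) -> str:
    """Reading B: brackets inside a quoted-string or after '\\' do not count."""
    depth = 0
    in_quote = False
    escaped = False
    out = []
    for ch in value:
        is_dquote = ch == '"'
        if escaped:
            escaped = False           # escaped character has no structural role
        elif ch == "\\":
            escaped = True
        elif is_dquote and depth == 0:
            in_quote = not in_quote   # opens or closes a quoted-string
        elif not in_quote:
            if ch == "<":
                depth += 1
            elif ch == ">" and depth > 0:
                depth -= 1
        if is_dquote and depth == 0:
            continue                  # DQUOTE outside brackets: removed
        out.append(ch)
    return "".join(out)


sample = r'"<\">"@domain'
print(strip_dquotes_naive(sample))        # <\">@domain
print(strip_dquotes_quote_aware(sample))  # <\>@domain
```

        Both scanners leave an ordinary msg-id such as <"abc"@x>
        untouched; they differ only on input like the example above.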

  4.   Any date-time occurring in a Date, Resent-Date or Expires
       header (but not in any other header) is converted into the
       number of seconds since the start of January 1st 1970 UTC,
       expressed as a decimal number without leading zeroes.

        As phrased, "the number of seconds since the start of January
        1st 1970 UTC" includes leap seconds.  That gives software a
        problem: it cannot convert dates in the future, because future
        leap seconds have not yet been decided.

        Better to exclude leap-seconds, in which case you might need a
        note about what to do with a seconds value of 60.
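
        A minimal sketch of the leap-second-free calculation, assuming
        the date-time has already been parsed into its fields.  The
        function name, the argument layout, and the policy for a seconds
        value of 60 (here it simply lands on the first second of the
        next minute) are my assumptions, not anything in the draft.

```python
from datetime import date

EPOCH_ORDINAL = date(1970, 1, 1).toordinal()


def canonical_seconds(year, month, day, hour, minute, second,
                      offset_minutes=0):
    """Seconds since 1970-01-01 00:00:00 UTC, ignoring leap seconds.

    offset_minutes is the zone offset of the date-time (e.g. +0100 -> 60).
    Every day is counted as exactly 86400 seconds.
    """
    if not 0 <= second <= 60:
        raise ValueError("seconds out of range")
    days = date(year, month, day).toordinal() - EPOCH_ORDINAL
    local = ((days * 24 + hour) * 60 + minute) * 60 + second
    # Assumed policy: second == 60 is not special-cased, so 23:59:60
    # yields the same number as 00:00:00 of the following day.
    return str(local - offset_minutes * 60)  # decimal, no leading zeroes


print(canonical_seconds(1998, 12, 31, 23, 59, 60))  # 915148800
print(canonical_seconds(1999, 1, 1, 0, 0, 0))       # 915148800
```

        The other obvious policy would be to clamp 60 down to 59; either
        way, the rule has to be written down so that signer and verifier
        agree.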

  5.   Any sequence of octets of length not more than 75 and not
       including any SPACE (and hence presumed present in the same
       line prior to Step 2), and which satisfies the syntax for an
       encoded-word [RFC2047], and which is not enclosed between
       properly matched pairs of "<" and ">" is replaced by the
       sequence of octets obtained by decoding it. This is done
       irrespective of whether that encoded-word was syntactically
       allowed to be present at that position in the header according
       to [RFC2047] or any extension thereof.


                (\<)=?utf-8?q?=E2=82=A0?=(\>)

        Are these < > properly matched?  Should this be canonicalized to

                (\<)₠(\>)

        or not?


        (For those without utf-8, it's the euro-currency sign, U+20A0.)
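
        Here is a sketch of Step 5 that shows how the answer depends on
        whether backslash escapes are honoured when matching "<" and
        ">".  The function names, the `honour_escapes` switch, and the
        simplified encoded-word pattern are assumptions for
        illustration only; comments are not parsed at all.

```python
import base64
import binascii
import re

# Simplified encoded-word pattern: =?charset?B-or-Q?payload?= with no
# whitespace inside (an approximation of the RFC 2047 syntax).
ENCODED_WORD = re.compile(rb"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def bracket_depths(data: bytes, honour_escapes: bool) -> list:
    """Return the '<'...'>' nesting depth in force at each octet."""
    depths, depth, escaped = [], 0, False
    for b in data:
        if escaped:
            escaped = False               # escaped octet is never a bracket
        elif honour_escapes and b == 0x5C:  # backslash
            escaped = True
        elif b == 0x3C:                   # '<'
            depth += 1
        elif b == 0x3E and depth > 0:     # '>'
            depth -= 1
        depths.append(depth)
    return depths


def decode_outside_brackets(data: bytes, honour_escapes: bool) -> bytes:
    """Replace encoded-words that lie outside '<'...'>' by their octets."""
    depths = bracket_depths(data, honour_escapes)

    def repl(m):
        if len(m.group(0)) > 75 or depths[m.start()] > 0:
            return m.group(0)             # too long or inside brackets: keep
        if m.group(2).lower() == b"q":
            return binascii.a2b_qp(m.group(3), header=True)
        return base64.b64decode(m.group(3))

    return ENCODED_WORD.sub(repl, data)


raw = rb"(\<)=?utf-8?q?=E2=82=A0?=(\>)"
print(decode_outside_brackets(raw, honour_escapes=False))  # left unchanged
print(decode_outside_brackets(raw, honour_escapes=True))   # word decoded
```

        With escapes ignored, the "<" and ">" count as matched and the
        encoded-word is left alone; with escapes honoured, it is
        replaced by the three octets E2 82 A0.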

Regards

-- 
Paul Overell                                        T U R N P I K E  Ltd