ietf-mta-filters

Re: 'header' test and whitespace

2005-11-24 18:56:08

I see from Alexey's minutes that I was supposed to post the text that
I proposed.  I didn't remember that; sorry.  Philip started the
discussion, but didn't post the text.  I gave Philip XML, and here's
what I suggested:

<t>
Because the meaning of leading and trailing whitespace characters in header
fields is ambiguous, and their survival in message transport and processing
is inconsistent, ALL handling of message headers in Sieve MUST normalize
the header field values.  The normalization is similar to, but not the same
as, unfolding (see RFC2822),

I think the unfolding part needs to be the same as what's in RFC 2822.

and is done as follows:
<list style="number">
   <t>
     Remove leading and trailing whitespace characters from each line of
     the header field (multiple lines, in the case of multi-line continuation).
   </t>
   <t>
     Remove the delimiting CRLF from each line.
   </t>
   <t>
     Catenate the lines in order, inserting one ASCII space character (0x20)
     between each pair.

This makes the unfolding different from what's in RFC 2822. I think this is a
mistake. There should not be any "insert space" operation here: RFC 2822
section 2.2.3 simply calls for removing any CRLF that is immediately followed
by whitespace.

   </t>
</list>
</t>
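
The three steps in the proposed text can be sketched in Python as follows. This is only an illustration of that text, not part of it: the function name normalize_header_value is made up, the input is assumed to be the raw field body (everything after the colon) with CRLF line endings, and "whitespace" is assumed to mean space and horizontal tab.

```python
def normalize_header_value(raw: str) -> str:
    """Sketch of the proposed normalization (hypothetical helper).

    Assumes `raw` is the field body after the colon, with CRLF line
    endings, and that "whitespace" means space and horizontal tab.
    """
    # Drop the CRLF that terminates the field so it does not
    # produce an empty final line when splitting.
    if raw.endswith("\r\n"):
        raw = raw[:-2]
    # Steps 1 and 2: split on the delimiting CRLFs and trim each line.
    lines = [line.strip(" \t") for line in raw.split("\r\n")]
    # Step 3: catenate in order with one space (0x20) between each pair.
    return " ".join(lines)


folded = "     a\r\n    b   c     \r\n    d\r\n"
print(repr(normalize_header_value(folded)))  # 'a b   c d'
```

Runs of spaces inside a line survive; only the ends of each physical line are trimmed, and every fold becomes exactly one space.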

<t>
To show how this normalization works, we use the character "~" (tilde) to
represent the ASCII space character in the following example.
With this normalization, all of the following yield the
same value for the subject field, "a~b~~~c~d":
<list style="empty">
   <t>
     Subject:~a~b~~~c~d
   </t>
   <t>
     Subject:a~b~~~c~d~
   </t>
   <t>
     Subject:a~b~~~c~
     <vspace/>
     ~~~~d~~
   </t>
   <t>
     Subject:~~~~~a
     <vspace/>
     ~~~~b~~~c~~~~~
     <vspace/>
     ~~~~d
   </t>
   <t>
     Subject:~a
     <vspace/>
     ~b~~~c
     <vspace/>
     ~d
   </t>
</list>
</t>
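
For comparison, here is a rough Python sketch of plain RFC 2822 unfolding (section 2.2.3: remove any CRLF that is immediately followed by whitespace) applied to two of the examples above. The helper names unfold_rfc2822 and show are invented for this illustration, and the inputs are assumed to be field bodies with CRLF line endings. It suggests that unfolding alone does not collapse all the examples to the same value, because the whitespace around each fold is kept.

```python
import re


def unfold_rfc2822(raw: str) -> str:
    # RFC 2822 section 2.2.3: remove any CRLF that is immediately
    # followed by WSP (space or tab); nothing is inserted or trimmed.
    return re.sub(r"\r\n(?=[ \t])", "", raw)


def show(value: str) -> str:
    # Render spaces as "~", following the convention used in the examples.
    return value.replace(" ", "~")


# Third and fifth examples from the list, with "~" standing for a space.
third = "a~b~~~c~\r\n~~~~d~~".replace("~", " ")
fifth = "~a\r\n~b~~~c\r\n~d".replace("~", " ")

print(show(unfold_rfc2822(third)))  # a~b~~~c~~~~~d~~
print(show(unfold_rfc2822(fifth)))  # ~a~b~~~c~d
```

Even after trimming the ends of the unfolded value, the third example keeps five spaces between "c" and "d", so it would not compare equal to "a~b~~~c~d".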

I didn't bother going through all this.

Note that I didn't suggest RFC2047-decoding, but I think that's a reasonable
addition to this.  Alternatively, we could specify that strings be decoded
in comparisons (perhaps specified by an option like ":decode" or ":raw").

A :raw option makes sense. I actually have no problem with adding it to the
base specification but others may disagree.
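
To make the difference between the two comparison modes concrete, here is a small Python sketch using the standard library's email.header module. The ":decode" and ":raw" names are only the options floated in this thread, not defined Sieve tags, and decode_rfc2047 is a hypothetical helper written for the example.

```python
from email.header import decode_header, make_header


def decode_rfc2047(value: str) -> str:
    # Decode any RFC 2047 encoded-words into a Unicode string.
    return str(make_header(decode_header(value)))


raw_subject = "=?iso-8859-1?q?caf=E9?="
key = "caf\u00e9"

# ":raw"-style comparison: the encoded-word is compared as-is.
print(raw_subject == key)
# ":decode"-style comparison: decode first, then compare.
print(decode_rfc2047(raw_subject) == key)
```

A script that wants to match the literal encoded-word text would need the raw form, which is presumably why an explicit option is attractive.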

                                Ned