Re: 'header' test and whitespace

2005-11-25 10:02:17

Ned Freed wrote:

>> I see from Alexey's minutes that I was supposed to post the text that
>> I proposed.  I didn't remember that; sorry.  Philip started the
>> discussion, but didn't post the text.  I gave Philip XML, and here's
>> what I suggested:
>> <t>
>> Because the meaning of leading and trailing whitespace characters in
>> header fields is ambiguous, and their survival in message transport and
>> processing is inconsistent, ALL handling of message headers in Sieve MUST
>> normalize the header field values.  The normalization is similar to, but
>> not the same as, unfolding (see RFC2822),
> I think the unfolding part needs to be the same as what's in RFC 2822.

I agree.

>> and is done as follows:
>> <list style="number">
>>    <t>
>>      Remove leading and trailing whitespace characters from each line of
>>      the header field (multiple lines, in the case of multi-line
>>      continuation).
>>    </t>
This step is actually not mentioned in RFC 2822.

Yes. I don't view this as part of the unfolding algorithm, however. But now
that you mention it, a better way to describe this is to do the unfolding
part first and then remove leading and trailing spaces. So the steps become:

(1) Remove all CRLFs.

(2) Remove leading and trailing spaces.

(3) Decode RFC 2047 encoded-words and convert to UTF-8.

The last two steps should be skipped in :raw mode. (The first step is retained
because folding points aren't supposed to have any semantics and are known to
change unpredictably. The same cannot be said for spaces or encoded words.)
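
As a rough illustration of the three steps above, here is a minimal Python
sketch. The function name normalize_header_value and its raw flag are
hypothetical, chosen only for this example; treating both space and tab as
the "spaces" to strip is also an assumption. The decoding step uses the
standard library's email.header helpers and returns a Unicode string rather
than UTF-8 bytes.

```python
from email.header import decode_header, make_header


def normalize_header_value(value: str, raw: bool = False) -> str:
    """Hypothetical helper sketching the three steps discussed above."""
    # Step 1: unfold by removing every CRLF. Per RFC 2822 section 2.2.3,
    # nothing is inserted in place of the removed CRLF.
    unfolded = value.replace("\r\n", "")

    # In :raw mode only the unfolding step is applied.
    if raw:
        return unfolded

    # Step 2: remove leading and trailing spaces (assumption: space and tab).
    stripped = unfolded.strip(" \t")

    # Step 3: decode RFC 2047 encoded-words into a Unicode string.
    return str(make_header(decode_header(stripped)))
```

Note that a folded value such as "  hello\r\n world  " keeps the space that
followed the CRLF, so no separate "insert space" operation is needed to keep
the words apart.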

>>    <t>
>>      Remove the delimiting CRLF from each line.
>>    </t>
>>    <t>
>>      Catenate the lines in order, inserting one ASCII space character
>>      (0x20) between each pair.
> This makes the unfolding different from what's in RFC 2822. I think this
> is a mistake. There should not be any "insert space" operation here -
> RFC 2822 section 2.2.3 simply calls for CRLFs to be removed.


>> Note that I didn't suggest RFC2047-decoding, but I think that's a
>> reasonable addition to this.  Alternatively, we could specify that
>> strings be decoded in comparisons (perhaps specified by an option like
>> ":decode" or ":raw").
> A :raw option makes sense. I actually have no problem with adding it to
> the base specification but others may disagree.

I am personally Ok with adding :raw to the base spec, but do we need a
new capability?

Yes, I'm afraid so. Perhaps something like rawheader? (I believe the header
test is the only one for which :raw makes sense.)