ietf-mta-filters

Re: status of 3028bis

2005-10-20 12:48:50


On Thu Oct 20 14:59:04 2005, Alexey Melnikov wrote:
>
> Michael Haardt wrote:
>
>> On Tue, Oct 18, 2005 at 02:06:00PM -0700, Philip Guenther wrote:
>>
>>> In particular, there's some uncertainty over whether comparators
>>> take octets or characters as input and how i;octet is defined and
>>> used in the definition of other comparators.
>>>
> [...]
>
>> Let "i;octet" work on characters and document the name is a
>> misnomer.
>>
>> IF someone really feels the need to specify and implement them:
>> Introduce Sieve extensions for an "i;binary" comparator that
>> compares against a string representation of binary data.  Introduce
>> new tests "rawheader" and "decodedheader".
>>
>> I am not aware of how "i;octet" is used elsewhere, so the above may
>> only fit Sieve well.
>>
> Your suggestion makes sense for header fields. But what about the body
> extension when matching against an application/* MIME part?
>
>
We need to restrict this discussion to just the one mailing list,
really, but I've posted a message saying that actually, the reverse
is true - comparators match on octet strings, and happen to have a
decode built in - hence i;octet doesn't decode, and i;ascii-* both
decode using ASCII.

Wow, I'm sure glad you said this, because this has always been my
interpretation as well, and I was getting the idea I was all alone in seeing it
this way. It definitely is how our implementation works too.

The notion that comparators work on character strings is a notion
that comes pre-flawed - ACAP does not operate on character strings,
but octet strings, which might on a good day happen to be UTF-8
encoded text, but might be anything.

Yep.

I've also suggested that where all the protocol has is a character
string, then the semantics of a comparator must behave as though the
string were encoded using UTF-8 (possibly by actually doing so).

But not necessarily by doing so. In many cases it is more convenient to decode
to a fixed-length representation like UTF-32. The behavior is what matters, not
the implementation details.

                                Ned
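
For readers of the archive, here is a minimal sketch of the behaviour
discussed above. It is an illustration only: the function names are
hypothetical and come neither from this thread nor from any Sieve or ACAP
implementation. It assumes that i;octet compares octets without decoding,
that i;ascii-casemap folds only the ASCII letters a-z to A-Z, and that a
character string is treated as if it were UTF-8 encoded. The last two
functions show why decoding to code points (as a UTF-32 representation
would) can give the same ordering as comparing the UTF-8 octets.

```python
def octet_equal(a: bytes, b: bytes) -> bool:
    """i;octet-style equality: compare the octets as-is, no decoding."""
    return a == b


def ascii_casemap_key(octets: bytes) -> bytes:
    """Fold only ASCII a-z to A-Z; every other octet passes through unchanged."""
    return bytes(o - 0x20 if 0x61 <= o <= 0x7A else o for o in octets)


def ascii_casemap_equal(a: bytes, b: bytes) -> bool:
    """i;ascii-casemap-style equality under the assumption stated above."""
    return ascii_casemap_key(a) == ascii_casemap_key(b)


def compare_as_utf8(a: str, b: str) -> int:
    """Order two character strings by their UTF-8 octets (-1, 0 or 1)."""
    ea, eb = a.encode("utf-8"), b.encode("utf-8")
    return (ea > eb) - (ea < eb)


def compare_as_code_points(a: str, b: str) -> int:
    """Order the same strings by code point, as a UTF-32 decode would."""
    ca, cb = [ord(c) for c in a], [ord(c) for c in b]
    return (ca > cb) - (ca < cb)
```

Under these assumptions, octet_equal(b"Subject", b"SUBJECT") is false while
ascii_casemap_equal(b"Subject", b"SUBJECT") is true, and the two ordering
functions agree on well-formed text because UTF-8 preserves code point
order. That agreement is what lets an implementation pick whichever internal
representation is convenient.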
