>>How about the following for the second paragraph of 2.7.2:
>> Comparisons are performed in Unicode. Implementations convert
>> text from header fields in all charsets [HEADER-CHARSET] to
>> Unicode as input to the comparator (see 2.7.3). Implementations
>> MUST be capable of converting US-ASCII, ISO-8859-1, the US-ASCII
>> subset of ISO-8859-* character sets, and UTF-8. Text that the
>> implementation cannot convert to Unicode for any reason MAY be
>> omitted, treated as plain US-ASCII (including any [HEADER-CHARSET]
>> syntax), or processed according to local conventions.
>
>
> Thought I sent a note about this, but cannot find it... Anyway, I think this
> is fine, although I'd be tempted to put "treat as plain US-ASCII" first on the
> list.
Definitely first on the list. I actually think this *is* best practice,
despite Ned's having said he doesn't think there is a best practice on
this, and, in particular, I'd like to eliminate "omitted".
I certainly could live with omitting "omitted".
Consider:
Subject: =?bogus-charset?Q?Buy Viagra now!?=
Do you really want my rule that says
    if header :contains ["subject"] ["viagra"] {
        discard;
        stop;
    }
to be ignored because the spammer put in a bogus character set name
(perhaps purposefully, to screw up Sieve scripts)?
Good point.
Ned
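
For illustration, here is a rough Python sketch of the "treat as plain US-ASCII" fallback discussed above. It is not taken from any Sieve implementation: the names to_unicode and subject_contains are hypothetical, the encoded-word matcher is deliberately loose (it tolerates the spaces in the example Subject), and str.lower() only approximates a case-insensitive comparator. When the charset is unknown or the bytes do not decode, the encoded-word is left in place, [HEADER-CHARSET] syntax and all, so a :contains test can still see it.

```python
import binascii
import quopri
import re

# Loose matcher for encoded-words of the form =?charset?encoding?text?=
# (assumption: spaces inside the text part are tolerated, as in the example).
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")


def to_unicode(raw_header: str) -> str:
    """Convert a header value to Unicode, keeping unconvertible words as-is."""

    def convert(match: re.Match) -> str:
        charset, encoding, text = match.groups()
        try:
            if encoding.upper() == "Q":
                # header=True makes "_" decode to a space, as Q encoding requires.
                data = quopri.decodestring(text.encode("ascii"), header=True)
            else:
                data = binascii.a2b_base64(text)
            return data.decode(charset)
        except (LookupError, ValueError):
            # Unknown charset or undecodable bytes: fall back to the raw
            # text, including the =?...?= syntax, treated as plain US-ASCII.
            return match.group(0)

    return ENCODED_WORD.sub(convert, raw_header)


def subject_contains(raw_header: str, needle: str) -> bool:
    """Approximate a case-insensitive :contains test on the converted text."""
    return needle.lower() in to_unicode(raw_header).lower()
```

With this fallback, subject_contains("=?bogus-charset?Q?Buy Viagra now!?=", "viagra") is true, so the rule above still fires. Had the text been omitted instead, the comparator would have seen an empty string and the rule would have been skipped.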