Re: 3028bis open issue #3: require 2047 decoding?

2005-06-30 02:34:22

That is, support for RFC 2047 is only a SHOULD and not a MUST.  Do
we want to leave that as is or should it be made stricter, with a
MUST support RFC 2047, MUST support conversion of US-ASCII and UTF-8
and SHOULD support conversion of ISO-8859-1 and the US-ASCII subset
of ISO-8859-*?
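As a sketch of what "MUST support RFC 2047" would mean in practice, here is a minimal decoder using Python's standard `email.header` module. The fallback behavior for unknown charsets is my own assumption, not something the draft mandates:

```python
from email.header import decode_header

def decode_2047(raw):
    """Decode an RFC 2047 header value into a single UTF-8 string."""
    parts = []
    for text, charset in decode_header(raw):
        if isinstance(text, bytes):
            try:
                # Convert from the declared charset (default US-ASCII).
                parts.append(text.decode(charset or "us-ascii"))
            except (LookupError, UnicodeDecodeError):
                # Assumed fallback for unknown/mistagged charsets:
                # decode as US-ASCII with replacement characters.
                parts.append(text.decode("us-ascii", "replace"))
        else:
            parts.append(text)
    return "".join(parts)

decode_2047("=?ISO-8859-1?Q?caf=E9?=")  # → café
```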

I would prefer that, but:

   If implementations fail to support the above behavior, they MUST
   conform to the following:

      No two strings can be considered equal if one contains octets
      greater than 127.

To me, that states that if an implementation fails to convert a character
set to UTF-8, two strings cannot be equal if either contains octets greater
than 127.  Assuming that all unknown character sets are one-byte character
sets with the lower 128 octets being US-ASCII is not sound.  But perhaps
that's not what was meant.
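Read literally, the quoted rule makes even byte-identical non-ASCII strings unequal. A sketch of that reading (my interpretation, not text from the draft):

```python
def raw_equal(a: bytes, b: bytes) -> bool:
    """Compare two unconverted strings under the quoted rule:
    no match is possible if either side contains an octet > 127."""
    if any(o > 127 for o in a) or any(o > 127 for o in b):
        return False
    return a == b

raw_equal(b"hello", b"hello")      # → True
raw_equal(b"caf\xe9", b"caf\xe9")  # → False, despite identical bytes
```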

   MIME parts identified as using charsets other than UTF-8 as
   defined in [UTF-8] SHOULD be converted to UTF-8 prior to the match.

Shouldn't the implementation be free to choose what to convert them to?  It
may choose a different Unicode representation.  Do we need to enforce UTF-8?

   If an implementation does not support conversion of a given
   charset to UTF-8, it MAY compare against the US-ASCII subset
   of the transfer-decoded character data instead.  Characters from
   documents tagged with charsets that the local implementation
   cannot convert to UTF-8 and text from mistagged documents MAY
   be omitted or processed according to local conventions.
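One way to read "compare against the US-ASCII subset" is to omit the non-ASCII octets before matching; that interpretation is an assumption on my part, since the draft also allows processing per local conventions:

```python
def ascii_subset(data: bytes) -> bytes:
    """Keep only the US-ASCII octets, omitting anything above 127,
    as one reading of the fallback for unconvertible charsets."""
    return bytes(o for o in data if o < 128)

def match_ascii_subset(data: bytes, key: str) -> bool:
    """Compare the ASCII subset of transfer-decoded data to a key."""
    return ascii_subset(data).decode("us-ascii") == key

match_ascii_subset(b"caf\xe9", "caf")  # → True once \xe9 is omitted
```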

That sounds more useful than RFC 3028 to me, but I slightly prefer to
match the raw transfer-encoded data.  Why bother decoding it, if you
cannot be sure what it is anyway?

Whatever the result is, I agree that comparisons for header and body
should be the same.
