Re: UTF-8 over RFC 2047 (Re: Call for Usefor to recharter)


Jean-Marc Desperrier wrote:

Dan Kohn wrote:

I wrote:

I have a simple question.  What can a UTF-8 subject header
communicate that an RFC 2047 one can't?  Other than inelegance,
what's the downside of 2047, when the upside is a huge increase in
backward compatibility?



I do not know where this discussion took place, but I have an answer to it.


There has been substantial discussion on the ietf-822 mailing list. See
http://www.imc.org/ietf-822/mail-archive/maillist.html for the archive.

It's a simple fact.
In every single thread with non US-ASCII data in subject encoded by RFC 2047 (sorry I wrote 2049 by error in my last mail) I've seen, the subject turned to garbage after 5 or 6 messages.
The reason for that is that all implementations of RFC 2047 around are full of implementation errors.


There is one specific error that could cause that, namely improper use
of RFC 2047 for any purpose other than for display.  Once a header
field has been generated with RFC 2047 encoded-words, those
encoded-words should never be modified -- the content may be decoded for
display in the specified charset (and optionally, language), but the
header content should not be modified.

Surely the solution to that problem is to correct the faulty implementations
which are responsible for inappropriately modifying header content.
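
To illustrate the display-only rule, here is a minimal sketch in Python using
the standard library's email.header module. The function names
display_subject and reply_subject are hypothetical, chosen only for this
example; the point is that decoding happens solely for presentation, while
the stored header value is copied through byte-for-byte.

```python
from email.header import decode_header, make_header


def display_subject(raw_subject: str) -> str:
    """Decode RFC 2047 encoded-words for display only.

    The raw header value is not changed; a new string is returned.
    """
    return str(make_header(decode_header(raw_subject)))


def reply_subject(raw_subject: str) -> str:
    """Build a reply Subject by copying the encoded-words verbatim.

    Only an ASCII "Re: " prefix is added; nothing is decoded or re-encoded.
    """
    if raw_subject[:3].lower() == "re:":
        return raw_subject
    return "Re: " + raw_subject


raw = "=?ISO-8859-1?Q?Caf=E9?="
print(display_subject(raw))   # what the user sees
print(reply_subject(raw))     # what goes on the wire in the reply
```

A client written this way never round-trips the subject through a decode and
re-encode step, which is the kind of modification that can corrupt a header
over the course of a thread.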

And here raw UTF-8 is a clear winner. No complex implementation rules, no border cases, one string will always have one and only one representation.


That is not correct, simply because Unicode has multiple representations,
including non-spacing modifiers.  Therefore a utf-8 encoding of unicode
will also have multiple representations. Unless, of course, some additional
"complex implementation rules" are applied for normalization.  And the
longer utf-8 sequences are effectively border cases, since different utf-8
specifications have different rules.
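
Both points can be shown with a short sketch in Python, using only the
standard unicodedata module. The sample strings and the helper name
is_valid_utf8 are illustrative assumptions. The five-octet sequence is one
that the older RFC 2279 definition of UTF-8 permitted but the later RFC 3629
definition (which Python's decoder follows) does not.

```python
import unicodedata

# The same visible text, two different code point sequences.
precomposed = "caf\u00e9"    # "e with acute" as a single code point
decomposed = "cafe\u0301"    # "e" followed by a non-spacing combining acute

# The utf-8 octets differ, so a byte-wise comparison fails.
print(precomposed.encode("utf-8"))
print(decomposed.encode("utf-8"))
print(precomposed == decomposed)

# An additional normalization rule (here NFC) is needed to make them equal.
print(unicodedata.normalize("NFC", decomposed) == precomposed)


def is_valid_utf8(octets: bytes) -> bool:
    """Report whether Python's decoder accepts the octets as utf-8."""
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A five-octet sequence in the RFC 2279 style, rejected by this decoder.
five_octets = b"\xf8\x88\x80\x80\x80"
print(is_valid_utf8(five_octets))
```

Without the normalization step, two subjects that look identical on screen
compare as different strings.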
