ietf-822

Re: RFC 2047 and gatewaying

2003-01-11 17:55:07

D. J. Bernstein <djb(_at_)cr(_dot_)yp(_dot_)to> wrote:
Sometimes I edit header fields. For example, I might change the Subject
line. What do you think happens if I correct the spelling of an RFC 2047
French word with an accent? That's right: I'm editing an encoded word.

Of course, display is critical. I need to read what I'm editing. I can't
read RFC 2047 gobbledygook. My text editor is not going to support RFC
2047 and IDNA GoofyCode and all the other complicated ad-hoc ``7 bits
forever!'' character encodings.

It's the task of the user agent to present the message to the user in a
friendly format. This can easily be done for header fields whose format
is known. Data that can't be represented on the local system can be kept
in encoded form, so this also works on systems that don't support
Unicode.

It's easy for a user agent to translate the changed data back into
transmission format. For header fields not known to the user agent, the
user is responsible for the correct encoding.
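
To illustrate the round trip described above, here is a minimal sketch
using Python's standard email.header module. The helper names to_display
and to_transmission are hypothetical, and the choice of UTF-8 as the
outgoing charset is an assumption for the example, not something any
particular user agent is known to do.

```python
from email.header import Header, decode_header, make_header


def to_display(raw_value: str) -> str:
    """Decode RFC 2047 encoded words into a readable Unicode string."""
    return str(make_header(decode_header(raw_value)))


def to_transmission(text: str, charset: str = "utf-8") -> str:
    """Encode edited text back into RFC 2047 form.

    Pure ASCII text is passed through unchanged (an assumption made here
    so that plain headers stay readable on the wire).
    """
    try:
        text.encode("ascii")
        return text
    except UnicodeEncodeError:
        return Header(text, charset).encode()


# A Subject value as it might arrive on the wire.
raw = "=?iso-8859-1?q?caf=E9?="
shown = to_display(raw)                 # what the user sees and edits
edited = shown.replace("café", "Café")  # the user corrects the word
wire = to_transmission(edited)          # ASCII-only encoded word again
```

The user only ever sees the values of shown and edited; the encoded
forms in raw and wire stay inside the user agent.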

The crucial advantage of UTF-8 is its generality: it can and will be
used everywhere.

That can actually be a problem w.r.t. addresses (domain names, email
addresses, newsgroup names): An editor does not guarantee anything about
the normalisation form of its output. A user might enter a character
that has some compatibility decomposition and will never understand why
his message bounces, does not appear in a newsgroup, etc.
When the user agent does a conversion from a display and edit format to
a transmission format, it can normalise the data at the same time.
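
A minimal sketch of that normalisation step, using Python's standard
unicodedata module. NFKC is chosen here as an assumption (it is the form
that folds compatibility characters); the function name and the example
address are hypothetical.

```python
import unicodedata


def normalise_for_transmission(edited: str) -> str:
    """Apply NFKC so compatibility characters become their plain equivalents."""
    return unicodedata.normalize("NFKC", edited)


# The editor inserted U+FB03 (LATIN SMALL LIGATURE FFI) instead of "ffi".
typed = "o\ufb03ce@example.org"
sent = normalise_for_transmission(typed)
```

Here typed and sent look alike on screen but differ as code point
sequences; only the normalised value in sent would match an address
stored as plain "office@example.org".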

Of course, if we were designing a completely new message format, UTF-8
would be the right choice. But we are not. Admittedly, adding more and
more kludges like RFC 2047, RFC 2231, Punycode, etc. to maintain
compatibility is not the most elegant solution... but it's hardly the
worst part of the RFC 2822/1036 message format.

Claus
-- 
------------------------ http://www.faerber.muc.de/ ------------------------
OpenPGP: DSS 1024/639680F0 E7A8 AADB 6C8A 2450 67EA AF68 48A5 0E63 6396 80F0
