Re: Gen-ART LC review of draft-ietf-eai-utf8headers-09.txt

2008-03-23 15:26:08
Hi, Harald,

Thanks for the quick feedback (Gen-ART reviewers like this because we can 
remember writing the review, and at least part of what we were thinking 
about :-)

Looks like mostly goodness. If we're in synch, I dropped it from this 


1.2.  Relation to other standards

  This document also updates [RFC2822] and MIME, and the fact that an
  experimental specification updates a standards-track spec means that
  people who participate in the experiment have to consider those
  standards updated.

Process: The ID Tracker is showing this draft in Last Call status, but I
can't find (in the archive or in my personal folders) any Last Call
announcement, which I was looking for, in order to check how Chris 
the downref at Last Call time - I'm expecting that it will be quite
entertaining. Has anyone else seen such an announcement on IETF Announce?
Note: Intended status is Experimental.

The subject line of the Last Call was

Last Call: draft-ietf-eai-smtpext (SMTP extension for internationalized 
email address) to Experimental RFC

and covered 2 drafts; this may be why you did not find it.

Exactly right (I was scanning by subject). While I'm amazed that the downref 
isn't being called out in the Last Call announcement, I think RFC tracks and 
standards levels are so arbitrary that they are useless, so I'm not 
complaining - I was trying to figure out if there really had been a Last 
Call announcement sent, that's all.

4.  Changes on Message Header Fields

  This protocol does NOT change the definition of header field names.

technical: I'm confused here. Is this text saying "does not change header
field names"? I would have thought this specification is exactly changing
the definition of header field names...

It does not change the definition of header field NAMES (which remain 
ASCII), but changes the definition of header field BODIES (which used to 
be ASCII, but are now UTF-8).

  That is, only the bodies of header fields are allowed to have UTF-8
  characters; the rules in [RFC2822] for header field names are not
  changed.

And this sentence is saying that. How can we express this more clearly?

Ah. You filled in the missing piece for me here. Perhaps something like

"This protocol does NOT change the [RFC2822] rules for defining header field 
names. The bodies of header fields are allowed to contain UTF-8 characters, 
but the header field names themselves must contain only ASCII characters."
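
To make the name/body distinction concrete, here is a minimal Python sketch. It is not taken from the draft: the function names are hypothetical, and it assumes an unfolded header line of the form "name: body", with the name restricted to printable US-ASCII other than the colon (the RFC 2822 field-name rule) and the body decoded as UTF-8.

```python
from typing import Tuple


def is_valid_field_name(name: str) -> bool:
    """RFC 2822 field name: one or more printable US-ASCII characters
    (code points 33-126), excluding the colon."""
    return bool(name) and all(33 <= ord(c) <= 126 and c != ":" for c in name)


def split_header_line(raw: bytes) -> Tuple[str, str]:
    """Split one unfolded header line into (name, body).

    The name must be pure ASCII; the body may contain UTF-8.
    Raises ValueError (UnicodeDecodeError is a subclass) otherwise.
    """
    name_bytes, sep, body_bytes = raw.partition(b":")
    if not sep:
        raise ValueError("header line has no colon")
    name = name_bytes.decode("ascii")      # fails on any non-ASCII octet
    if not is_valid_field_name(name):
        raise ValueError("invalid header field name")
    body = body_bytes.decode("utf-8").strip()  # fails on malformed UTF-8
    return name, body
```

With this sketch, "Subject: Grüße" encoded as UTF-8 splits cleanly, while a line whose field name contains a non-ASCII character is rejected.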

  Interoperability considerations:  The media type provides
     functionality similar to the message/rfc822 content type for email
     messages with international email headers.  When there is a need
     to embed or return such content in another message, there is
     generally an option to use this media type and leave the content
     unchanged or downconvert the content to message/rfc822.  Both of
     these choices will interoperate with the installed base, but with
     different properties.  Systems unaware of international headers
     will typically treat a message/global body part as an unknown
     attachment, while they will understand the structure of a message/
     rfc822.  However, systems which understand message/global will
     provide functionality superior to the result of a down-conversion
     to message/rfc822.  The most interoperable choice depends on the
     deployed software.

technical: not sure what the last sentence actually means. "We don't know
what the most interoperable choice will be"? Text in the same paragraph 
says both choices are interoperable. If that text is correct, I don't 
understand what you're saying here.

Would it be better to say "the most useful choice"? It's likely to be the 
difference between a compliant MUA offering to dump the message to a file 
and displaying it as a message...

"The most useful choice" seems very reasonable. The current text seems to 
contradict other text in the same paragraph.

5.  Security Considerations

  Because UTF-8 often requires several octets to encode a single
  character, internationalized local parts may cause mail addresses to
  become longer.  As specified in [RFC2822], each line of characters
  MUST be no more than 998 octets, excluding the CRLF.

clarity: s/CRLF/CRLF, even when UTF-8 characters are being used/

  Because internationalized local parts may cause email addresses to be
  longer, processes which parse, store, or handle email addresses or
  local parts must take extra care not to overflow buffers, truncate
  addresses, exceed storage allotments, or, when comparing, fail to use
  the entire length.

technical: this is great advice, but I don't understand how UTF-8 changes
the situation. If you aren't changing the 998-octet requirement, software
that breaks for UTF-8 would also break for ASCII headers with the same 

If someone uses another representation internally (for instance UTF-16), 
and has a 998-character buffer, that will sometimes fit into 998 octets of 
UTF-8, and sometimes not. The same goes in the other direction.... I'm 
sure others will think of other cases.

Thanks for the clear explanation here. This is headed in the right 
direction - I wasn't impressed with guidance that says "take extra care", 
but saying "must accommodate 998 characters (which may require more than 998 
octets, depending on the character set in use), and must not overflow 
buffers, ..." seems clear enough to me.
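
The character/octet mismatch is easy to see with a small Python sketch. The helper names below are made up for illustration, and the constant is simply the 998-octet figure quoted above; the sketch assumes the line is held as a Python string with the trailing CRLF already removed.

```python
MAX_LINE_OCTETS = 998  # line-length limit quoted above, excluding CRLF


def encoded_lengths(line: str) -> dict:
    """Report the size of one line in characters and in two encodings."""
    return {
        "characters": len(line),                    # Unicode code points
        "utf8_octets": len(line.encode("utf-8")),
        # UTF-16 uses 2 octets per code unit; "-le" avoids adding a BOM.
        "utf16_code_units": len(line.encode("utf-16-le")) // 2,
    }


def fits_line_limit(line: str) -> bool:
    """True if the line, encoded as UTF-8, stays within the octet limit."""
    return len(line.encode("utf-8")) <= MAX_LINE_OCTETS
```

For example, 998 copies of "a" occupy 998 UTF-8 octets and fit, while 998 copies of "é" occupy the same 998 UTF-16 code units but 1996 UTF-8 octets and do not. A buffer sized in one representation therefore says little about the other.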

Hope this helped....

Extremely. Thanks for explaining, too.

