Re: Gen-ART LC review of draft-ietf-eai-utf8headers-09.txt

2008-03-24 04:45:44
Spencer Dawkins wrote:
Hi, Harald,

Thanks for the quick feedback (Gen-ART reviewers like this because we 
can remember writing the review, and at least part of what we were 
thinking about :-)

Looks like mostly goodness. If we're in synch, I dropped it from this 


1.2.  Relation to other standards

  This document also updates [RFC2822] and MIME, and the fact that an
  experimental specification updates a standards-track spec means that
  people who participate in the experiment have to consider those
  standards updated.

Process: The ID Tracker is showing this draft in Last Call status, but I
can't find (in the archive or in my personal folders) any Last Call
announcement, which I was looking for, in order to check how Chris 
the downref at Last Call time - I'm expecting that it will be quite
entertaining. Has anyone else seen such an announcement on IETF 
Note: Intended status is Experimental.

The subject line of the Last Call was

Last Call: draft-ietf-eai-smtpext (SMTP extension for 
internationalized email address) to Experimental RFC

and covered 2 drafts; this may be why you did not find it.

Exactly right (I was scanning by subject). While I'm amazed that the 
downref isn't being called out in the Last Call announcement, I think 
RFC tracks and standards levels are so arbitrary that they are 
useless, so I'm not complaining - I was trying to figure out if there 
really had been a Last Call announcement sent, that's all.

I actually don't see a downref here - this is an Experimental updating a 
Draft Standard (or Full; I don't remember current status well). If 
anything, this is unusual as an upref, not a downref....

4.  Changes on Message Header Fields

  This protocol does NOT change the definition of header field names.

technical: I'm confused here. Is this text saying "does not change 
field names"? I would have thought this specification is exactly 
the definition of header field names...

It does not change the definition of header field NAMES (which remain 
ASCII), but changes the definition of header field BODIES (which used 
to be ASCII, but are now UTF-8).

  That is, only the bodies of header fields are allowed to have UTF-8
  characters; the rules in [RFC2822] for header field names are not
  changed.

And this sentence is saying that. How can we express this more clearly?

Ah. You filled in the missing piece for me here. Perhaps something like

"This protocol does NOT change the [RFC2822] rules for defining header 
field names. The bodies of header fields are allowed to contain UTF-8 
characters, but the header field names themselves must contain ASCII 

That seems like a good editorial suggestion to me. Thanks!
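
As an illustration of the distinction agreed on above, here is a minimal
sketch, not taken from the draft: the field name is held to the ASCII rules
of [RFC2822] (printable US-ASCII except the colon), while the field body is
decoded as UTF-8. The function name split_header_field is hypothetical, and
the sketch assumes the header field has already been unfolded onto one line.

```python
import re
from typing import Tuple

# RFC 2822 field names: printable US-ASCII (33-126) except the colon (58).
_FIELD_NAME = re.compile(r"[\x21-\x39\x3b-\x7e]+")


def split_header_field(raw: bytes) -> Tuple[str, str]:
    """Split one unfolded header field into (name, body).

    Hypothetical helper: the name must be ASCII, the body may be UTF-8.
    """
    name, sep, body = raw.partition(b":")
    if not sep:
        raise ValueError("header field has no colon")
    # Raises UnicodeDecodeError (a ValueError) if the name is not ASCII.
    name_text = name.decode("ascii")
    if not _FIELD_NAME.fullmatch(name_text):
        raise ValueError("invalid header field name")
    # The body is where UTF-8 characters are allowed.
    return name_text, body.decode("utf-8").strip()
```

A field such as "Subject: Blåbærsyltetøy" splits cleanly, while a field whose
name contains a non-ASCII character is rejected.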

  Interoperability considerations:  The media type provides
     functionality similar to the message/rfc822 content type for email
     messages with international email headers.  When there is a need
     to embed or return such content in another message, there is
     generally an option to use this media type and leave the content
     unchanged or downconvert the content to message/rfc822.  Both of
     these choices will interoperate with the installed base, but with
     different properties.  Systems unaware of international headers
     will typically treat a message/global body part as an unknown
     attachment, while they will understand the structure of a message/
     rfc822.  However, systems which understand message/global will
     provide functionality superior to the result of a down-conversion
     to message/rfc822.  The most interoperable choice depends on the
     deployed software.

technical: not sure what the last sentence actually means. "We don't know
what the most interoperable choice will be"? Text in the same paragraph says
both choices are interoperable. If that text is correct, I don't understand
what you're saying here.

Would it be better to say "the most useful choice"? It's likely to be 
the difference between a compliant MUA offering to dump the message 
to a file and displaying it as a message...

"The most useful choice" seems very reasonable. The current text seems 
to contradict other text in the same paragraph.

5.  Security Considerations

  Because UTF-8 often requires several octets to encode a single
  character, internationalized local parts may cause mail addresses to
  become longer.  As specified in [RFC2822], each line of characters
  MUST be no more than 998 octets, excluding the CRLF.

clarity: s/CRLF/CRLF, even when UTF-8 characters are being used/
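
To make the suggested clarification concrete, here is a minimal sketch of a
line-length check that counts octets rather than characters. The names
MAX_LINE_OCTETS and line_within_limit are hypothetical, and the sketch
assumes the line is available as a Python string that will be sent as UTF-8.

```python
MAX_LINE_OCTETS = 998  # limit from [RFC2822], excluding the trailing CRLF


def line_within_limit(line: str) -> bool:
    """Return True if the line fits in 998 octets once encoded as UTF-8."""
    octets = line.encode("utf-8")
    # The CRLF terminator does not count toward the limit.
    if octets.endswith(b"\r\n"):
        octets = octets[:-2]
    return len(octets) <= MAX_LINE_OCTETS
```

With this check, 998 ASCII characters pass, but 500 two-octet characters
(1000 octets) do not, even though the character count is well under 998.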

  Because internationalized local parts may cause email addresses to be
  longer, processes which parse, store, or handle email addresses or
  local parts must take extra care not to overflow buffers, truncate
  addresses, exceed storage allotments, or, when comparing, fail to use
  the entire length.

technical: this is great advice, but I don't understand how UTF-8 changes
the situation. If you aren't changing the 998-octet requirement, anything
that breaks for UTF-8 would also break for ASCII headers with the same octet
count.

If someone uses another representation internally (for instance 
UTF-16), and has a 998-character buffer, that will sometimes fit into 
998 octets of UTF-8, and sometimes not. The same goes in the other 
direction.... I'm sure others will think of other cases.
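
The mismatch described above can be shown with a short sketch that measures
the same text three ways. The function name sizes is hypothetical, and
Python's len() on a string counts code points, which is only one possible
meaning of "character".

```python
def sizes(text: str) -> dict:
    """Measure a string as code points, UTF-8 octets and UTF-16 code units."""
    return {
        "characters": len(text),
        "utf8_octets": len(text.encode("utf-8")),
        # utf-16-le emits no byte order mark; each code unit is two octets.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }
```

For 998 ASCII letters all three numbers are 998. For 998 copies of "å" the
UTF-16 count is still 998, but the UTF-8 encoding needs 1996 octets, so a
buffer sized in one representation says little about the other.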

Thanks for the clear explanation here. This is headed in the right 
direction - I wasn't impressed with guidance that says "take extra 
care", but saying "must accommodate 998 characters (which may require 
more than 998 octets, depending on the character set in use), and must 
not overflow buffers, ..." seems clear enough to me.

I think it's more like "must accommodate 998 octets, and not send more 
than 998 octets, even though the relationship between this number and 
the number of UTF-8 characters is not a simple one". I see that Klensin 
has picked up on this for 2821, too.

Thanks for the review!

IETF mailing list