Since this is a substantive comment on a document that is in
Last Call for Standards Track, I'm posting the note to the IETF
list. Since I use different addresses for the SMTP list and the
IETF one, I don't know when or if this will appear on the latter.
With the exception of the last point, this note addresses
Section 6.3 of this document (on Internationalization)
exclusively. I will post another note before the cutoff to
address other issues.
--On Friday, February 27, 2009 15:49 -0700 "J.D. Falk"
> Dave CROCKER wrote:
>> Just to make this explicit, SM's note is to the rest of you,
>> not to me. I asked him to solicit comments from the group,
>> since I have only tried to make the document parrot what I8N
>> folk say. My own comprehension of the topic remains muddled.
>
> Mine is similarly muddled, but I wonder...is it appropriate to
> reference a recent (and presumably temporary) experiment
> alongside documenting the current state of the art?
> Particularly when email-arch is unlikely to be updated for
> another five years?

Well, while my comprehension may be equally muddled, it isn't
for lack of trying and involvement.
A few observations:
(1) The use of international (non-ASCII) characters in message
bodies and content has been a done deal since MIME came in.
That should probably be said explicitly. If we were redoing
that work today, we would probably strongly recommend the use of
UTF-8, rather than alternate encodings, and might require a
charset parameter on all instances of text/plain. But no WG
has ever been willing to move on either of those two points.
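
For concreteness, here is a minimal sketch of what an explicitly
labelled UTF-8 text/plain part looks like, using Python's
standard-library email package. The body text and variable names
are assumptions made up for illustration; nothing here is taken
from the document under review.

```python
from email.message import EmailMessage

# Hypothetical body text containing non-ASCII characters.
body = "Grüße aus Zürich"

msg = EmailMessage()
# Build a text/plain part and state the charset explicitly, rather
# than leaving the receiver to guess at it.
msg.set_content(body, subtype="plain", charset="utf-8")

content_type = msg.get_content_type()   # media type without parameters
charset = msg.get_content_charset()     # value of the charset parameter

# Show the full Content-Type header, including the charset parameter.
print(msg["Content-Type"])
```

The point is only that the charset parameter travels with the body
part, so the content itself never has to be ASCII.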
(2) Given the requirement for an Internationalization
Considerations section, which this document honors with Section
6.3, handing the substance of that section off to a
non-consensus document (IMC's Mail-I18N) in what is clearly a
normative reference despite being listed as an informative one
(the previous sentence has essentially no content other than to
indicate that i18n is an "ongoing challenge") seems dubious and
(3) There are some useful things that can be said that are, at
this point, settled parts of the architecture or at least of the
relevant protocols. As suggested above, one is that the content
matter is settled and that UTF-8 is the winner in the character
set wars (although other things are certainly around). Another
is that work on i18n of email headers and addresses is
progressing, but that, until that work completes, IDNs can be
expressed in ACE form (with appropriate references). I would
personally avoid making anything resembling a normative
reference to the experimental documents, largely because they
introduce more new syntax and terminology that would then need
to be discussed.
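
To illustrate what "expressed in ACE form" means in practice, the
following is a rough sketch using Python's built-in "idna" codec,
which implements the IDNA 2003 ToASCII and ToUnicode operations
(RFC 3490). The domain name and mailbox are hypothetical examples
chosen for illustration only.

```python
# Hypothetical internationalized domain name, for illustration only.
unicode_domain = "bücher.example"

# The "idna" codec converts each label to its ASCII-compatible
# encoding (the "xn--" form); all-ASCII labels pass through unchanged.
ace_domain = unicode_domain.encode("idna").decode("ascii")
print(ace_domain)  # xn--bcher-kva.example

# The ACE form is pure ASCII, so it fits the address syntax as
# currently defined. The local part "info" is an assumption.
address = "info@" + ace_domain

# Decoding the ACE form recovers the original Unicode name.
round_trip = ace_domain.encode("ascii").decode("idna")
```

Only the domain part is covered by this approach; non-ASCII local
parts are what the ongoing header and address work is about.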
(4) It is a nit, but "Because its origins date back to the use
of ASCII" leaves an impression that is not strictly correct. It
would be accurate to say that its origins date back to the time
when even the use of ASCII was controversial. However, the
origins have nothing to do with anything: the email architecture
of today is defined in ASCII terms. RFCs 5321, 5322, and 2045ff
are written in ASCII terms and require ASCII (except in
body-part content), as are most of the other protocols
referenced. It would be far more accurate to simply say that we
have an ASCII-based protocol suite that is gradually being
adapted to accommodate non-ASCII elements where those are
appropriate, with the current thread and model starting with the
introduction of text/ content-types in MIME.
(5) The document needs to be updated to reflect current
references. In particular, RFCs 5321 and 5322 were published
almost six months ago. They also contain some slight
adjustments to terminology and this document should be carefully
checked to be sure its terminology is still consistent with them.