ietf-smtp

Re: [ietf-smtp] Should we update an RFC if people refuse to implement parts of it ?

2021-05-30 20:19:43
Gentlemen,

An observation and a question...

--On Sunday, May 30, 2021 15:49 -0700 Ned Freed
<ned(_dot_)freed(_at_)mrochek(_dot_)com> wrote:

On Thu, May 27, 2021 at 12:59:53AM +0800, Jiankang Yao wrote:

I recognize the distinction while also realizing that, as
you know, things often leak.  My recollection (if
Jiankang is following this, his memory is probably better
than mine) is that the WG explicitly discussed the issue
and concluded that U-labels were a better idea than
A-labels.

Yes, I think so. The EAI WG discussed this issue.  Section
3.7.3, Trace Information, encourages use of the UTF-8 form.
One reason, I think, is that trace information will be put
into the header for humans to read.

But, and this is crucial, the human reading the trace
information is rarely either the sender or the ultimate
recipient of the message, who are generally presented with a
subset of the header fields ("To", "Cc", "Date", "Subject"
...).  Examination of trace headers is far more likely to be a
task for a mail system administrator.  They're used in abuse
reports and the like, and a uniform representation is more
important than familiarity to the community of readers of
some given language.

And what the admin usually wants to do is either a comparison
or check the domain with the DNS in some way. So an A-label
can be more convenient.

And in the unlikely event an admin needs to translate the
A-label to a U-label, there is an abundance of tools that
can do it.

I may live in a different world, or at least with different
users, than the two of you, but I quite often see users pointed
to trace fields in order to make estimates of the validity of a
message.  And that brings me close to the reasoning I think the
WG used more generally: that it was better to stick to "native
character" forms (in this case, U-labels), especially when there
was a possibility of a full mailbox name with a UTF-8 local part
and a domain part containing IDN labels.  That is obviously not
the case for clauses of "Received:" fields other than "for",
but the WG may well have extrapolated from that principle to
this particular case (I don't remember how much detailed
attention the particular case got).  The other argument for
U-labels is that, precisely as Ned points out, if an admin needs
to translate an A-label, there are many tools and admins are
likely to know where to find them.  By contrast, if a user is
presented with an A-label, the reaction is at least as likely to
be "WT<rude word>" as "I have a tool handy that does that".
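
As an illustration of how mechanical the A-label/U-label translation
is, here is a minimal sketch using the "idna" codec that ships with
Python.  Assumptions: the built-in codec implements the older IDNA2003
rules rather than IDNA2008 (where the A-label/U-label terms are
defined), the function names are made up for this note, and
"bücher.example" is only a stand-in domain, not one from this thread.

```python
# Sketch only: Python's built-in "idna" codec follows IDNA2003.
# Function names here are hypothetical, chosen for illustration.

def to_a_label(domain: str) -> str:
    """Return the ASCII (xn--) form of a domain name."""
    return domain.encode("idna").decode("ascii")

def to_u_label(domain: str) -> str:
    """Return the Unicode form of a domain given in ASCII form."""
    return domain.encode("ascii").decode("idna")

def same_domain(a: str, b: str) -> bool:
    """Compare two domains after normalizing both to the ASCII form."""
    return to_a_label(a).lower() == to_a_label(b).lower()

if __name__ == "__main__":
    print(to_a_label("bücher.example"))
    print(to_u_label("xn--bcher-kva.example"))
    print(same_domain("bücher.EXAMPLE", "xn--bcher-kva.example"))
```

Normalizing to the A-label before comparing is one way to get the
uniform representation an admin wants, whichever form appears in the
trace field.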

A different piece of the same story is that, if I had much more
confidence that authors of MUAs and other mail access tools
would think carefully about these subtle issues and get them
right, I'd say that, given the dual relationship between
A-labels and U-labels, it makes no difference which of the two
are used on the wire -- it is all a presentation issue.  On the
other hand, my experience of the last decade or so has given me
no such confidence.

...
Sure, if the text is Russian, some Latin-based alphabet or at
a stretch Greek, I can more easily distinguish one U-label
string from another than an A-label form like
"xn--b1adqpd3ao5c.org", ... and yet I'd much rather see
A-labels in trace headers than Arabic or Chinese.  The text
in 3.7.3 is not something I'm inclined to implement.

And you might feel differently if, instead of reading primarily
Greek, Latin, and Cyrillic, your primary familiarity was with
Chinese or Devanagari or Mongolian or Thai or Arabic or anything
else that is very different from those three very closely
related scripts.  And your ability to distinguish U-labels in
the three scripts you cite may be severely impaired if someone
is deliberately trying to deceive, given the number of
characters that are reused among the three scripts and are, in
many type styles, almost indistinguishable.

And, yes, the conclusion from the above is that sometimes
A-labels are better and sometimes U-labels are better and that
sometimes it depends on the reader.

Actually, it depends on the A-labels. Because of the
compression involved A-labels often emphasize small
differences that may be difficult to see in a crap monospaced
font for a non-Latin script, even one you're familiar with.

Exactly.
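
To make that point concrete, the following sketch compares an
all-Latin name with a look-alike in which one letter is Cyrillic.
Assumptions: "paypal.example" is a made-up example, not a name from
this thread; the helper names are hypothetical; and the conversion
uses Python's built-in IDNA2003 codec, so it is an approximation of
what an IDNA2008-aware tool would do.

```python
import unicodedata

# Hypothetical look-alike pair: the second letter of `mixed` is
# U+0430 CYRILLIC SMALL LETTER A, which many fonts render like "a".
latin = "paypal.example"
mixed = "p\u0430ypal.example"

def to_a_label(domain: str) -> str:
    """ASCII form via Python's built-in (IDNA2003) codec."""
    return domain.encode("idna").decode("ascii")

def scripts_used(label: str) -> set:
    """Rough script guess from the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

if __name__ == "__main__":
    for name in (latin, mixed):
        first_label = name.split(".")[0]
        print(repr(name), to_a_label(name), sorted(scripts_used(first_label)))
```

The two U-labels may be visually identical, while the A-label of the
mixed-script name starts with "xn--" and cannot be mistaken for the
plain ASCII one.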

Specifying the use of U-labels in the "from" and "by" clauses
rather looks like a bad judgement call, rough consensus or
not.  Until the Protocol Police show up, I'm sticking with
A-labels. :-(

Me too.

IMO, wfm.  The text says "SHOULD", the two are easily
convertible by anyone who knows where to find the tools (and
that will far more often be admins than end-users, even though
I'm concerned about the latter too) and, for "from" at least,
simply copying the EHLO field (which necessarily uses A-labels
for any IDN label fields) seems to me to be a fairly strong
argument.  But that isn't the question in the subject line that
started this thread: should we open the document and change it,
perhaps to say "SHOULD...A-label"?  Well, I'm not convinced that
it is worth the trouble, especially given all of the other
issues we are having getting SMTPUTF8 universally understood and
deployed.

On the other hand, and here comes the question: There are
proposals floating around that would define new header fields
that would reflect, include, or depend on, different sorts of
forward-pointing and reverse-pointing addresses.  Should we be
taking the position that none of those should move forward
unless they explicitly address what should be done when those
fields are not traditional ASCII ones?

best,
  john
