Re: Standardized tag (was: draft-klensin-rfc2821bis-10: ABNF discuss)


John,

Thanks for the reminder -- I just remembered why we concluded
that specifying the exact characters that might be used in
the tag was not necessary...

--On Thursday, 03 July, 2008 17:12 -0400 John Leslie
<john(_at_)jlc(_dot_)net> wrote:

...

Is there any problem with defining a rule for what
characters are allowed in a future tag for identifying literal
address formats that is large enough to motivate living
with the interoperability issue?


   I see no problem -- it's mostly a matter of finding a round
tuit.

A point raised during the discussion was whether something is
valid because it is permitted by the syntax.


   Obviously, the answer is "maybe". ;^)

   Even purest garbage can be "valid". And, a lot of the time,
we don't even need to process the "address" for the mail
system to function. At first blush, I'd say it's sufficient to
specify how to isolate the string which constitutes the
"address". We can punt to other document(s) how to make sense
of it.


And, for that purpose, the rule is "[" ... "]" or, if one
prefers regular expressions (and thinks that backslash is the
escape), 
    \[.*\]
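
As a rough illustration of that isolation step, here is a minimal
Python sketch. It uses the regular expression exactly as written
above; the function name isolate_literal and the sample command
lines are hypothetical, chosen only for the example.

```python
import re

# The pattern from the message: a "[" followed by anything, then "]".
# Note that ".*" is greedy, so the match runs from the first "[" to
# the last "]" on the line.
LITERAL_RE = re.compile(r"\[.*\]")


def isolate_literal(text):
    """Return the bracketed string (brackets included), or None."""
    match = LITERAL_RE.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for line in ("RCPT TO:<user@[192.0.2.1]>",
                 "MAIL FROM:<user@[IPv6:2001:db8::1]>",
                 "RCPT TO:<user@example.com>"):
        print(repr(line), "->", isolate_literal(line))
```

Nothing here tries to make sense of what is inside the brackets;
it only finds the string. A stricter variant could exclude "]"
from the interior, for example \[[^\]]*\], which would stop at
the first closing bracket instead of the last.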

I still don't see the suggestion to restrict the tag to an
alphanumeric string starting with a letter as harmful and think
it would be a good idea.   2821 defined the tag as an "ldh-str"
and neither Tony nor I can reconstruct why we changed that -- in
retrospect, I suspect it just got lost during fine-tuning of
some other bit of metasyntax, which may actually reinforce the
view that such tampering is a bad idea.   As a minimal
alternative, one could be explicit that the tag not contain "]".
But, as far as interoperability problems are concerned, if
(deliberately not using ABNF to avoid getting sucked into
nit-picking)

    "[" digit ...
is an IPv4 address, and
    "[" "IPv6:" ...
is an IPv6 address, anything else is unacceptable until and
unless the standard is updated to reflect a different type of
literal.   Given the way that SMTP is specified, that means
there is no interoperability problem.   Address literals are
permitted in very few contexts (MAIL and RCPT commands and, with
some restrictions, EHLO and Trace fields).   If, e.g., "[xyz
..." were encountered in one of them, the result would be either
a syntax error or an address rejection, which are fine.  Whether
one could find the end of the tag, or even the end of the
address, in the command has to do with what text went into the
string associated with the reply code, but those strings are not
specified normatively in any event.  And trace fields are a
non-issue because there is an explicit warning against machine
interpretation of them.
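
The dispatch described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not
a parser for the actual grammar: it looks at the prefix alone and
does not validate the address that follows. The name
classify_literal and its return values are hypothetical, and
matching "IPv6:" case-insensitively is an assumption based on ABNF
string literals being case-insensitive.

```python
import string


def classify_literal(literal):
    """Classify a bracketed address literal by its prefix only.

    Returns "ipv4", "ipv6", or None. None stands for "unacceptable",
    which a server would turn into a syntax error or an address
    rejection.
    """
    if not (literal.startswith("[") and literal.endswith("]")):
        return None
    body = literal[1:-1]
    # "[" digit ...  is treated as an IPv4 address literal.
    if body != "" and body[0] in string.digits:
        return "ipv4"
    # "[" "IPv6:" ...  is treated as an IPv6 address literal.
    # Case-insensitive comparison is an assumption (see lead-in).
    if body[:5].lower() == "ipv6:":
        return "ipv6"
    # Any other tag is rejected until the standard defines it.
    return None


if __name__ == "__main__":
    for sample in ("[192.0.2.1]", "[IPv6:2001:db8::1]", "[xyz:foo]"):
        print(sample, "->", classify_literal(sample))
```

The point of the sketch is that an unknown tag such as "[xyz ..."
never needs to be parsed further: the caller only has to know that
it is neither of the two defined forms.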

Put differently, given the way SMTP and its syntax work, not
being able to correctly interpret whatever follows an
unintelligible address literal in a command line is really of no
particular consequence.

So, whatever the issue here and whatever tuning improvements
might be possible, I don't see an interoperability problem, even
a potential future one.

One might have to go back all the way to 821 to fully
understand the rationale of what's in this draft.  And even
then, it's not that obvious unless you talk to the people
who actually wrote the text.


   One shouldn't need to "fully understand the rationale" in
order to use this spec. But one _should_ find sufficient
specification to know how to parse the constituent parts.


Indeed.  But part of the question has been about the rationale
for the decisions.

     john