ietf-smtp

Re: Strict RFC x821 Compliant: HELO/EHLO

2005-07-05 09:29:42




--On Sunday, 03 July, 2005 10:12 -0700 Claus Assmann
<ietf-smtp(_at_)esmtp(_dot_)org> wrote:

See the discussion in RFC 1123 section 2.1: "a valid host
name can never have the dotted-decimal form #.#.#.#, since at
least the highest-level component label will be alphabetic."

Ok, let's try this again: The statement was:

"This would be an illegal syntax per the ABNF:"

This statement is wrong. The parameter is _syntactically_
valid (as I pointed out before by quoting the grammar).  Maybe
it would be useful to change the syntax to reflect that "the
highest-level component label will be alphabetic." As long as
that isn't the case, the command above is correct according to
the grammar given in RFC 2821 (that's why my software doesn't
reject it in contrast to the _syntactically_ invalid MAIL
command that triggered this HELO/EHLO discussion).

Short answer: "yes".   And changing the syntax would, IMO, be a
terrible idea.  The mail syntax is unambiguous and doesn't
require any trick parsing or whole-string inspection: as Frank
(I think) pointed out in a later note, "no brackets, no domain
literal" and anything else is construed as a domain name.  If
there are additional constraints on the latter, they are imposed
from somewhere else (RFC 1123 in this case).  Similar comments
apply to that prefix string for non-IPv4 literals: essentially,
simple parsers with little or no look-ahead are, for SMTP at
least, considered good while tricky heuristics to deduce what is
coming or happening are considered unfortunate and/or risky.
Unless there is a good reason for the risk, why incur it?
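
As a rough illustration of the "no brackets, no domain literal" rule, the
following Python sketch classifies a HELO/EHLO argument by looking only at
the square brackets. The function name and the "malformed" result are
assumptions made for this example; nothing here is taken from RFC 2821
itself, and no validation of the contents is attempted.

```python
def classify_helo_argument(arg: str) -> str:
    """Classify a HELO/EHLO argument by the bracket rule alone.

    Hypothetical helper: it decides *which kind* of thing the argument
    is, not whether its contents are valid.
    """
    if arg.startswith("["):
        # An opening bracket commits us to an address literal; it must
        # be closed and must contain something.
        if arg.endswith("]") and len(arg) > 2:
            return "address-literal"
        return "malformed"
    # No brackets: construed as a domain name, whatever it looks like.
    return "domain"
```

Note that an unbracketed 192.0.2.1 comes out as "domain": the parser needs
no look-ahead, and any objection to such a name has to come from a rule
outside the SMTP grammar.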



--On Sunday, 03 July, 2005 14:08 -0400 Hector Santos
<hsantos(_at_)santronics(_dot_)com> wrote:

Section 4.1.3 Address Literals specifically defines the format
for each of the above.  For the sake of simplicity, using IPv4:

      IPv4-address-literal = Snum 3("." Snum)
      Snum = 1*3DIGIT  ; representing a decimal integer
            ; value in the range 0 through 255

This defines an alphanumeric dotted string format.

Actually, a dotted-string format whose fields (between the dots)
are strictly digits.  I'd describe this as a dotted-numeric
string, not alphanumeric at all.  This is important below...
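
To make the "strictly digits" point concrete, here is a minimal Python
sketch of a check for the text inside the brackets against the
IPv4-address-literal production quoted above. The function name is an
assumption for this example. The 0 through 255 range appears only in the
ABNF comment, so it is checked separately from the digit pattern.

```python
import re

# Snum = 1*3DIGIT (ASCII digits only, so [0-9] rather than \d)
SNUM = re.compile(r"[0-9]{1,3}")


def is_ipv4_address_literal(text: str) -> bool:
    """Check text against Snum 3("." Snum), plus the 0..255 range."""
    fields = text.split(".")
    if len(fields) != 4:
        return False
    # Every field must be one to three digits and at most 255.
    return all(SNUM.fullmatch(f) and int(f) <= 255 for f in fields)
```

Any alphabetic character in any field makes the check fail, which is why
the format is dotted-numeric rather than alphanumeric.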
 
But 2821 does not specifically define or say the
address-literal could not be an A record in DNS.

Huh?  I'm not sure what you are getting at here, but you
probably need to look at 4.1.2, where it defines address-literal
(the only context in which the above can appear) as

   address-literal = "[" IPv4-address-literal /
                     IPv6-address-literal /
                     General-address-literal "]"

Please note those nasty [...].  So, as above, address literals
cannot be domain names.   And domain names cannot be address
literals.  Full stop.


As long as that isn't the
case, the command above is correct according to the grammar
given in RFC 2821 (that's why my software doesn't reject it
in contrast to the _syntactically_ invalid MAIL command that
triggered this HELO/EHLO discussion).

You are technically correct per 2821, but 2821 *UPDATES* 1123,
hence 2821 overrides various parts in 1123, but other items in
1123 not updated by 2821 still take hold.

Sigh.  It is part of a completely different thread, but this is
yet another example of why category-concepts like "updates" are
insufficient and something like ISDs are needed.

So 1123 covers the definition of what is considered a valid
TLD based domain vs. a dotted-address format.

In a section on the DNS, not SMTP.  Essentially, that section
governs what TLDs can be created without causing problems.  SMTP
is (and has always been) more conservative, because it doesn't
rely on the form/syntax of a domain name to tell it from a
literal.

Remember, 1123 came first before 2821 so you will find systems
following its guidelines as well.

It is perfectly rational to prohibit all-digit domain names in
an SMTP system by virtue of knowing about the 1123 rule.  But,
as far as 2821 is concerned, that is a semantic restriction, not
a restriction imposed by the SMTP-based syntax.   If one does
apply that prohibition, rather than just looking the thing up,
the cautions of RFC 3696 apply -- while I'd predict that ICANN
wouldn't get stupid, TLD names could, in principle, have only
one alphabetic character and still not violate the intent of the
1123 rule.
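
For a system that does choose to apply the 1123 rule as a semantic check,
one cautious reading can be sketched in Python as follows. This is an
assumption about how such a check might be written, not something either
RFC specifies, and the function name is hypothetical. It flags a name only
when its highest-level label consists entirely of digits, and deliberately
does not require that label to be entirely alphabetic.

```python
def top_label_is_all_digits(domain: str) -> bool:
    """Return True if the highest-level label is made up only of digits.

    Hypothetical semantic check, applied after the SMTP syntax has
    already classified the argument as a domain name.
    """
    # Ignore a trailing root dot, then take the rightmost label.
    labels = domain.rstrip(".").split(".")
    top = labels[-1]
    return top != "" and all(c in "0123456789" for c in top)
```

Under this reading 192.0.2.1 is flagged, while a name whose top label has
even a single alphabetic character is not.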

--On Sunday, 03 July, 2005 18:48 -0400 Hector Santos
<hsantos(_at_)santronics(_dot_)com> wrote:

So we have the domain-literal,  for 2821bis, we need to make
sure it clearly defines what are all the possibilities.

It does.   The key is the square brackets, which define what is
or is not an address-literal.  Anything else that appears in one
of those positions is a domain name.  Period.  Then there are
restrictions on address-literals and restrictions on domain
names, but they are, as nearly as possible, high-level syntax
restrictions only.

Claus raised a valid point, how do you know it is not a
domain?   I think 1123 clears that up.   Something for John to
include in 2826bis.

I assume you mean 2821bis.  But, if I have my way, no.   See
above.

I agree with John. RFC 2821 is right to specify a general syntax for names and
leave semantic restrictions out.

And yes, I am tracking these discussions although reluctantly.

Ditto.

                                Ned