Re: [ietf-smtp] [dispatch] BCP proposal: regular expressions for Internet Mail identifiers

2016-03-29 15:14:29

--On Tuesday, March 29, 2016 11:31 -0700 Sean Leonard
<dev+ietf(_at_)seantek(_dot_)com> wrote:

> I hope that this does go to show that raw/blind application of
> the ABNF in RFC 5321/5322 is not sufficient.

Sorry, but the ABNF in 5321/2821 and the BNF in 821 have _never_
been sufficient to define the grammar in complete and closed
form.  They set some structure and bounds on the grammar, but the
prose explanations and details are at least as important.  If
you don't think the RFC makes that clear, please file an erratum
-- it might even motivate me to take 5321bis out of long-term
storage (although that is definitely not a promise -- the
aggravation costs are provably very high).
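
As a rough illustration of the point above (not something taken from
the RFCs' own text or from this thread), the sketch below checks a
local-part against a regular expression derived from the dot-atom
grammar and then, separately, against the 64-octet local-part limit
that RFC 5321 states in prose (Section 4.5.3.1.1). The function names
are hypothetical, and quoted-string local-parts are deliberately left
out of scope.

```python
import re

# Dot-atom form of a local-part, using the atext character set from RFC 5322.
# Assumption: quoted-string local-parts are out of scope for this sketch.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
_DOT_ATOM = re.compile(rf"{_ATEXT}+(?:\.{_ATEXT}+)*")

# Stated in prose in RFC 5321, Section 4.5.3.1.1, not in the ABNF.
MAX_LOCAL_OCTETS = 64


def grammar_ok(local_part: str) -> bool:
    """True if the string matches the dot-atom grammar alone."""
    return _DOT_ATOM.fullmatch(local_part) is not None


def grammar_and_prose_ok(local_part: str) -> bool:
    """True if the string matches the grammar and respects the octet limit."""
    within_limit = len(local_part.encode("utf-8")) <= MAX_LOCAL_OCTETS
    return grammar_ok(local_part) and within_limit
```

A 65-character run of "a" satisfies grammar_ok() but fails
grammar_and_prose_ok(), which is the kind of gap a grammar-only
validator would miss.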

> The danger is real, and noting it is appreciated. It's worth
> considering that we are not talking about one spec, but two
> families of specs (the email specs and the DNS specs) that we
> need to summarize and put together.

Actually five families if you want to do a comprehensive job:

 - 5321, possibly with nods to its predecessors
 - 5322 which, as you point out, is not the same as 5321
        (and most, if not all, of the differences are
 - the EAI family
 - the base DNS spec family, as updated
 - the IDNA family (2003, 2008, and maybe assorted
        mapping and deviant (i.e., encouraging something that
        violates IDNA2008)

> It turns out that the domain part is 50% of an email address
> but generates perhaps 85% of the complexity. The quoting rules
> for local-part are arcane but at least are fairly systematic.
> There is a question about how much it's the responsibility of
> an "email address validator" to validate the domain part.

If one believes in IDNA2008 --which there is some obligation to
do until it is replaced or deprecated-- then there are actually
some very clear validation rules.

> I do not wish to answer this question in isolation. On the one
> hand, it's usually a DNS library's "job" to answer that (not
> an email library per se); on the other hand, if it's not a
> good domain name, the email address is literally pointing to
> an imaginary place. The answer is, I suppose, "it depends" and
> the Best Current Practice is to document the issue so
> qualified engineers can make sound judgments about what to do.
> I would analogize this to a US Postal Service validator,
> validating two-letter state-and-political-division
> abbreviations. Every state-or-political-division has a
> two-character alphabetic code: enforcing the two-character
> requirement and the alphabetic requirement in a validator
> would be separately reasonable if the relevant USPS standards
> promise the same. However, avoiding repeated characters (AA,
> BB, CC) seems to be more of a registration
> practice/requirement, so a validator need not impose such a
> requirement if the relevant standards do not call for it.

Right.  And that isn't actually a common practice at the second
or third level or below in most domains.  There, within other
constraints that are documented, the rule tends to be either
"you get whatever you want and can pay for" or "you get whatever
you can convince the zone admin is reasonable".
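
To make the syntax-versus-registration-policy distinction concrete,
here is a minimal sketch of a purely syntactic domain check. It
assumes the letter-digit-hyphen (LDH) hostname rule from RFC 952 as
relaxed by RFC 1123, plus the 63-octet label limit from the base DNS
specs. It does not look anything up in the DNS, does not check the
overall name length, and does not apply any IDNA rules (an "xn--"
label is treated as an ordinary LDH label). The function name is
hypothetical.

```python
import re

# One LDH label: letters, digits, and hyphens; no leading or trailing hyphen;
# between 1 and 63 characters long.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def domain_syntax_ok(domain: str) -> bool:
    """Syntax-only check: every dot-separated label must be a valid LDH label.

    Registration policy (which labels a zone admin will actually delegate)
    is intentionally not modelled here.
    """
    labels = domain.split(".")
    return all(_LDH_LABEL.fullmatch(label) is not None for label in labels)
```

A label such as "aa" passes, since nothing in the syntax forbids
repeated characters; whether a given zone would ever register it is a
separate question that this check leaves alone.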

