John C Klensin wrote:
> While I agree, please be _very_ careful here.
Yes, as soon as it's hardwired in millions of applications
the chances of modifying it are lousy. The USEFOR WG has a lot
of fun with the IPv6 colons, because decades ago RfC 1036
said that the colon among other punctuation is a separator.
OTOH it's not always possible to dance around such issues.
At the moment the RfC 2821 syntax allows names consisting of
digits and dots only. RfC 821 still had an <a> (alpha) at
the beginning of all labels. RfC 2821 wants at least one dot,
RfC 821 didn't insist on a dot. Some fine tuning between
these positions should be okay for 2821bis. E.g. use the
<toplabel> as found in 3696 or 2396.
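As a rough sketch only (Python; the function names are made
up for illustration, nothing here is proposed 2821bis text),
the <toplabel> rule as written in RfC 2396 could be checked
like this:

```python
import re

# RFC 2396, section 3.2.2:
#   toplabel = alpha | alpha *( alphanum | "-" ) alphanum
_TOPLABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_toplabel(label: str) -> bool:
    """True if the whole label matches the RFC 2396 <toplabel> rule."""
    return _TOPLABEL.fullmatch(label) is not None


def last_label(domain: str) -> str:
    """Rightmost label, ignoring one trailing root dot (hypothetical helper)."""
    if domain.endswith("."):
        domain = domain[:-1]
    return domain.rsplit(".", 1)[-1]
```

With this, something like "1.2.3.4" is rejected on syntax
alone, because its last label does not start with a letter.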
It's not absolutely necessary for the purposes of 2821bis,
but it makes sense elsewhere. Like the <CRLF> in 2234bis,
where in theory a CR / ( [CR] LF ) would be good enough -
in practice it's better to stick to one simple convention.
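To illustrate the difference (a sketch in Python, with
hypothetical names; the lenient pattern is my reading of
CR / ( [CR] LF ), not text from 2234bis):

```python
import re

# Strict convention: CRLF = CR LF, nothing else ends a line.
_STRICT = re.compile(r"\r\n")

# Lenient reading of  CR / ( [CR] LF ): CR LF, a lone CR, or a lone LF.
# CR LF is tried first so that it counts as one line end, not two.
_LENIENT = re.compile(r"\r\n|\r|\n")


def split_lines(data: str, strict: bool = True) -> list[str]:
    """Split data at line ends under the strict or the lenient rule."""
    pattern = _STRICT if strict else _LENIENT
    return pattern.split(data)
```

The same input then splits differently depending on the rule,
which is the kind of ambiguity one simple convention avoids.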
> In some cases, that involved a lexical rule (e.g., "we
> know there are no TLDs more than four characters long
That's a possible interpretation of "extremely unlikely"
in RfC 1591. It turned out to be wrong. Hardwired lists
were worse.
> in actuality, because there is a firm prohibition on TLD
> names being all-digits, it is _lots_ safer to handle one
> of those as an error without looking it up than to assume
> that .invalid, which is invalid today, will continue to
> be invalid a year from now.
For invalid / test / example / localhost the reservation is
in BCP 32; if somebody "unreserves" invalid or example in
their LAN, that's a very bad idea.
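A minimal sketch of that distinction (Python; the function
name and the three result strings are invented for this
example, the reserved names are those listed in BCP 32):

```python
# TLDs reserved by RFC 2606 (BCP 32).
RESERVED_TLDS = frozenset({"test", "example", "invalid", "localhost"})


def classify_tld(domain: str) -> str:
    """Classify a domain by its top-level label without any DNS lookup.

    Returns "syntax-error" for an empty or all-digit TLD, "reserved" for
    a BCP 32 name, and "needs-lookup" for everything else.
    """
    if domain.endswith("."):
        domain = domain[:-1]          # ignore one trailing root dot
    tld = domain.rsplit(".", 1)[-1].lower()
    if not tld or (tld.isascii() and tld.isdigit()):
        return "syntax-error"         # all-digit TLDs are prohibited
    if tld in RESERVED_TLDS:
        return "reserved"             # a policy list, not a syntax rule
    return "needs-lookup"
```

Only the all-digit case is decided by syntax; the reserved
names are a list that has to be maintained separately.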
For the <toplabel> I can't tell at the moment which BCP or
standards track RfC guarantees it - 3696 is informational,
1123 and 2181 apparently (?) don't say it, 2396 is obsolete.
Oops, 1738 is still okay, waiting for new file / ftp / news
and nntp URI RfCs, then it will be moved to historic. All
other 1738-URL-schemes are already extracted into new RfCs.
If 2821bis doesn't define a <toplabel> it's either lost, or
it's hidden somewhere deep within 1035, where I won't find
it - in that case a fresh definition in 2821bis can't hurt.
Bye, Frank