ietf-smtp

Re: [ietf-smtp] [dispatch] BCP proposal: regular expressions for Internet Mail identifiers

2016-04-02 02:15:01
Hello...

On Mar 31, 2016, at 1:28 AM, Valdis(_dot_)Kletnieks(_at_)vt(_dot_)edu wrote:

On Tue, 29 Mar 2016 15:38:59 -0400, John C Klensin said:

Actually five families if you want to do a comprehensive job:

- 5321, possibly with nods to its predecessors
- 5322 which, as you point out, is not the same as 5321
   (and most, if not all, of the differences are
   intentional)
- the EAI family
- the base DNS spec family, as updated

And the corner cases when they don't agree. Consider
user@yoyo_dyne.com  - how long did *that* wart get debated? :)

Those of us who were around for RFC1341 can look at the following,
and weep, and ponder what failure modes the authors of this would have
managed if they had *both* an ABNF and a regexp provided to work from,
*even if they were semantically the same*...

% egrep 'X-Mail|alt' bad.mailfile
X-Mailer: IBM Notes Release 9.0.1FP2 SHF37 August 25, 2014
Content-Type: multipart/alternative; boundary="=_alternative  002EDD9148257F79_="
--=_alternative 002EDD9148257F79_=
--=_alternative 002EDD9148257F79_=
--=_alternative 002EDD9148257F79_=--
--=_alternative
%

(Hint:  You'll probably need a fixed-space font and a lot of pondering - the
above cost me close to 10 days of aggravation trying to figure out why one
vendor's support emails were consistently getting eaten by my procmail
filters, before I finally spotted it…)

Are you referring to the fact that the boundary has two spaces between
“_alternative” and “002E...”?

Easy enough to spot.

Not sure how easy it is to spot by eye, but this is why msglint utilities
exist: so you don't have to. The one available at:

   https://github.com/NedFreed/msglint

spots the problem instantly. (You're also supposed to be able to run that one
from www.apps.ietf.org, but that's been down for a long time despite several
requests to update its IP address.)
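
For anyone who wants to see the kind of check involved, here is a minimal
sketch in Python. It is not msglint and makes no claim about how msglint
works; the function name boundary_mismatches is made up for illustration, and
it assumes a quoted boundary parameter on an unfolded Content-Type header. It
compares the declared boundary with every body line that looks like it was
meant to be a delimiter.

```python
import re


def boundary_mismatches(raw_message: str) -> list[str]:
    """Return delimiter-like lines that do not match the declared boundary.

    Hypothetical helper: assumes boundary="..." is quoted and the header
    is not folded across lines.
    """
    m = re.search(r'boundary="([^"]*)"', raw_message, re.IGNORECASE)
    if m is None:
        return []
    declared = m.group(1)
    tokens = declared.split()
    if not tokens:
        return []
    # The only lines a MIME parser will accept as delimiters.
    expected = {"--" + declared, "--" + declared + "--"}
    # Lines starting with "--" plus the first word of the boundary were
    # presumably intended as delimiters, so compare them exactly.
    prefix = "--" + tokens[0]
    return [
        line
        for line in raw_message.splitlines()
        if line.startswith(prefix) and line not in expected
    ]
```

Run against a message whose header declares two spaces while the body uses
one, every delimiter line comes back as a mismatch; a well-formed message
returns an empty list.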

Not sure how that is germane to the topic at hand. I have not proposed writing
standardized regular expressions for boundary parts; it is not clear why such
a thing would be generally useful.

I don't see the relevance either.


This is why we can't have nice things....

Oh, and those who want to tempt the Lovecraftian regexp elder gods should
ponder the following:

http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html

(If that doesn't make Sean reconsider, *nothing* will... :)

Does not make me reconsider at all.

<tongue-in-cheek>
Half the work is done, just copy-and-paste!
</tongue-in-cheek>

A “deliverable” e-mail address is one that is deliverable with the modern
SMTP infrastructure, i.e., that complies with the modern formulations of RFC
5321 (and RFC 6531, for EAI). That class of e-mail addresses is interesting,
and much more straightforward to write a regular expression for.

As a practical matter, there's lots of software out there that uses bogus
regexps to check addresses. Anything that could possibly improve this situation
is a win in my book. And people have occasionally been known to look at RFCs
and use what they find there.

So unless someone can explain to me a likely scenario in which this is going to
make the general situation worse, I support the adoption and publication of
this document.

The other ones are also interesting and will be covered. However, I would
assume that non-mail processing implementers will want to focus on the modern
RFC 5321-compliant expression(s).

Personally, I'd settle for subaddresses working more than half the time with
web forms. I could not care less about support for obsolete syntax.
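
To make the subaddress point concrete, here is a rough Python sketch of the
sort of check a web form could use. It is an illustration only, not the
expression from the proposed document: it covers just the unquoted
(Dot-string) local part and ordinary letter-digit-hyphen domain labels from
RFC 5321, and deliberately leaves out quoted strings, address literals, and
EAI (RFC 6531). The names ATEXT, LABEL, MAILBOX and looks_deliverable are
made up for the example. Passing it says nothing about whether the mailbox
actually exists.

```python
import re

# Characters allowed in an unquoted local part; "+" is among them,
# which is what makes subaddresses such as user+tag legal.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# A letter-digit-hyphen domain label: no underscore, no leading or
# trailing hyphen.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"

MAILBOX = re.compile(
    rf"(?P<local>{ATEXT}+(?:\.{ATEXT}+)*)@(?P<domain>{LABEL}(?:\.{LABEL})*)"
)


def looks_deliverable(address: str) -> bool:
    """Syntax-only check for a simplified, ASCII-only RFC 5321 mailbox."""
    m = MAILBOX.fullmatch(address)
    if m is None:
        return False
    # Size limits from RFC 5321 section 4.5.3.1: 64 octets for the
    # local part, 255 for the domain.
    return len(m.group("local")) <= 64 and len(m.group("domain")) <= 255
```

With this sketch, user+tag@example.com is accepted, while user@yoyo_dyne.com
is rejected because of the underscore in the domain label, and a..b@example.com
is rejected because of the empty atom between the dots.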

                                Ned

_______________________________________________
ietf-smtp mailing list
ietf-smtp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf-smtp