Re: Revisiting RFC 2822 grammar (scratching the surface of Received issues)
2004-01-16 04:42:12
Pete Resnick wrote:
> On 1/15/04 at 7:09 PM +0100, Arnt Gulbrandsen wrote:
>> 5. obs-received merits discussion on its own. RFC 2822 says
>>    obs-received = "Received" *WSP ":" name-val-list CRLF
>> which Bruce changes to
>>    obs-received = "Received" *WSP ":" [CFWS] name-val-list [ ";"
>>                   [CFWS] obs-date-time ] CRLF
>> An incompatible change, but perhaps correct.
> Yes, I think Bruce fixes a bug in 2822 here.
Received has issues, both w.r.t. 2822 and 2821.
First, a historical overview: Received was first specified in RFC 821,
where it is also known as a "time stamp line". There have been, and
still are, discrepancies between the 821/822 and 2821/2822 definitions
of the field body. Received is one of the trace fields and, as noted in
RFC 1123, was initially examined primarily by hand. 1123 permitted
adding information actually useful for tracing (i.e. the peer IP
address, as opposed to the HELO string, which is far too easily forged),
but gave no syntax for doing so [common practice has been to include it
in a comment].
Nowadays, tracing Received fields by hand is far too labor-intensive.
Unfortunately, due to the exponential growth of spam, it is necessary.
We now have the unfortunate situation where the required information
(at least in 2821) includes the easily-forged HELO/EHLO string, and the
useful not-so-easily-forged connection information is relegated to an
optional construct. Worse, in 2821 that optional construct is specified
as some sort of structured comment; indeed, it is indistinguishable
from a comment.
What is really desirable is something that has reliable trace
information in machine-readable form, perhaps with the easily-forged
information relegated to comments. But that's another discussion...
Prior to 2822, a time stamp line (a.k.a. Received field) always required
a time stamp. 2822 permits a time stamp line w/o a time stamp, which is
an oxymoron. I don't personally like that (IMO the time stamp should be
mandatory), but that's what 2822 says, and that's part of what is
included in the syntax above.
Another issue, not included above, is an incompatibility which has
crept in. 821/822 required SP before the semicolon which delimits the
start of the date-time. However, most MTAs incorrectly omit that space
[in some, such as sendmail, that can be easily rectified via a run-time
configuration patch; others, such as qmail, have that error hard-coded].
2821 requires CFWS immediately before the semicolon, but 2822 makes it
optional. Given 821/822/2821's requirements, I'd be inclined to revise
2822 to make at least CFWS mandatory before the semicolon when
generating a message, and given past (clearly wrong, but quite
widespread) practice, I'd require being able to parse a Received field
w/o CFWS before the semicolon (via obs- syntax).
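To make the "strict when generating, liberal when parsing" idea concrete,
here is a minimal Python sketch of the parsing side. It is not taken from
any MTA or from either RFC; the function name split_received and its
return shape are made up for illustration. It finds the last semicolon
outside comments and quoted strings, splits there, and reports whether
white space or a comment came immediately before it, so a caller can
accept the field either way and still notice the omission.

```python
def split_received(body):
    """Return (name_val_text, date_text, cfws_before_semicolon).

    The split point is the last ";" outside comments and quoted strings.
    date_text is None when no such semicolon exists.
    """
    depth = 0          # comment nesting level
    in_quote = False
    last_semi = -1
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and (in_quote or depth):
            i += 2     # quoted-pair: skip the escaped character too
            continue
        if in_quote:
            in_quote = ch != '"'
        elif ch == '"' and depth == 0:
            in_quote = True
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
        elif ch == ";" and depth == 0:
            last_semi = i
        i += 1
    if last_semi < 0:
        return body.strip(), None, False
    head = body[:last_semi]
    # White space, or the ")" that closes a comment, counts as CFWS here.
    cfws = head[-1:] in (" ", "\t", "\r", "\n", ")")
    return head.strip(), body[last_semi + 1:].strip(), cfws


for line in (
    "from nychubg02.cbs.com (170.20.9.151) by web197 with SMTP; 9 Jan 2004 20:47:13 -0000",
    "(qmail 28572 invoked from network); 9 Jan 2004 20:47:13 -0000",
):
    print(split_received(line))
```

The third element of the result is False for the first field (nothing
between "SMTP" and the semicolon) and True for the second, where a
comment precedes the semicolon. Header unfolding and validation of the
date-time itself are deliberately left out.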
One more remaining incompatibility between 2821 and 2822 lies in the
permitted constructs; 2821 permits a quoted string as an item value (via
2821's "String") whereas 2822 has no such provision. That shows up in
the "id" component, which has a long history of conflicts between
821/822 (1123 tried to rectify the conflict, but only added more
confusion).
Yet another incompatibility is that 2821 permits a mix of angle-addrs
(a.k.a. Paths) and addr-specs (a.k.a. Mailboxes) in a "for" component,
whereas 2822 permits a single addr-spec or multiple angle-addrs (and no
mixture). It turns out that for rather complicated reasons, the 2821
provision for multiple addr-specs is rather difficult to parse. Perhaps
2821's successor should address that issue; in any event, let's at least
remove the remaining conflicts between the 2821 and 2822 definitions one
way or another.
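The difference between the two "for" rules, as described in the
paragraph above, can be written down in a few lines. This is only a
sketch: the function names are invented, the addresses are hypothetical
example.com placeholders, and an address is classified purely by whether
it is wrapped in angle brackets, with no real address parsing.

```python
def classify(addr):
    """Label one "for" value as angle-addr or addr-spec, by shape only."""
    if addr.startswith("<") and addr.endswith(">"):
        return "angle-addr"
    return "addr-spec"


def for_ok_2822(addrs):
    """2822 as described above: one addr-spec, or one or more angle-addrs."""
    kinds = [classify(a) for a in addrs]
    if not kinds:
        return False
    if all(kind == "angle-addr" for kind in kinds):
        return True
    return kinds == ["addr-spec"]


def for_ok_2821(addrs):
    """2821 as described above: any non-empty mix of Paths and Mailboxes."""
    return len(addrs) > 0


# Hypothetical recipients, not taken from any real message.
mixed = ["<a@example.com>", "b@example.com"]
print(for_ok_2821(mixed), for_ok_2822(mixed))
```

A mixed list such as the one above passes the 2821 check and fails the
2822 one, which is exactly the conflict at issue.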
I note also that there exist broken implementations which generate cruft
that cannot be parsed even with 2822's exceptionally liberal rules. Here
are some real-world examples:
Received: from web197.nyc01.cbsig.net ([63.240.56.197])
    by mx08.mrf.mail.rcn.net with smtp (Exim 3.35 #7)
    id 1Af3Xn-0005vx-00
    for blilly(_at_)erols(_dot_)com; Fri, 09 Jan 2004 15:47:47 -0500
Received: (qmail 28572 invoked from network); 9 Jan 2004 20:47:13 -0000
Received: from nychubg02.cbs.com (170.20.9.151)
    by web197 with SMTP; 9 Jan 2004 20:47:13 -0000
Received: by nychubg02.cbs.com with Internet Mail Service (5.5.2656.59)
    id <ZC7JJYV9>; Fri, 9 Jan 2004 15:41:17 -0500
That's one recent example; among the problems:
- as noted above, the SP/CFWS-before-semicolon issue
- id's other than properly-constructed msg-ids
- RFC 821 does not permit day-of-week in the time stamp
- missing from and/or by components in some cases
- illegal (non-RFC 1700 cruft) in "with" components
- there is no defined "Mail" item-name
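The complaints in the list above are mechanical enough to check in code.
What follows is a rough Python sketch, not a conforming parser: the
function names are invented, comments are stripped with a simple regular
expression (quoted-pairs and quoted strings are ignored), the body is
read as strictly alternating item-name/item-value tokens, and the set of
acceptable "with" values is assumed to be just SMTP and ESMTP.

```python
import re

ITEM_NAMES = {"from", "by", "via", "with", "id", "for"}
WITH_VALUES = {"SMTP", "ESMTP"}   # assumption: only these two are accepted
MSG_ID = re.compile(r"<[^<>@\s]+@[^<>@\s]+>$")
DAY_OF_WEEK = re.compile(r"\s*[A-Za-z]{3}\s*,")


def strip_comments(text):
    """Replace comments, innermost first, with a single space each."""
    prev = None
    while prev != text:
        prev, text = text, re.sub(r"\([^()]*\)", " ", text)
    return text


def lint_received(body):
    """Return a list of complaints about one Received field body."""
    text = strip_comments(body)
    head, semi, date = text.rpartition(";")
    if not semi:                      # no semicolon, hence no time stamp
        head, date = text, ""
    problems = []
    if semi and not head[-1:].isspace():
        problems.append("no SP/CFWS before semicolon")
    if DAY_OF_WEEK.match(date):
        problems.append("day-of-week in time stamp")
    tokens = head.split()
    pairs = list(zip(tokens[0::2], tokens[1::2]))
    names = [name.lower() for name, _ in pairs]
    for wanted in ("from", "by"):
        if wanted not in names:
            problems.append("no %s component" % wanted)
    for name, value in pairs:
        if name.lower() not in ITEM_NAMES:
            problems.append("undefined item-name %r" % name)
        elif name.lower() == "with" and value.upper() not in WITH_VALUES:
            problems.append("unexpected with value %r" % value)
        elif name.lower() == "id" and not MSG_ID.match(value):
            problems.append("id %r is not a msg-id" % value)
    if len(tokens) % 2:
        problems.append("lone token %r" % tokens[-1])
    return problems


for field in (
    "(qmail 28572 invoked from network); 9 Jan 2004 20:47:13 -0000",
    "from nychubg02.cbs.com (170.20.9.151) by web197 with SMTP; 9 Jan 2004 20:47:13 -0000",
    "by nychubg02.cbs.com with Internet Mail Service (5.5.2656.59) "
    "id <ZC7JJYV9>; Fri, 9 Jan 2004 15:41:17 -0500",
):
    print(lint_received(field))
```

Run on three of the fields quoted above (unfolded onto one line each),
the sketch flags the same kinds of problem as the list: the missing
from/by, the missing white space before the semicolon, the day-of-week,
the non-msg-id id, and "Mail" being read as an item-name.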
Received: from panic.noceast.dws.disney.com (panic.corp.disney.com
    [153.6.248.200])
    by mail.disney.com (Switch-3.1.2/Switch-3.1.0) with ESMTP id
    h9NCwuN4022589
    for <blilly(_at_)erols(_dot_)com>; Thu, 23 Oct 2003 05:58:57 -0700 (PDT)
Received: from sm-flor-xc03.wdw.disney.com (sm-flor-xc03.wdw.disney.com
    [172.16.177.30]) by panic.noceast.dws.disney.com with ESMTP; Thu, 23 Oct 2003
    08:55:03 -0400
Received: from sm-flor-xc01.wdw.disney.com ([172.16.177.21]) by
    sm-flor-xc03.wdw.disney.com with Microsoft SMTPSVC(5.0.2195.5329);
    Thu, 23 Oct 2003 08:59:43 -0400
Received: from SM-NYNY-XC01.nena.wdpr.disney.com ([167.13.137.76]) by
    sm-flor-xc01.wdw.disney.com with Microsoft SMTPSVC(5.0.2195.5329);
    Thu, 23 Oct 2003 08:59:42 -0400
Received: from sm-nyny-xm05.nena.wdpr.disney.com ([167.13.137.80]) by
    SM-NYNY-XC01.nena.wdpr.disney.com with Microsoft SMTPSVC(5.0.2195.6713);
    Thu, 23 Oct 2003 08:59:41 -0400
That's even worse; an additional problem is that parsing fails after the
(illegal) with component on encountering a lone "SMTPSVC". Microsoft was
informed about that bug in Windows 2000 well before SP1; 3 service packs
and as many years later, the bug still hasn't been fixed. It shouldn't
take more than 10 seconds for a competent programmer to modify the
source to a) use a legal value (ESMTP or SMTP) in the with component, or
b) elide the optional with component, or c) put the marketing BS in a
comment.
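The failure, and each of the three repairs, can be reproduced with a
deliberately strict toy parser. This is a Python sketch under stated
assumptions: the function name is invented, comments are removed with a
simple regular expression, the text is read as alternating
item-name/item-value tokens with no check of the names themselves, and
the comment wording used for repair c) is hypothetical.

```python
import re


def parse_name_val_list(text):
    """Parse "name value name value ..." strictly, after dropping comments."""
    prev = None
    while prev != text:               # strip comments, innermost first
        prev, text = text, re.sub(r"\([^()]*\)", " ", text)
    tokens = text.split()
    if len(tokens) % 2:
        raise ValueError("lone token %r" % tokens[-1])
    return list(zip(tokens[0::2], tokens[1::2]))


BASE = ("from sm-flor-xc01.wdw.disney.com ([172.16.177.21]) "
        "by sm-flor-xc03.wdw.disney.com")
BROKEN = BASE + " with Microsoft SMTPSVC(5.0.2195.5329)"
FIXES = {
    "a": BASE + " with ESMTP",                         # legal with value
    "b": BASE,                                         # with component elided
    "c": BASE + " (Microsoft SMTPSVC 5.0.2195.5329)",  # moved into a comment
}

try:
    parse_name_val_list(BROKEN)
except ValueError as err:
    print("broken:", err)

for label, text in FIXES.items():
    print(label, parse_name_val_list(text))
```

The original text raises on the unpaired "SMTPSVC", while all three
repaired variants parse into clean pairs.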
For the record, I am NOT in favor of extending the syntax to accept such
cruft -- I wish that certain purveyors of brokenware would clean up
their acts.
- Re: Revisiting RFC 2822 grammar (scratching the surface of Received issues), Bruce Lilly <=