Received header Considered Pathetic (was Re: Revisiting RFC 2822 grammar (scratching the surface of Received issues))
2004-01-16 07:55:39
Bruce has raised an issue I've been thinking about a lot recently. The
"Received" header is woefully inadequate for spam tracing, and has a
history of syntactic cruft that will be hard to simplify. I would like
to see us design a replacement specifically intended to support
rapid automated spam tracing. I think that a combination of an
easily-parsed Received header with a well-designed Mail Source Tracing
Protocol (by which you could automate certain queries to the various
nodes a message had passed through) would be an extremely useful tool
-- not a silver bullet, of course -- in the fight against spam, and in
particular in efforts to enforce antispam laws. Is anyone interested
in working on a spec for this?
-- Nathaniel
On Friday, January 16, 2004, at 06:30 AM, Bruce Lilly wrote:
Pete Resnick wrote:
On 1/15/04 at 7:09 PM +0100, Arnt Gulbrandsen wrote:
5. obs-received merits discussion on its own. RFC 2822 says

    obs-received = "Received" *WSP ":" name-val-list CRLF

which Bruce changes to

    obs-received = "Received" *WSP ":" [CFWS] name-val-list
                   [ ";" [CFWS] obs-date-time ] CRLF

An incompatible change, but perhaps correct.

Yes, I think Bruce fixes a bug in 2822 here.
Received has issues, both w.r.t. 2822 and 2821.

First, a historical overview: Received was first specified in RFC 821,
where it is also known as a "time stamp line". There have been, and
still are, discrepancies between the 821/822 and 2821/2822 definitions
of the field body. Received is one of the trace fields and, as noted in
RFC 1123, was initially examined primarily by hand. 1123 permitted
adding information actually useful for tracing (i.e. the peer IP
address, as opposed to the HELO string, which is far too easily forged),
but gave no syntax for doing so [common practice has been to include it
in a comment].
Nowadays, tracing Received fields by hand is far too labor-intensive.
Unfortunately, due to the exponential growth of spam, it is necessary.
We now have the unfortunate situation where the required information
(at least in 2821) includes the easily-forged HELO/EHLO string, and the
useful not-so-easily-forged connection information is relegated to an
optional construct. Worse, in 2821 that optional construct is specified
as some sort of structured comment; indeed, it is indistinguishable
from a comment.

What is really desirable is something that has reliable trace
information in machine-readable form, perhaps with the easily-forged
information relegated to comments. But that's another discussion...
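
To illustrate why connection information buried in a comment is awkward for machines, here is a minimal Python sketch that guesses at the peer IPv4 address in the first comment following the "from" item. The function name peer_ip_from_comment and both regular expressions are hypothetical, chosen for illustration only; the sketch assumes un-nested comments and IPv4 only, and it is a heuristic, not a parser for the 2821 or 2822 grammar.

```python
import ipaddress
import re

# Assumption: the address, if present, sits in the first un-nested
# comment right after the "from" item, as in common MTA practice.
FROM_COMMENT = re.compile(r"(?:^|\s)from\s+\S+\s*\(([^()]*)\)", re.IGNORECASE)
DOTTED_QUAD = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def peer_ip_from_comment(body):
    """Return the first valid IPv4 address in the 'from' comment, or None."""
    match = FROM_COMMENT.search(body)
    if match is None:
        return None
    for candidate in DOTTED_QUAD.findall(match.group(1)):
        try:
            # IPv4Address rejects out-of-range octets such as 999.
            return str(ipaddress.IPv4Address(candidate))
        except ValueError:
            continue
    return None
```

Because nothing in the grammar distinguishes this comment from any other, a sketch like this can only guess; a dedicated machine-readable item would remove the guesswork.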
Prior to 2822, a time stamp line (a.k.a. Received field) always
required a time stamp. 2822 permits a time stamp line w/o a time stamp,
which is an oxymoron. I don't personally like that (IMO the time stamp
should be mandatory), but that's what 2822 says, and that's part of
what is included in the syntax above.
Another issue, not included above, is an incompatibility which has
crept in. 821/822 required SP before the semicolon which delimits the
start of the date-time. However, most MTAs incorrectly omit that space
[in some, such as sendmail, that can be easily rectified via a run-time
configuration patch; others, such as qmail, have that error
hard-coded]. 2821 requires CFWS immediately before the semicolon, but
2822 makes it optional. Given 821/822/2821's requirements, I'd be
inclined to revise 2822 to make at least CFWS mandatory before the
semicolon when generating a message, and given past (clearly wrong, but
quite widespread) practice, I'd require being able to parse a Received
field w/o CFWS before the semicolon (via obs- syntax).
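
The "strict when generating, liberal when parsing" approach described above can be sketched in Python as follows. The function name split_received is hypothetical. The sketch naively splits at the last semicolon (it assumes no semicolon appears inside a comment or quoted string after the real delimiter), accepts the field with or without whitespace before the semicolon, and reports a missing or unparsable time stamp as None.

```python
from email.utils import parsedate_to_datetime


def split_received(body):
    """Split a Received field body into (items, timestamp or None)."""
    items, sep, stamp = body.rpartition(";")
    if not sep:
        # No semicolon at all: a "time stamp line" without a time stamp.
        return body.strip(), None
    try:
        when = parsedate_to_datetime(stamp.strip())
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return body.strip(), None
    # strip() tolerates both "SMTP; date" and "SMTP ; date".
    return items.strip(), when
```

A generator following the same policy would always emit the whitespace before the semicolon; only the parser needs to tolerate its absence.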
One more remaining incompatibility between 2821 and 2822 lies in the
permitted constructs; 2821 permits a quoted string as an item value
(via 2821's "String") whereas 2822 has no such provision. That shows up
in the "id" component, which has a long history of conflicts between
821 and 822 (1123 tried to rectify the conflict, but only added more
confusion).
Yet another incompatibility is that 2821 permits a mix of angle-addrs
(a.k.a. Paths) and addr-specs (a.k.a. Mailboxes) in a "for" component,
whereas 2822 permits a single addr-spec or multiple angle-addrs (and no
mixture). It turns out that for rather complicated reasons, the 2821
provision for multiple addr-specs is rather difficult to parse. Perhaps
2821's successor should address that issue; in any event, let's at
least remove the remaining conflicts between the 2821 and 2822
definitions one way or another.
I note also that there exist broken implementations which generate
cruft that cannot be parsed even with 2822's exceptionally liberal
rules. Here are some real-world examples:
Received: from web197.nyc01.cbsig.net ([63.240.56.197])
    by mx08.mrf.mail.rcn.net with smtp (Exim 3.35 #7)
    id 1Af3Xn-0005vx-00
    for blilly(_at_)erols(_dot_)com; Fri, 09 Jan 2004 15:47:47 -0500
Received: (qmail 28572 invoked from network); 9 Jan 2004 20:47:13 -0000
Received: from nychubg02.cbs.com (170.20.9.151)
    by web197 with SMTP; 9 Jan 2004 20:47:13 -0000
Received: by nychubg02.cbs.com with Internet Mail Service (5.5.2656.59)
    id <ZC7JJYV9>; Fri, 9 Jan 2004 15:41:17 -0500
That's one recent example; among the problems:
- as noted above, the SP/CFWS-before-semicolon issue
- id's other than properly-constructed msg-ids
- RFC 821 does not permit day-of-week in the time stamp
- missing from and/or by components in some cases
- illegal (non-RFC 1700 cruft) in "with" components
- there is no defined "Mail" item-name
Received: from panic.noceast.dws.disney.com (panic.corp.disney.com
    [153.6.248.200])
    by mail.disney.com (Switch-3.1.2/Switch-3.1.0) with ESMTP id
    h9NCwuN4022589
    for <blilly(_at_)erols(_dot_)com>; Thu, 23 Oct 2003 05:58:57 -0700 (PDT)
Received: from sm-flor-xc03.wdw.disney.com
    (sm-flor-xc03.wdw.disney.com [172.16.177.30]) by
    panic.noceast.dws.disney.com with ESMTP; Thu, 23 Oct 2003 08:55:03
    -0400
Received: from sm-flor-xc01.wdw.disney.com ([172.16.177.21]) by
    sm-flor-xc03.wdw.disney.com with Microsoft SMTPSVC(5.0.2195.5329);
    Thu, 23 Oct 2003 08:59:43 -0400
Received: from SM-NYNY-XC01.nena.wdpr.disney.com ([167.13.137.76]) by
    sm-flor-xc01.wdw.disney.com with Microsoft SMTPSVC(5.0.2195.5329);
    Thu, 23 Oct 2003 08:59:42 -0400
Received: from sm-nyny-xm05.nena.wdpr.disney.com ([167.13.137.80]) by
    SM-NYNY-XC01.nena.wdpr.disney.com with Microsoft
    SMTPSVC(5.0.2195.6713);
    Thu, 23 Oct 2003 08:59:41 -0400
That's even worse; an additional problem is that parsing fails after
the (illegal) with component on encountering a lone "SMTPSVC".
Microsoft was informed about that bug in Windows 2000 well before SP1;
3 service packs and as many years later, the bug still hasn't been
fixed. It shouldn't take more than 10 seconds for a competent
programmer to modify the source to a) use a legal value (ESMTP or SMTP)
in the with component, or b) elide the optional with component, or
c) put the marketing BS in a comment.
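
The "with" problem in the examples above can be flagged mechanically. The following is a minimal Python sketch; the names LEGAL_WITH, with_value and with_is_legal are hypothetical, and the set of legal values is an assumption limited to the two values named above (SMTP and ESMTP). It strips un-nested comments, then checks the token that follows "with"; an absent "with" item is accepted, since the item is optional.

```python
import re

# Assumption: only the two values named in the text are treated as legal.
LEGAL_WITH = {"SMTP", "ESMTP"}


def with_value(body):
    """Return the token following 'with', ignoring comments, or None."""
    no_comments = re.sub(r"\([^()]*\)", " ", body)  # un-nested comments only
    match = re.search(r"(?:^|\s)with\s+(\S+)", no_comments, re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).rstrip(";")


def with_is_legal(body):
    """True if the 'with' item is absent or names a legal protocol."""
    value = with_value(body)
    return value is None or value.upper() in LEGAL_WITH
```

Under these assumptions, "with smtp (Exim 3.35 #7)" passes because the version string sits in a comment, while "with Microsoft SMTPSVC(5.0.2195.5329)" is flagged because the token after "with" is "Microsoft".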
For the record, I am NOT in favor of extending the syntax to accept
such cruft -- I wish that certain purveyors of brokenware would clean
up their acts.