On Oct 20 2004, Justin Mason wrote:
As you said previously, the Received header is always prepended. This can
be exploited more simply for the Processed header, as long as the
Processed header is *also* prepended *only*, and never appended, by each
Just a quick explanation on why quoting text is perhaps a good idea.
If each piece of software which writes a Processed header always puts
it at the top, then you're right: it's enough to look for the next
Received header down to discover where the Processed header was added,
so there's no need to quote ID or datestamp.
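
A rough sketch of that prepend-only lookup in Python, assuming the raw message is available with its header order intact. The function name is made up for illustration, and "Processed" is only the header proposed in this thread, not an existing standard:

```python
from email import message_from_string


def pair_processed_with_received(raw_message):
    """Pair each Processed header with the next Received header below it.

    Only valid under the prepend-only assumption: every tool adds its
    Processed header at the very top, just like Received.
    """
    msg = message_from_string(raw_message)
    pairs = []
    pending = []  # Processed values still waiting for a Received header
    for name, value in msg.items():  # items() keeps the original order
        lowered = name.lower()
        if lowered == "processed":
            pending.append(value)
        elif lowered == "received":
            pairs.extend((processed, value) for processed in pending)
            pending = []
    return pairs
```

Any Processed header with no Received header below it is simply left unpaired.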
The ID or datestamp quoting described is intended to cope with
Processed headers which are added somewhere else (e.g. many
filters add headers at the end of the list; sometimes it's simply
easier to add at the top, sometimes it's simply easier to add at the
bottom), and also to cope with possible third-party rearranging of
headers (because header order is only guaranteed for Received headers
Finally, if you have a MUA like Outlook or Lotus Notes, the header is
split up into individual database elements, so you simply lose the
ordering. If you write an add-on for such a MUA to do spam filtering,
say, then you can't rely directly on the physical position of the
Processed header to decide whether to trust it. However, you can still
reconstruct the ordering
of the Received headers by comparing the BY and FROM fields in such a
case. Then you could reconstruct which Processed headers go with which
Received headers by looking at the quoted ID or timestamps. Not
trivial, but then again Outlook is not a very mail friendly
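
To give an idea of the BY/FROM chaining, here is a minimal Python sketch. It assumes each hop's "by" host is spelled exactly like the next hop's "from" host, which real mail often violates (the "from" name is whatever the client announced), so a real add-on would need fuzzier matching. The function names are hypothetical:

```python
import re


def parse_hop(received):
    """Return the (from_host, by_host) pair of one Received header."""
    line = re.sub(r"\s+", " ", received)
    previous = None
    while previous != line:  # strip (comments), innermost first
        previous = line
        line = re.sub(r"\([^()]*\)", " ", line)
    frm = re.search(r"(?:^|\s)from\s+(\S+)", line, re.I)
    by = re.search(r"\sby\s+(\S+)", line, re.I)
    return (frm.group(1).lower() if frm else None,
            by.group(1).lower() if by else None)


def order_received(headers):
    """Return the headers newest first, or None if no single chain exists."""
    hops = [parse_hop(h) for h in headers]
    by_hosts = {by for _, by in hops if by}
    index_by_from = {frm: i for i, (frm, _) in enumerate(hops) if frm}
    # The oldest hop is the one whose "from" host never shows up as a "by".
    starts = [i for i, (frm, _) in enumerate(hops) if frm not in by_hosts]
    if len(starts) != 1:
        return None
    chain, seen = [], set()
    i = starts[0]
    while i is not None and i not in seen:
        seen.add(i)
        chain.append(headers[i])
        i = index_by_from.get(hops[i][1])  # hop that received from this "by"
    if len(chain) != len(headers):
        return None
    return chain[::-1]  # newest first, as in a normal header listing
```

Once the Received order is known, each Processed header can be attached to the Received header whose ID or timestamp it quotes.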
That gets around the problem of identifying ID or datestamp strings from
arbitrary formats of Received headers, which is certainly a hard problem,
I can tell you. (we do it for some bizarre reason ;)
Identifying date-time: unfold the Received line, normalize spaces,
look for the first ';' from the end and take everything after that.
Identifying ID: unfold the Received line, take the first string after
the keyword "id" if it exists. (If you want to be more correct, remove
comments (in parentheses), then look for the keywords "from" and "by" and
pick the first " id " you see after them.) Or you can properly
parse the line according to RFC 2822 and try to figure out the id.
;-) In practice, a standard which used this would give a preferred algorithm.
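
The two recipes above could be sketched in Python roughly as follows. This is only an illustration of the quick approach, not a full RFC 2822 parser; the function names are made up, and a Received line with a ';' inside a trailing comment would confuse the date-time part:

```python
import re


def unfold(header):
    """Unfold a header and collapse every run of whitespace to one space."""
    return re.sub(r"\s+", " ", header).strip()


def received_datetime(header):
    """Everything after the last ';' of the unfolded line, or None."""
    _, sep, tail = unfold(header).rpartition(";")
    return tail.strip() if sep else None


def received_id(header):
    """First token after ' id ', looking past the 'by' keyword if present."""
    line = unfold(header)
    previous = None
    while previous != line:  # remove (comments), innermost first
        previous = line
        line = re.sub(r"\([^()]*\)", " ", line)
    by = re.search(r"\sby\s", line, re.I)
    rest = line[by.end():] if by else line
    match = re.search(r"\sid\s+([^\s;]+)", rest, re.I)
    return match.group(1) if match else None
```

Both functions take the whole header line, folded or not, including the "Received:" field name.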
By the way, there's an even easier way to quote a piece of the
Received line for use within a Processed header field:
unfold the Received line, normalize spaces, then compute a hash value
of the whole line. Then if you want to verify whether a Processed
header is associated with a given Received line, do the same thing and
check the hash value.
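
A minimal Python sketch of the hash variant. SHA-1 is my assumption here (nothing above fixes the algorithm), as are the function names and the choice to hash the field name along with the rest of the line:

```python
import hashlib
import re


def received_hash(header):
    """Hash the unfolded, space-normalized Received line (SHA-1 assumed)."""
    normalized = re.sub(r"\s+", " ", header).strip()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest().upper()


def processed_matches(received_header, quoted_hash):
    """True if the hash quoted in a Processed header fits this Received line."""
    return received_hash(received_header) == quoted_hash.strip().upper()
```

Because the line is normalized before hashing, a relay that refolds the header does not break the match, but any change to the actual text does.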
Received: from 185129182.virtua.com.br (185129182.virtua.com.br
[18.104.22.168]) by smtpin-3211.bay.webtv.net (WebTV_Postfix+sws)
with SMTP id 7280B11DCC; Fri, 5 Mar 2004 19:00:48 -0800 (PST)
This string could hash to the value B17C07174B5E4546A2B04EB096E83FD7081936B8
and then you would have
Processed: name="SpamAssassin"; location-ip="22.214.171.124";
The only reason for quoting id or date-time in the proposal is that
it's easier for a human to verify the quote by looking at the header
contents, because if it's a hash value then how do you compute the
hash in your head?