ietf-822

Re: [ietf-822] Most common mail header fields seen with nonsyntactic values

2018-07-24 22:17:25
Adding additional mailing lists because three header fields listed below 
(Received-SPF, Authentication-Results, Arc-Authentication-Results) are within 
their scope.

--Peter

From: Peter Occil
Sent: Monday, July 23, 2018 11:28 PM
To: ietf-822(_at_)ietf(_dot_)org
Subject: Most common mail header fields seen with nonsyntactic values

The following is a list of email header fields where I find a significant 
proportion of those fields in practice using a different form from the 
documented syntax of those fields.

This list is, for the moment, for your information only.  Whether the documents 
defining the header fields listed below should be updated (whether to 
accommodate how those fields are used in practice or otherwise), or what error 
handling a program should use if it encounters any of these header fields, are 
matters that require further discussion.  (In one case, the note in RFC 5322 
provides guidance on error handling, but in many other cases, those documents 
don't seem to suggest or require any particular error-handling behavior.)

ARC-Authentication-Results.

Some nonsyntactic values of this header field contain a "header.b" parameter 
value containing a slash, which cannot occur in a "pvalue".
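
As a rough illustration of why the slash is a problem: the unquoted form of a 
"pvalue" is an RFC 2045 "token", which excludes the "tspecials" characters, 
and "/" is one of them.  The following is a minimal sketch, not a full 
"pvalue" parser; the function name is hypothetical, and the quoted-string and 
domain-name alternatives of "pvalue" are deliberately not handled.

```python
# RFC 2045 tspecials: characters that may not appear in a bare token.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(text):
    """Return True if text is a non-empty RFC 2045 token.

    A token is made of printable ASCII characters (0x21-0x7E)
    that are not tspecials.  Hypothetical helper for illustration.
    """
    return bool(text) and all(
        0x21 <= ord(ch) <= 0x7E and ch not in TSPECIALS for ch in text
    )
```

A value such as "abc/def" fails this check solely because of the slash.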

Authentication-Results.

Many nonconforming Authentication-Results values are of an unusual form that 
I've already reported elsewhere, in the "dmarc" mailing list.  Unlike most of 
the other forms I report here, this one may be truly nonconforming.

Other nonsyntactic Authentication-Results values--

- don't mention the domain name of the authentication server (they generally 
have a comment like "(sender IP is ...)"),
- contain a "header.b" parameter value containing a slash, which cannot occur 
in a "pvalue",
- contain an "x-tls.subject" parameter right after the authserv name (which 
only one specific implementation apparently generates), and/or
- contain "d=<pvalue>" or "reason=<pvalue>" after the form "<method>=<result> 
(comment)", which doesn't conform to the documented syntax.


Content-ID.

Of the Content-ID header fields I've seen in practice, a significant proportion 
of them (almost half) do not follow the syntax of "msg-id", even though they 
contain angle-brackets.  Some examples use UUIDs inside angle-brackets rather 
than "msg-id"s with an at-sign, while other examples, such as "<example.jpg>" 
and "<down_arrow>", were obviously generated to be message-unique rather than 
"world-unique" as required by RFC 2045 sec. 7.  (On the other hand, I see very 
few instances of Message-ID header fields not following the syntax of that 
header field.)  A smaller number of fields do not use angle-brackets at all, 
and some of them include the values "html-body" and "text-body".
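
The three groups described above can be told apart mechanically.  The sketch 
below assumes the RFC 5322 "msg-id" form without comments, folding white space 
or obsolete syntax; the function name and the category labels are made up for 
illustration.

```python
import re

# atext and dot-atom-text as in RFC 5322 (CFWS and obsolete forms ignored).
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM_TEXT = ATEXT + r"+(?:\." + ATEXT + r"+)*"
# no-fold-literal: "[" *dtext "]", with dtext = %d33-90 / %d94-126.
NO_FOLD_LITERAL = r"\[[\x21-\x5a\x5e-\x7e]*\]"
MSG_ID_RE = re.compile(
    "<" + DOT_ATOM_TEXT + "@(?:" + DOT_ATOM_TEXT + "|" + NO_FOLD_LITERAL + ")>"
)


def classify_content_id(value):
    """Classify a Content-ID field body (hypothetical labels)."""
    value = value.strip()
    if not (value.startswith("<") and value.endswith(">")):
        return "no-angle-brackets"
    if MSG_ID_RE.fullmatch(value):
        return "msg-id"
    return "bracketed-non-msg-id"
```

Under this sketch "<example.jpg>" and a bracketed UUID both land in 
"bracketed-non-msg-id", while "html-body" lands in "no-angle-brackets".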

List-Archive.

All of the nonsyntactic List-Archive values I've seen so far involve GitHub 
URLs.  Here the URL appears without angle brackets.

List-ID.

Some List-ID values contain no dots (and so no domain name), while others are 
bare numbers or underscore-separated number sequences with no angle brackets.
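
A minimal sketch of a check for these two cases, assuming the RFC 2919 shape 
of an optional phrase followed by "<list-label.namespace>".  The function name 
and labels are hypothetical, and trailing comments are not handled.

```python
import re

# Angle-bracketed identifier at the end of the field body.
BRACKETED_RE = re.compile(r"<([^<>]*)>\s*$")


def classify_list_id(value):
    """Classify a List-ID field body (hypothetical labels)."""
    match = BRACKETED_RE.search(value)
    if match is None:
        return "no-angle-brackets"
    if "." not in match.group(1):
        return "no-dot"
    return "plausible"
```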

List-Unsubscribe.

Many nonsyntactic List-Unsubscribe values involve either URLs not appearing in 
angle brackets, or URLs encoded with RFC 2047 encoded words (compare with 
Content-Location, which does allow the latter).
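
For the List-* fields that carry URLs (List-Archive and List-Unsubscribe 
above), a coarse check could look like the sketch below.  It assumes a 
comma-separated list of angle-bracketed URLs and ignores comments; the names 
and labels are hypothetical.

```python
import re

# Rough shape of an RFC 2047 encoded word: =?charset?B-or-Q?text?=
ENCODED_WORD_RE = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")
# One or more <...> items separated by commas, with optional spaces.
BRACKETED_URL_LIST_RE = re.compile(r"\s*<[^<>\s]+>\s*(?:,\s*<[^<>\s]+>\s*)*")


def classify_list_url_field(value):
    """Classify a List-Archive or List-Unsubscribe body (hypothetical labels)."""
    if ENCODED_WORD_RE.search(value):
        return "encoded-word"
    if BRACKETED_URL_LIST_RE.fullmatch(value):
        return "bracketed"
    return "unbracketed-or-other"
```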

Received.

Many nonsyntactic Received bodies--

- include fractional seconds in the date and time,
- include unquoted IPv6 addresses (which contain colons and don't conform to 
the "received-token" syntax), 
- have no semicolon before the date/time, and/or
- include a "for" clause containing "<multiple recipients>" (without the 
quotation marks).

In one case, I have noticed a Received header field with an ASCII control 
character (U+0001, I think) in the "by" clause; unfortunately such a field is 
not downgradable under RFC 6857, nor can it appear in a generated header field 
under RFC 5322.
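
Several of the Received quirks above can be flagged with simple heuristics.  
The sketch below is not a Received parser: the function name and labels are 
hypothetical, the semicolon test can be fooled by a semicolon inside a 
comment, and unquoted IPv6 addresses are left out because their colons are 
hard to tell apart from the time of day without real tokenizing.

```python
import re

# hh:mm:ss followed by a fractional part, e.g. 23:28:00.123
FRACTIONAL_SECONDS_RE = re.compile(r"\d{2}:\d{2}:\d{2}\.\d+")
# ASCII control characters other than TAB, LF and CR.
CONTROL_CHAR_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def received_quirks(body):
    """Return a list of heuristic quirk labels for a Received body."""
    quirks = []
    if FRACTIONAL_SECONDS_RE.search(body):
        quirks.append("fractional-seconds")
    if ";" not in body:
        quirks.append("no-semicolon")
    if "<multiple recipients>" in body:
        quirks.append("multiple-recipients")
    if CONTROL_CHAR_RE.search(body):
        quirks.append("control-character")
    return quirks
```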

Received-SPF.

Many nonsyntactic Received-SPF bodies include an unquoted IPv6 address in the 
"client-ip" parameter (which conforms to neither "dot-atom" nor "quoted-string" 
because of the colons), and some include an unquoted email address in the 
"envelope-from" parameter.
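
A minimal sketch of how the unquoted-IPv6 case might be detected, using 
Python's standard ipaddress module.  The function name is hypothetical, and 
the regular expression is only a rough approximation of the key-value syntax 
(it does not skip comments).

```python
import ipaddress
import re

# client-ip=<value>, where the value is either quoted or runs up to ";" or space.
CLIENT_IP_RE = re.compile(r'client-ip\s*=\s*("[^"]*"|[^;\s]+)', re.IGNORECASE)


def has_unquoted_ipv6_client_ip(body):
    """Return True if client-ip is present, unquoted, and an IPv6 address."""
    match = CLIENT_IP_RE.search(body)
    if match is None:
        return False
    raw = match.group(1)
    if raw.startswith('"'):
        return False  # quoted-string form, which the syntax allows
    try:
        return ipaddress.ip_address(raw).version == 6
    except ValueError:
        return False
```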

Return-Path.

Many Return-Path header-field values don't include angle brackets (and so 
appear as "addr-spec" rather than "path", as required by RFC 5322).  A very 
small number also include a display name (and appear as "mailbox" under that 
RFC).
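
The three shapes can be separated with string checks alone.  This is a 
sketch under the assumption that comments and folding white space have 
already been removed; the function name and labels are hypothetical.

```python
def classify_return_path(value):
    """Classify a Return-Path field body (hypothetical labels)."""
    value = value.strip()
    if value.startswith("<") and value.endswith(">"):
        return "path"  # includes the empty path "<>"
    if "<" in value and value.endswith(">"):
        return "mailbox-with-display-name"
    return "addr-spec-or-other"
```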

-------

For other standard header fields, nonconforming values occur very rarely if at 
all (in my experience).

--Peter


_______________________________________________
ietf-822 mailing list
ietf-822(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf-822