Re: [ietf-822] utf8 messages

On the contrary, it's quite well defined. See RFC 5321. The issue isn't that
it's poorly defined, but rather that there are a lot of agents that don't
create it properly.

Maybe because it was defined too late,


The Received: field was initially defined in RFC 821 in August 1982. MIME was
first published as RFCs 1341/1342 in June 1992, almost a decade later.

and not in the right place? (RFC 5321 is about SMTP not about MIME)


Which is, in fact, the right place, given that it's a transport trace field,
and we're talking about actions taken by the transport.

In fact MIME has almost nothing to do with any of this, other than providing
an example of what not to do.

From the parser's point of view the reason is irrelevant; the reality is
that it is better not to rely on it. Even RFC 5321 says:

"...receiving systems MUST NOT reject mail based on the format of a trace
header field and SHOULD be extremely robust in the light of unexpected
information or formats in those header fields."


Which says nothing about possible dangers of extracting a single bit of
information in a robust way.

Successfully parsing a Received: header itself requires a lot of
heuristics.


A full parse does, and so does looking for IP address information (which
doesn't appear directly as a clause value and whose position was only
standardized late in the game). Looking for a with clause with a particular
value does not.

Looks like we have quite different ideas about reliability and parsing.
I certainly would not consider the partial parsing approach you suggested
as reliable.


Nonsense. I'm talking about searching for a pair of adjacent tokens in a
string. The parsing risks are false positives - unlikely given the tokens
involved - and false negatives - also unlikely given that anything generating
these tokens is going to be new code written to conform to these
specifications.
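
As a rough illustration of the adjacent-token search described above, here is
a minimal sketch in Python. The function name has_with_clause and its default
token are hypothetical, chosen only for this example; the token is the literal
used in this message, and the keyword actually registered by the EAI
specifications should be checked before relying on it. The sketch unfolds the
field, drops parenthesized comments and the trailing date, and then looks for
"with" immediately followed by the wanted value.

```python
import re


def has_with_clause(received_value, protocol="smtputf8"):
    """Return True if the field value contains 'with <protocol>' as two
    adjacent tokens. The default protocol token is an assumption taken
    from the discussion, not from a registry."""
    # Unfold: a folded header line continues with leading whitespace.
    text = re.sub(r"\r?\n[ \t]+", " ", received_value)

    # Strip parenthesized comments, innermost first, until none remain,
    # so that a token pair inside a comment cannot match.
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"\([^()]*\)", " ", text)

    # Everything after the last semicolon is the date-time; ignore it.
    clauses = text.rsplit(";", 1)[0]

    # Compare case-insensitively on whitespace-separated tokens.
    tokens = clauses.lower().split()
    target = protocol.lower()
    return any(a == "with" and b == target
               for a, b in zip(tokens, tokens[1:]))
```

This is deliberately not a full parse: it makes no attempt to validate the
from, by, or id clauses, which is the point of the argument above.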

And as Arnt has pointed out, you also have control over the agent generating
the line you care about most of the time.

Actually, the risks of parsing issues are dwarfed by the other issues with
using this field: that it is present because of the message envelope, not the
header, that it was added to a noncompliant message in order to make it
transport-safe, that the field wasn't added when it should have been, and so
on.

Those are the reasons you're better off examining the message yourself
and not trusting other agents. And all of those reasons apply equally to
any new header field we would define at this point.

Uh huh. And neither is whatever new header is being proposed here. Why
is one preferable to the other?

Because
1) the Received: header is already used and abused in many ways


You have yet to provide a case for that.

2) as you admitted, there are a lot of agents that don't create it properly


See above.

3) semantically it does not make sense to put the charset information in
   the Received: header (it is meant to be a trace field)


We're not talking about charset information.

4) if we define a new field, we don't need to worry about finding the newly
   defined field with bogus syntax in historic emails sent before the standard
   was published


You really think there's enough of a chance of "with smtputf8" appearing in
the Received: fields of mail created with dates after the publication of
the EAI specifications that this represents a significant risk over and
above the other manifest risks of using the field?

You appear to be operating in a world entirely different from my own. I
therefore have nothing further to say to you about this point.

                                Ned
