Re: 2821bis chapter 2




--On Saturday, 03 September, 2005 14:09 -0400 John Leslie
<john(_at_)jlc(_dot_)net> wrote:

John C Klensin <john+smtp(_at_)jck(_dot_)com> wrote:

--On Saturday, 03 September, 2005 00:46 +0200 Frank Ellermann
<nobody(_at_)xyzzy(_dot_)claranet(_dot_)de> wrote:

I prefer the usual style of 2119 keywords with a reference.


The definitions of those terms are different in 2821 than in
the 2119 usage, with 2821 following 1123 and related
traditions. The difference is significant.  Please review the
DRUMS archives if you are interested.


   I really don't think we can ask every reader to "Please
review the DRUMS archives".


I am not asking it of "any reader".  I think it is worth asking
of anyone who wishes to reopen that (and a few other) old
arguments.  The alternatives are that either (i) Frank (and you)
take it on faith that the choice was made rationally and after
some discussion and should not be lightly undone or (ii) someone
else takes responsibility for explaining the details of the
distinction to whomever is interested.  I've provided a bit more
explanation below, but more than that I just don't have time for.

   (I'm quite sure John Klensin is right about the difference:
I merely think it's asking entirely too much of the reader to
apply pre-2119 meanings in this document.)

   Alas, I don't know how much work would be involved in
updating to 2119 meanings...


Well, the problem isn't in the "updating" or the work to do it.
The problem is that 2821 (following in the tracks of 1123 and a
large number of pre-2119 documents) is intended to be
normative and prescriptive.  Think about it as making statements
of the form of "this is what you MUST/SHOULD do because the WG
has concluded, based on experience, that this is the right thing
to do".  RFC 2119, by contrast, carefully circles around being
normative in that sense.

This difference between the definitions of 2119 and strong
conformance clauses is the reason why the use of 2119
definitions has never been required.

The specific difficulty arises in numbered paragraph 6 of RFC
2119, which, to save chasing references, reads:

# 6. Guidance in the use of these Imperatives

#  Imperatives of the type defined in this memo must be used
#  with care and sparingly.  In particular, they MUST only
#  be used where it is actually required for interoperation
#  or to limit behavior which has potential for causing harm
#  (e.g., limiting retransmisssions) For example, they must
#  not be used to try to impose a particular method on
#  implementors where the method is not required for
#  interoperability.

RFC 2821 imposes a number of requirements that, by some
interpretations, are "not actually required for interoperation".
One could attain the minimal requirements for interoperation by
moving to a least common denominator that includes consideration
of implementations that are only conformant to RFC 821 (not even
the additional requirements imposed by 1123) and, some would
claim, a fairly fanciful and/or creative reading of 821 at that.


The discussions in DRUMS were also played out against a
background of claims that the robustness principle permitted
receivers to _assume_ that senders would read the standards
narrowly and be extremely conservative about what they sent.
There were also sender claims that the same principle _required_
that receivers accept whatever garbage they decided to send.  It
was also claimed that the fact that many receivers were
permissive about some common sender behavior (e.g., spaces
between the colon and the "<" in MAIL and RCPT commands) made
that behavior normative and that, under 2119's definitions, 2821
was required to insist that all receivers accept those
variations.   DRUMS concluded, as previous email efforts had
concluded, that these interpretations of the robustness
principle were bizarre and that email interoperability and the
ability to add enhanced features through extensions would be
improved by requiring a more strict level of conformance than
that which was strictly required in practice by minimum
interoperability.
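
To make the colon/"<" example concrete, here is a minimal sketch of the
two receiver behaviors being argued about.  The function name
parse_mail_from and both patterns are illustrative assumptions, not text
from 2821 or 2821bis, and the patterns are far looser than the real
grammar for paths and parameters.

```python
import re

# Strict form: nothing between the colon and "<".
STRICT_MAIL = re.compile(r"MAIL FROM:<([^<>\s]*)>( .*)?", re.IGNORECASE)
# Permissive form: tolerates whitespace after the colon.
LENIENT_MAIL = re.compile(r"MAIL FROM:\s*<([^<>\s]*)>( .*)?", re.IGNORECASE)


def parse_mail_from(line, strict=True):
    """Return the reverse-path text, or None if the line is rejected.

    Hypothetical helper: a real server would also validate the path
    itself and any ESMTP parameters that follow it.
    """
    pattern = STRICT_MAIL if strict else LENIENT_MAIL
    match = pattern.fullmatch(line.rstrip("\r\n"))
    return match.group(1) if match else None
```

A strict receiver returns None for "MAIL FROM: <a@example.com>" while a
permissive one returns the address; the argument above is about which of
those two behaviors the specification should call conformant.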

* 2.3.4 (numerical addresses)

2821 said "discouraged", now it's SHOULD NOT (better).  STD 10
had an obsolete idea of host numbers #<integer> in addition
to domain literals.


A leftover from pre-IP days, or at least prior to the
distinction between a network part and a local host part of an
IP address.


   Hopefully we have at least rough consensus this deserves to
be deprecated...


As we did with 2821.  See A.6.4 of 2821bis and F.4 of 2821.

IMHO that paragraph needs more clean-up:


| Hosts are known by names (see the next section); they SHOULD
| NOT be identified by <address-literals> (see section 4.1.2).
| Other forms of numerical addresses are deprecated and MUST
| NOT be used.


I've put the forward pointer in.  DRUMS pretty much decided to
avoid cluttering the 2821 text detailing prohibitions and the
#<integer> form is prohibited in A.6.4.  So, unless others
feel strongly about this, I'm going to leave that last
sentence out.


   I feel pretty strongly that a MUST NOT belongs here.

   That doesn't mean I expect to never see such violations. I
mean that folks _are_ going to be rejecting such nonsense and
I'd like to stop arguing how dreadful it is for them to do so.


Do you want to see "MUST NOT use" statements associated with
every one of the deprecated 821 features listed in the appendix,
noting that there is a MUST NOT in the appendix?  I'm willing to
put them in, just sensitive to the (quite valid) claims about
too much redundant material, what I recall as the DRUMS decision
to make this a spec about what should be done rather than a
commentary on differences from 821, and some desire for
consistency.

   (Clearly, there will be ignorant folks configuring SMTP
clients with garbage HELO strings: I don't expect implementors
to prevent it. I just want it clearly legal to choose to
reject such garbage.)


If you are specifically worried about HELO/EHLO strings, please
suggest changes to the text there, rather than the more general
discussion of host names.  It seems to me that the current text
is very specific about what is permitted: (i) FQDNs that
resolve to A RRs or the use of bracketed, standard-format,
address literals are specified in 2.3.5, (ii) the syntax for
EHLO requires either a domain name or a literal with an optional
explanation in 4.1.1.1 and the syntax in the same section for
HELO permits the domain name only.  That seems pretty specific
to me: anything else can reasonably be treated as a syntax
error.   The only problem with "rejecting such garbage" is the
prohibition on looking up the domain name and rejecting the
message if it is not found or not valid.  That prohibition was
introduced in RFC 1123 and appears in 2821bis-00 in paragraph 6
of 4.1.4 ("An SMTP server MAY verify that...").
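
As a rough illustration of "anything else can reasonably be treated as a
syntax error", the sketch below classifies a HELO/EHLO argument purely by
its syntax, with no DNS lookup.  It is a simplification under stated
assumptions, not the 2821 ABNF: the names classify_greeting_argument and
argument_is_acceptable are hypothetical, "at least two labels" is used as
a crude stand-in for "fully qualified", and a final label with no letter
is rejected so that a bare dotted quad is not mistaken for a name.

```python
import ipaddress
import re

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def classify_greeting_argument(arg):
    """Return 'domain', 'address-literal', or 'invalid' (syntax only)."""
    if arg.startswith("[") and arg.endswith("]"):
        inner = arg[1:-1]
        try:
            if inner.upper().startswith("IPV6:"):
                ipaddress.IPv6Address(inner[5:])
            else:
                ipaddress.IPv4Address(inner)
        except ValueError:
            return "invalid"
        return "address-literal"
    labels = arg.split(".")
    if len(labels) < 2 or not all(LABEL.fullmatch(label) for label in labels):
        return "invalid"
    # Assumption of this sketch: the last label must contain a letter.
    if not re.search(r"[A-Za-z]", labels[-1]):
        return "invalid"
    return "domain"


def argument_is_acceptable(command, arg):
    """EHLO takes a domain or a literal; HELO takes a domain only."""
    kind = classify_greeting_argument(arg)
    if command.upper() == "HELO":
        return kind == "domain"
    return kind in ("domain", "address-literal")
```

Under these assumptions the old #<integer> host-number form, and any
other unbracketed numeric string, falls into the 'invalid' bucket without
the server ever consulting the DNS.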

...

At the end of 2.3.8:

[[Note in draft/Placeholder: There has been a request to
expand ...

Yes, something about gateways acting as SMTP originator.
They should not invent a Return-Path to the mail-originator,
unless they are also the MX and / or it's clearly in the
best interest of the mail originator.


This is tricky, and, again, more discussion is requested.  I
would suggest that, if the gateway cannot determine a reverse
path (not the header Return-Path, which can be supplied only
by the delivery MTA), then either

 (i) it has no business injecting the mail into the SMTP
     environment or
(ii) it is acting, not as a gateway, but as a submission
server.


   I'm not sure our definitions of "gateway" and "submission
server" can be mutually exclusive.

   But I think here we need to dwell on "cannot determine".

   When we talk of a "gateway" we imply some difference in the
environment: thus we introduce a doubt whether there _can_ be a
"reverse-path" with the same meaning in both environments.

   We should not be demanding that the "gateway" find such a
thing: we should be asking the gateway to generate a useful
return-path to reach an entity capable of remediating errors
which may occur.


I think we are in agreement.  I am not clear about how to fix/
improve the text.

   I also note that some folks want to attach a lot of baggage
when they say "submission server" -- so I'm not sure we're all
talking about the same thing when we use that term.

   Trying to decipher what Frank said, he talks of "gateways
acting as SMTP originator" and his context appears to be how to
distinguish between "originator, delivery, relay, and gateway"
systems.

   I take that to mean Frank sees gateways as a combination of
"delivery" systems in one environment and "originator" systems
in another. (I'm not sure that is our intent in 2.3.8...)

   Beyond that, I guess Frank is concerned that if the gateway
"originates" an email with <return-path> being any domain for
which the management of incoming MX is different than the
management of the gateway, we've got a danger of abuse via
generated NDNs going to innocent parties.


To make things even worse, the outbound path (from foreign mail
environment through gateway to SMTP environment) may not be the
reverse of the inbound path (from SMTP environment through
gateway to foreign mail environment).  This has actually been a
common case.  For example, during the latter part of the active
BITNET/EARN period, most hosts ended up with Internet domain
names of forms like host.university.edu or its national
variations.  When mail was sent to one of those hosts from the
SMTP environment, the gateway was selected based on the MX
records associated with the host's FQDN.  By contrast, while it
was sometimes invisible to the user, mail bound for the SMTP
environment from the BITNET/EARN mail one went through gateways
that were chosen by the local host administrator, without
benefit of MX processing.   

But an old story applies here: in neither case are the gateways
chosen at random.  The choice is under the control of the
"foreign" system, either in how it does its outbound mail
routing or how it sets up its MX environment.  If a gateway is
chosen that will do random or hostile things, there isn't much
that can be done about that.   All we can hope to do is to
define what sorts of behavior are, or are not, random or
hostile.  If someone can suggest an appropriate sentence or two
and where to put it, I think it would be fine to say that a
gateway should not inject a non-NDN message into the SMTP
environment unless it can determine an appropriate reverse path
and insert it into the (constructed) MAIL FROM field.   If that
is what you (and Frank) are getting at, please suggest text.
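
For the sake of discussion, the rule sketched in the previous paragraph
might look like this in a gateway.  This is only an illustration:
build_mail_from and its arguments are hypothetical names, and how the
gateway "determines" a reverse path from the foreign environment is
exactly the part left unspecified here.

```python
def build_mail_from(is_ndn, reverse_path):
    """Construct the MAIL FROM command for a message entering SMTP.

    is_ndn:       True if the message is a non-delivery notification.
    reverse_path: address the gateway determined for error reports,
                  or None if it could not determine one (assumption:
                  the caller has already validated the address).
    """
    if is_ndn:
        # Notifications carry the null reverse path so they never
        # generate further notifications.
        return "MAIL FROM:<>"
    if not reverse_path:
        # No usable reverse path: refuse to inject the message at all.
        raise ValueError("no reverse path; message must not be injected")
    return "MAIL FROM:<" + reverse_path + ">"
```

The point of raising an error rather than inventing an address is that
any later NDN would otherwise go to whoever happens to own the invented
address.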

   Frank has a good point there: if such NDNs come back through
the same gateway or a coordinated system, there clearly can be
enough information to properly direct them to a responsible
entity. Otherwise, it's really not clear.


See above.

   The situation actually is very similar to generic
originators: there may be very good reason to trust
reverse-paths generated by MUAs; but the _responsibility_ for
the goodness of return-paths belongs at the originating SMTP
client. The difference in the case of a gateway is that the
reason to trust MUAs may be absent.

   In pre-2821 days, <reverse-path> actually carried the
concept of path of responsibility. We _might_ need to discuss
whether 2821 threw out too much of the baby along with the
bathwater... :^(


That baby was gone, to a considerable extent, long before 2821.

The fact that the mail originated on another system, in a
different format or protocol, is almost irrelevant to a
submission server: it presumably has, and follows, its own
sender-authorized rules about how to set the reverse path.


   Not all gateways can presume sender authorization of the
rules they follow. This has led to lenience which is now being
actively abused. :^(


Yes, but see above.

...

      It is designed around very high expectations that a
      legitimate user, even an anonymous one, sending
      interpersonal mail to a legitimate user or mailbox,
      will see that mail delivered, and delivered quickly.
      It does not look for excuses to discard messages because
      they do not conform to one rule or another.


   Current practices, alas, often do just that.

   While I do not wish to discuss them in this document, we
should be careful how deep we bury our heads in the sand...


Agreed

      Changing the base SMTP specification in significant ways
      to accommodate the short-term battle against spam or
      other forms of abuse in ways that violate those
      assumptions is pathological because, on one hand, it
      gives us a damaged mail structure forever.


   Certainly, changing SMTP to (attempt to) match current
anti-spam practices is likely to lead to long-term damage.

      Conversely, if the hypothesis that we will get the
      present situation under control is false, SMTP
      modifications are not going to be sufficient: we will
      need a permission-based, or rigid-authority-based, mail
      system and SMTP will die regardless of what
      tuning-level changes are made to it.


   There may be a middle ground: try to forgive me for
believing that some of the features of current SMTP being
exploited by spammers and other abusers might actually be
things we could aim at fixing.


As long as whatever is proposed are fixes, rather than changes
that cripple correct/normal operation, I'd even support such
changes.

   I choose to put my efforts into extensions to SMTP rather
than trying to guess at SMTP-NextGeneration. Thus, I will
listen carefully to any proposals to strengthen accountability
(where such is desired by both sender and receiver) or to
improve opportunities for cooperative efforts to track
problems. And I remain particularly interested in how to
generate useful NDNs when reverse-paths seem to have been
forged.

   I will try to be well-behaved, not insisting that folks
support half-baked ideas. But I think we'd be unwise to stick
too firmly to principles which have been abandoned by a large
majority of actual SMTP servers.


No disagreement.  But, as I have said many times before, we know
how to eliminate spam if the community has the will to do so and
is willing to accept the costs.  Despite the concerns and
hysteria, I see little evidence of either of those criteria yet
being met.  I find the combination of concerns and hysteria with
lack of will profoundly depressing.

      john