ietf-smtp

Re: 2821bis-03: Receive line From-domain Address-literal usage

2007-04-28 09:26:46



--On Saturday, 28 April, 2007 00:26 -0400 Hector Santos
<hsantos(_at_)santronics(_dot_)com> wrote:

Hi John,

I came across something today that may be an item of interest
or may need some clarification.  I wasn't sure whether this comment
should be part of Issue 27, since it is labeled as closed
in your issues list.

More of this below, but, as a general observation, 2821 attempts
to be fairly careful to avoid specifying the expected behavior
by the server (or client) when the client (or server) violates
the protocol.   I hope we don't need to re-open that
meta-decision.

Quite possibly, in summary, it may be deemed a new issue,
possibly two:

     - Clarifying Bracketed address-literals MUST requirement,
and
     - discouraging alterations of the address-literal for
       the trace line.

The two may be related and could possibly be folded into one issue.

Or it may be no issue at all... I'll leave that up to Tony and
general discussion.

The rest of my comments below are just personal opinion except
where noted as "Editor:".

So what prompted me to post a comment about a potential new
issue?

First, the Received: ABNF includes the From-domain entity:

    From-domain    = "FROM" FWS Extended-Domain
    Extended-Domain  = Domain /
                   ( Domain FWS "(" TCP-info ")" ) /
                   ( Address-literal FWS "(" TCP-info ")" )

For Address-literal, we have:

    address-literal  = "[" ( IPv4-address-literal /

                      IPv6-address-literal /
                      General-address-literal ) "]"
                      ; See Section 4.1.3

(Side Nit: Note the blank line)

Editor: Those blank lines (there are others) are an artifact of
the mechanism used in the ABNF to get those deep indentations
and a switch I made in the directives about spacing of list
items.  I'm gradually getting rid of them, but fixing this sort
of thing can also be left for the RFC Editor.

Of course, putting aside the issues of mixed non-compliant
practices in the field, per ABNF specification, the
address-literal MUST be enclosed within brackets.
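
As a rough illustration of that bracket requirement, here is a minimal
Python sketch of a check for an EHLO/HELO argument.  The function name
parse_address_literal is hypothetical, and the sketch assumes only the
IPv4 form and the "IPv6:"-tagged form need handling; it does not cover
General-address-literal.

```python
import ipaddress


def parse_address_literal(arg):
    """Return the IP address inside a bracketed address-literal, or None.

    Assumption: only the IPv4 form and the "IPv6:"-tagged form are
    handled; General-address-literal is outside this sketch.
    """
    # The ABNF requires the enclosing brackets; a bare IP is not a literal.
    if len(arg) < 2 or not (arg.startswith("[") and arg.endswith("]")):
        return None
    inner = arg[1:-1]
    try:
        if inner[:5].upper() == "IPV6:":
            return ipaddress.IPv6Address(inner[5:])
        return ipaddress.IPv4Address(inner)
    except ValueError:
        # Not a syntactically valid address inside the brackets.
        return None
```

With this check, "[68.215.50.125]" is accepted, while the unbracketed
"68.215.50.125" and a bare name such as "hdev1" are not treated as
address-literals.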
...
 
I sent a test message which I quickly received at gmail.com.
I looked at the header lines and noticed something odd in the
gmail.com server adding of its Received: trace line:

  Received: from ?192.168.1.101? ( [68.215.50.125])
     by mx.google.com with ESMTP id
     h17sm2595511wxd.2007.04.27.19.07.38;
     Fri, 27 Apr 2007 19:07:38 -0700 (PDT)

Note the from address-literal with the question marks -
?192.168.1.101?

The MUA client actually sent:

     EHLO [192.168.1.101]

First of all, note that most MTAs who use a 
    ... from domain-or-literal ( literal ) ...
format are announcing that the actual IP address of the
sender-SMTP was different from the IP address specified (or
associated with the domain name given).  They are prohibited (in
relays) and discouraged (in submission servers) from rejecting
mail on that basis, but warnings have long been considered
appropriate.

You, and others, have questioned that rule, both wrt NAT-based
addresses and about whether it has become a proper basis for
rejection.

Second, note that there are strong prohibitions against ever
having an RFC 1918 private address on the public Internet.   It
would be a plausible (albeit well outside of 2821)
interpretation that
    HELO [192.168.1.101]
to a server on the public Internet is complete nonsense and
that, even if the mail is accepted for further transmission,
"[192.168.1.101]" is simply invalid, regardless of its syntax
properties.

If one accepts even a fraction of that, then a system that
behaves this way is operating outside the spec by tolerating
such an EHLO command at all and is trying to do something
creative to note that rather messy state.   
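
For illustration only, a server that wanted to notice an RFC 1918
address in an EHLO argument could use a check along these lines.  This
is a sketch, not something 2821 specifies: the name is_rfc1918 is
hypothetical, and the three networks are the private blocks listed in
RFC 1918.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr):
    """Return True if addr (a string) lies in an RFC 1918 private block."""
    ip = ipaddress.ip_address(addr)
    # Membership is False for any IPv6 address, since the nets are IPv4.
    return any(ip in net for net in RFC1918_NETS)
```

Here "192.168.1.101" falls inside 192.168.0.0/16, while the connecting
address "68.215.50.125" does not match any of the three blocks.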

Now, my personal preference would have been to document this a
little differently, partially because I don't like adding syntax
violations to the complex of problems that are there already.
But it is a matter of taste and a territory into which I don't
think 2821bis needs to go or should go.  I think I would prefer
that a server deal with situations like this one by inserting
something more like

  Received: (from bogus-NAT-address 192.168.1.101
      [68.215.50.125])
      by mx.google.com with ESMTP id
      h17sm2595511wxd.2007.04.27.19.07.38;
      Fri, 27 Apr 2007 19:07:38 -0700 (PDT)

or maybe, taking liberties with a different rule,

  Received: from [68.215.50.125] 
      (EHLO bogus NAT address 192.168.1.101)
      by mx.google.com with ESMTP id
      h17sm2595511wxd.2007.04.27.19.07.38;
      Fri, 27 Apr 2007 19:07:38 -0700 (PDT)

So one of the largest ISPs in the world is altering the
Received line by replacing the brackets with question
marks.

No, they aren't _altering_ the received line.    It is their
received line.  

Looking at this example only for the moment and ignoring the
other things you found out about Google's behavior, a
receiver-SMTP that does this could be doing something creative
with the syntax to note the fact that the smtp-Sender has
violated 2821 by issuing an invalid EHLO command (because it
uses a private address on the public Internet, not because of
syntax) and it has decided to accept and transport the mail
anyway.   I don't particularly like the syntax of their
announcement that they have done so, but the standards violation
occurs earlier and I have to approve of their attempt to
document the violation --and the situation as it appears from
their end-- carefully.

Rhetorically, why would it do this?  Why not just use the
bracketed address literal provided?

For some hypothetical system behaving that way, that could be a
reasonable explanation of an almost-reasonable behavior.  For
gmail the explanation above is obviously not the correct one,
but it is useful to separate the questions.  What they are
actually doing --given your tests -- makes them non-compliant.
But that isn't a reason to change 2821 either and hence not,
IMO, an issue for us.

Again, rhetorically, I ask: does the Google server even
support the reading of bracketed IP address-literals for
EHLO/HELO or any other address-literal provided during the
transaction?

If there are any Google gmail.com people lurking on this list, they
might want to chime in about the above.

To help explore these questions, I went ahead and performed
more tests to see how it handled the EHLO/HELO.   I sent five
(5) new messages, each using a different EHLO, with the
following Received line results:

1) SMTP compliant bracketed address-literal, IP matching

   EHLO [68.215.50.125]

      --> Received: from ?68.215.50.125?

Bug, IMO.

2) Unbracketed address-literal, IP matching

   EHLO 68.215.50.125

      --> Received: from 68.215.50.125

Worse bug, IMO.  But, of course, this is also a client violation.

3) Simulate what Outlook would do (sent computer name)

   EHLO hdev1

      --> Received: from hdev1

Different bug, but also a client violation.

4) Use the real/correct ptr domain

   EHLO adsl-215-50-125.mia.bellsouth.net

      -->  Received: from adsl-215-50-125.mia.bellsouth.net

behaving as expected.

5) Use the incorrect ptr domain (125 changed to 126)

   EHLO adsl-215-50-126.mia.bellsouth.net

      -->  Received: from adsl-215-50-125.mia.bellsouth.net

With no IP address in brackets?   Clearly permitted by 2821 (and
821), but probably not in particularly good taste today.   Still
not a 2821bis issue, IMO.

Based on all this I can take a SWAG that

a) Google is not supporting bracketing address-literals,

Looks like it.

b) Google replaces unexpected characters with '?' for
    the Received line address-literal field.

Yep.

c) Even though the transaction requires the strong SUBMIT 587
    protocol, it is relaxed: it enforces neither correct
    address-literal syntax, nor IP matching, nor rDNS check
    authentication.

Yes, but not an issue for 2821bis.  Possibly an issue for
4409bis, but I doubt it.

The latter (c) is a separate issue, but it might reflect the
importance or unimportance of the bracketed address-literals.

Or might reflect the more general possibility that someone at
gmail is asleep at the switch.
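
The guess in (b) can be made concrete with a small Python sketch.  This
is only a hypothesis that happens to reproduce tests 1 through 4 above,
not gmail's actual code: the function name sanitize_helo and the
choice of "safe" characters (letters, digits, dot, hyphen) are
assumptions.

```python
import re


def sanitize_helo(arg):
    """Replace every character that is not a letter, digit, dot or
    hyphen with '?'.

    Hypothetical rule, chosen only because it matches the observed
    Received lines; the real implementation is unknown.
    """
    return re.sub(r"[^A-Za-z0-9.-]", "?", arg)
```

Under that assumed rule, "[68.215.50.125]" becomes "?68.215.50.125?",
while "68.215.50.125", "hdev1" and "adsl-215-50-125.mia.bellsouth.net"
pass through unchanged, as in tests 1 through 4.  It does not explain
the result of test 5.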

Why or how could any of this constitute an issue for 2821bis?

Well, in our quest for clarifying and codifying existing and
possibly dominant behavior, what I can see occurring is that
new implementors may begin to use systems like Google
gmail.com as a design baseline for practical and good
2821/2822 protocol behavior. They may follow their lead,
including not expecting brackets and also converting them to
question marks for the trace line.

Comments anyone?

This is the same problem that we had when some people were
assuming that whatever sendmail was doing was the spec, without
looking at the spec, 20 years ago.  It was a problem then, it is
a problem now... only the previous bad actor has cleaned up its
act and the new bad actor has not.  It is not a problem with the
standard, IMO, unless someone can go find the relevant gmail
people and get them to tell us that their trace field
implementation ended up this way because they got confused by
the spec.  Otherwise, this is interesting, and sad, but not a
problem.

       john