Re: Input on identities

HELO checking would entail changes in receiving and sending MTAs for 
using a TLD as a HELO parameter; and in the receiving MTA for checking 
that parameter. This might be mitigated by the already existing practice
of using the machine's name, but this would entail publishing records
for the subdomain if the FQDN is a subdomain. If filtering is done past
the initial MTA (SpamAssassin, for example), then some form of
"Received" header parsing would be needed to extract the HELO parameter,
which would not be foolproof, if MARID checking is desired at that
layer.
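
To illustrate why "Received" parsing past the initial MTA is fragile, here
is a minimal sketch in Python. It assumes the topmost "Received" header was
added by a trusted border MTA and that it uses the common
"from <helo> (<rdns> [<ip>])" layout; the function name and the regular
expression are my own illustration, and many MTAs write this header
differently.

```python
import re
from email import message_from_string

# Assumed layout: "from <helo> (<rdns> [<ip>]) by ..."; the rDNS name is optional.
RECEIVED_RE = re.compile(
    r"^from\s+(?P<helo>\S+)\s+"
    r"\((?:(?P<rdns>[^\s\[\]]+)\s+)?\[(?P<ip>[0-9A-Fa-f:.]+)\]\)"
)


def helo_from_received(raw_message):
    """Return (helo, ip) from the topmost Received header, or None."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    # Collapse folded header lines into single spaces before matching.
    top = " ".join(received[0].split())
    match = RECEIVED_RE.match(top)
    if match is None:
        return None  # unknown format: the filter cannot do the check
    return match.group("helo"), match.group("ip")


SAMPLE = (
    "Received: from mail.example.com (mail.example.com [192.0.2.25])\n"
    "\tby mx.example.org with ESMTP id abc123\n"
    "From: joe@example.com\n"
    "\n"
    "body\n"
)
print(helo_from_received(SAMPLE))
```

Any header that does not match the assumed layout yields None, which is
exactly the "not foolproof" case.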


Very true: easy and lightweight, but in my estimation, not as effective
as MAIL FROM would be.

Benefits:

  * All machines would use a valid domain to send mail
  * Verifiable that it is an authorized MTA for the domain it claims to
    be

Possible cons:

  * No identity as related to the message itself -- such a system would
    make a very weak foundation to base other checks on, unless one were
    to restrict MAIL FROM or From: to be part of the same domain as the
    HELO, which is, I think, completely unreasonable.

MAIL FROM checking would require changes to support some form of 
rewriting or aliasing if the message is forwarded with the same bounce 
address, and an ability to relay the bounce messages back to the sender 
(unless each site has its own bounce address). This will raise the issue 
of spammers guessing the bounce format and using it to pass along spam, 
and there are proposals that address it. Of course the more complex the 
rewriting scheme gets, the more changes will be required to implement 
it. Filtering software past the initial MTA including the MUA (like in 
CID) would need to start parsing "Received" headers and check them 
against the "Return-Path" header, if MARID checking is desired on that 
layer. I don't see any changes in MUAs or MDAs, or MSAs, other than the 
need to make sure that the bounce address is configured properly on the 
sender's side.
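
As a rough sketch of the rewriting idea, a forwarder could wrap the
original bounce address in one at its own domain and add a keyed hash so
that spammers cannot simply guess the format. The address layout, the
function names and the 8-character tag below are assumptions for
illustration only, not any particular proposal; a real scheme would
probably also include a timestamp to limit replay.

```python
import hashlib
import hmac


def _tag(address, secret):
    # Short keyed hash over the original address (length chosen arbitrarily).
    return hmac.new(secret, address.lower().encode(), hashlib.sha256).hexdigest()[:8]


def rewrite_bounce(original, forwarder_domain, secret):
    """Wrap the original bounce address in one at the forwarder's domain."""
    local, _, domain = original.rpartition("@")
    return f"fwd={_tag(original, secret)}={domain}={local}@{forwarder_domain}"


def recover_bounce(rewritten, secret):
    """Return the original address if the tag verifies, else None."""
    local_part = rewritten.rpartition("@")[0]
    parts = local_part.split("=", 3)
    if len(parts) != 4 or parts[0] != "fwd":
        return None
    _, tag, domain, local = parts
    original = f"{local}@{domain}"
    if hmac.compare_digest(tag, _tag(original, secret)):
        return original
    return None  # guessed or tampered address: do not relay the bounce


SECRET = b"forwarder-only-secret"  # hypothetical key known only to the forwarder
wrapped = rewrite_bounce("joe@example.com", "forwarder.example", SECRET)
print(wrapped)
print(recover_bounce(wrapped, SECRET))
```

The forwarder relays a bounce back to the sender only when
recover_bounce() returns an address.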


In experimenting with SPF, the changes that had to be made were
significant, but surprisingly easy. Given that the proposal has been
stable for a single-digit number of months, the quality of code that was
produced to support the effort was incredible. 

I believe that MAIL FROM is the simplest solution that will give enough
information to base further proposals on: In accepting responsibility
for bounce traffic, a sending system takes on enough burden that if the
MAIL FROM validity check passes, it's likely it's not faked. MAIL FROM
then becomes a lightly-validated identity of the responsible
/organization/ for the mail -- not the actual sender, but a single
identifiable entity to seek more information from.

The downsides to MAIL FROM have some upsides, too: By breaking
forwarding and arbitrary choice of relays, there are more likely to be
logs of the sent mail somewhere: If Joe Traveller sends mail out via the
hotel's MTA currently, tracking down a log that it was sent is nearly
impossible -- there's little record of who sent it, and the hotel
certainly won't claim responsibility. Forcing Joe Traveller to use port
587 and to "phone home" means that there will likely be a log.

MAIL FROM makes trust and reputation checks possible as an add-on, too,
with more surety than HELO checking: With HELO, you can only validate
the company that runs the MTA (unless it is set up to HELO based on
which sender it is handling messages for at that instant -- which then
looks remarkably like making HELO a duplicate of MAIL FROM), and not the
sending domain itself:

  If an ISP has 1000 domains it handles mail for, and an average of 100
  users per domain, then:

  * If you validate HELO, then you've narrowed the possible senders down
    to one in 100,000
  * If you validate MAIL FROM, then you've narrowed the possible senders
    down to one in 100 (on average)
  * If you validate the From header (and company), then you've narrowed
    it down to -the- sender -- however, the people responsible for the
    domain or the ISP are more likely to be of help, so I think MAIL
    FROM is a good tradeoff.
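
The arithmetic above can be written out as a small sketch; the function
name is mine, and the figures are just the hypothetical ISP from the
example.

```python
def candidate_senders(domains_per_isp, users_per_domain):
    """Size of the pool each validated identity narrows the sender down to."""
    return {
        "HELO": domains_per_isp * users_per_domain,  # anyone behind the ISP's MTA
        "MAIL FROM": users_per_domain,               # anyone in the sending domain
        "From:": 1,                                  # the individual sender
    }


pools = candidate_senders(domains_per_isp=1000, users_per_domain=100)
for identity, size in pools.items():
    print(f"{identity}: one in {size:,}")
```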

With RFC 2822 checking, the MUAs have to be changed to enforce the
"From"/"Sender" header values as non-forged and to support checking by
the receiver. MTAs will have to be changed to check these headers at the
DATA command stage, which would entail more processing since the entire
message will be received before it is processed. HOWEVER, there is a
catch here - MUA or filter level checking is not as foolproof since
there is no 100% reliable way to know the IP address of the MTA
accurately. On the other hand, MTA software under normal circumstances
does not read the data inside the message itself after the DATA command.


What worries me most about From: checking without another layer
underneath is that I think new headers will have to be invented, or many
more semantics loaded on existing ones in regards to mailing lists:

  * Should From: joe(_at_)example(_dot_)com with
    Sender: joe(_at_)example(_dot_)org be accepted?
  * Should From: joe(_at_)example(_dot_)com with
    Responsible-Mailing-List: or
    Sender: ietf-mxcomp-owner(_at_)imc(_dot_)org be accepted?

There are many, many more variations on these scenarios, and I don't think
there's a great solution to them. There's no easy out for a list-handler
like there is with MAIL FROM -- if a mailing list wants to send mail on
behalf of someone else with MAIL FROM checking, they just have to accept
responsibility for bounce handling -- which is reasonable, since they
probably want to know anyway. With Greeting Card sites, I think their
services would be enhanced (and they would be less often rejected as
spam) if they handled and relayed the bounce messages back to the
sender, instead of being a purely send-only service.
  
MUAs will likely have to be changed to display the validated info for
From: checking to have a benefit -- as seen in Outlook 2003, one might
see:

  From: joe(_at_)example(_dot_)com, as relayed by the list
  ietf-mxcomp(_at_)imc(_dot_)org

A good thing to see, though I think nearly as much benefit would be had
from a MAIL FROM check:

  From: joe(_at_)example(_dot_)com, as sent by imc.org

and that would have far less deployment effort needed.

With in-addr.arpa, no changes are needed at the sender's MTA or MUA. The
only change needed is at the receiving MTA, which will evaluate the
information at connection time, just like RBL info is used today by some
systems.
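
For reference, the name a receiving MTA would query at connection time can
be derived from the client address alone. The sketch below uses Python's
standard ipaddress module; which record type a MARID scheme would then look
up under that name is left open here, and the helper name is mine.

```python
import ipaddress


def reverse_name(client_ip):
    """Return the reverse-DNS name for the connecting client's address."""
    return ipaddress.ip_address(client_ip).reverse_pointer


print(reverse_name("192.0.2.25"))   # name under in-addr.arpa
print(reverse_name("2001:db8::1"))  # nibble-by-nibble name under ip6.arpa
```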


What worries me the most about this sort of system is that it again
makes two classes of citizens on the net: Those who have end-to-end IP
connectivity, and those who have restrictions placed on what services
they may run.

It is also fairly limited in scope, since if an ISP wishes to control
what IPs may send mail, then they simply have to configure their routers
appropriately. (I'm aware that some brands of router make this difficult
or more resource intense than it need be, but that's a software issue
that's relatively easily remedied if a need is seen.)

Of all of these, in-addr.arpa and HELO require the least change; MAIL
FROM checking requires changes at the MTA level to support rewriting and
at the MSA/MUA level to constrain the bounce address; and header
checking requires changes at either the MUA or the MTA level to support
checking, and at the MUA/MSA/MTA level to constrain the address.


Agreed -- as someone said in a footer on the SPF-Discuss list a few
days ago:

  "Easily deployable", "effective" and "doesn't break anything". Pick
  two.

There is also a side issue of HELO checking vs. the domains from which
the MTA is sending email. A single SMTP session may process more than
one message under a single HELO, with MAIL FROM values for multiple
domains. Therefore, it might be impossible to determine whether network
B has authorized the MTA when it is using network A's domain name in
HELO (for which it is authorized), while in fact it may be relaying
email from network B within that session.
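
A small sketch of that situation: given the client side of one SMTP
session, collect the HELO name and every MAIL FROM domain seen under it.
The session lines and the function name are made up for illustration.

```python
def session_identities(commands):
    """Return (helo_name, [MAIL FROM domains]) for one session's client commands."""
    helo = None
    domains = []
    for raw in commands:
        line = raw.strip()
        upper = line.upper()
        if upper.startswith(("HELO ", "EHLO ")):
            helo = line.split(None, 1)[1].strip()
        elif upper.startswith("MAIL FROM:"):
            rest = line.split(":", 1)[1].split()
            # First token is the reverse path; later tokens are ESMTP parameters.
            addr = rest[0].strip("<>") if rest else ""
            domains.append(addr.rpartition("@")[2].lower())
    return helo, domains


SESSION = [
    "EHLO mta.network-a.example",
    "MAIL FROM:<joe@network-a.example>",
    "RCPT TO:<someone@example.org>",
    "DATA",
    "MAIL FROM:<ann@network-b.example> SIZE=1000",
    "RCPT TO:<someone@example.org>",
    "DATA",
    "QUIT",
]
helo, domains = session_identities(SESSION)
print(helo, domains)
```

A HELO check validates only the first value; it says nothing about whether
each domain in the second has authorized this MTA.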


I think that stems from trying to give to HELO some MAIL FROM-like
semantics.

For MAIL FROM checking, the configuration complexity is higher since 
MTAs must be able to check and rewrite bounce addresses. However, in 
regards to keeping information in sync between DNS and MTA configuration 
it is probably the same as HELO checking. There is also an issue with 
the need to change configuration whenever an MTA changes IPs.


Absolutely -- though if MTAs are listed by name or by reference to the
MX record, in most cases that's not an issue. You have to set up DNS for
a new IP anyway, so if the MARID record just mentions another DNS
record, no change there is necessary.

In "From"/"Sender" checking there are configuration details on the
sender's MUA and MSA to constrain the headers to legitimate values, and
on the receiver's side to properly check the incoming messages. This
sounds to me the same as MAIL FROM, where we need to restrict the
return path.


Same here.

For in-addr.arpa there is only one entity being administered unlike the 
other identities, and only one piece of software that needs to be 
configured (the DNS server). However, since the rDNS zone is poorly used
by ISPs, the administrator might have a harder time gaining access to
the rDNS configuration than with other identities which use forward DNS.


Absolutely. In my experience with sub-/24 networks, the rDNS is very
often broken.

5) Considerations for use in both IPv4 and IPv6.


I am not an IPv6 expert, but offhand there will probably be an issue
with recording IPv6 addresses in MARID records due to the 512-byte UDP
size limit, especially if a text format is used, as in SPF. This would
be an issue for all identities except .arpa (if mapped to rDNS).
Otherwise they all look pretty much the same to me.
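
To get a feel for the size concern, the sketch below builds SPF-style text
listing twenty literal addresses and compares its length with the 512-byte
figure. The addresses are documentation examples, and the comparison is
rough: the limit applies to the whole DNS response, so the real headroom is
smaller than the record text alone suggests.

```python
import ipaddress

UDP_LIMIT = 512  # classic DNS-over-UDP message size, in bytes


def spf_text(addresses):
    """Build SPF-style record text that lists each address literally."""
    mechanisms = []
    for address in addresses:
        ip = ipaddress.ip_address(address)
        prefix = "ip4" if ip.version == 4 else "ip6"
        mechanisms.append(f"{prefix}:{ip}")
    return "v=spf1 " + " ".join(mechanisms) + " -all"


v4 = spf_text([f"192.0.2.{i}" for i in range(1, 21)])
v6 = spf_text([f"2001:db8:aaaa:bbbb:cccc:dddd:eeee:{i:x}" for i in range(1, 21)])
for label, text in (("IPv4", v4), ("IPv6", v6)):
    status = "over" if len(text) > UDP_LIMIT else "under"
    print(f"{label}: {len(text)} bytes of record text, {status} {UDP_LIMIT}")
```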


Yes -- the text record and syntax of SPF are its biggest weak points. I
think the semantics are good, though.

In regards to IPv6, SPF's ability to reference PTR record validation
and/or MX records helped a lot with the query size limit. I deployed SPF
on a test basis in a medium-sized IPv6 network, and encountered no
IPv6-related problems.

Using DNS labels wherever possible instead of referencing IP addresses
makes transition to IPv6 nearly transparent from the perspective of a
MARID proposal. If the MARID spec references IPv4 addresses explicitly
(as in .in-addr.arpa), configuring IPv6 in addition is double the work
for dual-stack networks.

Ari