ietf-822

Re: MTS transparency and anonymity

2005-02-27 19:04:08

On Sun February 27 2005 12:53, Keith Moore wrote:

if anonymous@[] doesn't quite work, there are similar alternatives
worth considering.  among them are: 
anonymous@[0.0.0.0],

No, see RFC 3330; that's a local net.

No, see RFC 3330; it's only intended for use as a _source_ address.  
Trying to connect to the address fails on every platform I know of.

Well, we're talking about a mailbox in the From field, which has
semantics of a source if anything.  0.0.0.0/32 means "this" host on
"this" network, and has the same issues as 127.0.0.1.  "Trying to
connect to" has semantics of a destination, which is the opposite
of those of the From field.

note that there are a lot fewer TCP/IP stacks in wide use than there 
are mail-handling programs.  it's much easier to be confident about the 
behavior of an IP stack.

Maybe. I recall early implementations where 0.0.0.0 is a
broadcast address; not all of those implementations have vanished.
But that's really a digression from the issue, because no IPv4
address has semantics of either "anonymous" or "not on the
Internet" -- to those familiar with IP addresses, much less to
casual users to whom such hypothetical IP address literals would
be presented.  Trying to encode such semantics into a numeric
domain literal seems positively user-hostile.
 
anonymous@[127.0.0.1],

Absolutely not; that's (the recipient's) local host, and there
might well be a mailbox named "anonymous" (or any other legal
local-part) there.

so use an invalid address, rather than a loopback address.

I mentioned RFC 3330 because I looked through it in search of
just such an invalid (IP) address, and could find none guaranteed
to remain invalid.  And I looked rather closely at 0.0.0.0 and
127.0.0.1.

also, if it happens that a dns query for example.com yields ip address 
X, it should not be assumed that user@example.com is equivalent to
user@[X].

That's not the problem, nor even related to the problem.  The problem
is that there may be an SMTP (or other mail protocol) server listening
on the loopback address, and any local-part might be a valid one on
some such system.  That is, given user@[X], X may be interpretable as
a local source and "user" may be a valid mailbox within that domain
(using "domain" broadly), and has nothing to do with any domain names,
"example.com" or otherwise.  Rather the issue is that some values of
X, including but not limited to 0.0.0.0 and 127.0.0.1, have the
semantics of "here" regardless of what domain names or other IP address
literals also map to either a general or specific instance of "here".
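
As a rough illustration of the point about "here" semantics, the sketch
below uses Python's standard ipaddress module to show how the candidate
addresses mentioned in this thread are classified.  The function name
literal_semantics and its return strings are made up for this example;
nothing here comes from RFC 3330 itself or from the draft.

```python
import ipaddress

def literal_semantics(address: str) -> str:
    """Describe how an IP stack would typically regard the address.

    Hypothetical helper for illustration only; the labels are not
    taken from any RFC.
    """
    addr = ipaddress.ip_address(address)
    if addr.is_unspecified:
        # 0.0.0.0 or ::, i.e. "this" host
        return "unspecified ('this' host)"
    if addr.is_loopback:
        # 127.0.0.0/8 or ::1, i.e. the local host
        return "loopback ('here')"
    return "other"

for candidate in ("0.0.0.0", "127.0.0.1", "127.255.255.255", "::0", "::1"):
    print(candidate, "->", literal_semantics(candidate))
```

None of the candidates is classified as anything resembling "anonymous"
or "nonexistent"; each one denotes some form of "here".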

anonymous@[127.255.255.255],

No, also covered by RFC 3330.

and if the host correctly implements RFC 3330 ("no addresses within 
this block should ever appear on any network anywhere")

Yes, the loopback connection is supposed to be internal to the IP
implementation, and not be translated to any physical network.  That
still doesn't have semantics of "anonymous" or of "nonexistent"; it
has semantics of "here".

then this will  
work just fine, as the host IP stack will return an error when the MTA 
tries to connect to that address.

You again seem to be focusing on non-replyability, which isn't the
topic of draft-lilly-from-optional.  Now it's true that one cannot
reply to an anonymous source (otherwise the source isn't really
anonymous), nor directly to a source unreachable via a reverse path,
but there are other circumstances that might lead to non-replyability
(e.g. transcription errors during transport), so non-replyability
does not conclusively indicate either anonymity or lack of an Internet
mailbox.  I.e. non-replyability is a symptom, not an indicator.

anonymous@[::0],
and anonymous@[::1].

Not valid RFC 2821 syntax.

sounds like a bug in RFC 2821.

You're certainly free to make a case for that. The legal 2821 versions
of IPv6 domain literals would be something like [ipv6:::0]. Yes, it's
ugly. 

Incidentally, there is an inconsistency between 2822 and 2821; 2822
ABNF allows empty square brackets for the domain literal, and refers in
the normative text to 2821 for details, which does not permit an empty
literal -- it permits untagged IPv4 dotted decimal quads and tagged
literals (IPv6 and "General").  It appears that the ABNF in 2822 is
overly broad and does not correspond to the normative text referring to
2821's treatment of domain literals; it should probably incorporate
the 2821 tagging mechanism (assuming that that is deemed appropriate;
I haven't seen it used extensively).
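
To make the difference between the literal forms concrete, here is a
minimal sketch that sorts a bracketed literal into the categories
described above: empty, untagged IPv4 dotted quad, IPv6-tagged, and
"General" tagged.  It is a simplification, not a transcription of the
2821 ABNF; the function name classify_literal, the category labels, and
the loose check on the tag and the tagged content are all assumptions
made for this example.

```python
import ipaddress
import re

def classify_literal(literal: str) -> str:
    """Roughly classify a bracketed domain literal (illustrative only)."""
    if not (literal.startswith("[") and literal.endswith("]")):
        return "not a literal"
    body = literal[1:-1]
    if body == "":
        # Permitted by the 2822 ABNF as discussed above, not by 2821
        return "empty"
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", body):
        try:
            ipaddress.IPv4Address(body)
            return "ipv4"
        except ValueError:
            return "invalid"
    tag, sep, rest = body.partition(":")
    if sep and tag.lower() == "ipv6":
        try:
            ipaddress.IPv6Address(rest)
            return "ipv6-tagged"
        except ValueError:
            return "invalid"
    # Assumed approximation of a tag: letters, digits, hyphens,
    # not ending in a hyphen, followed by non-empty content.
    if sep and re.fullmatch(r"[A-Za-z0-9-]*[A-Za-z0-9]", tag) and rest:
        return "general-tagged"
    return "invalid"

for lit in ("[]", "[127.0.0.1]", "[::0]", "[ipv6:::0]"):
    print(lit, "->", classify_literal(lit))
```

Under these assumptions the untagged [::0] is rejected while
[ipv6:::0] is accepted, which matches the point made above.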

but if the MTA strictly implements RFC  
2821 and rejects the message at RCPT time, that's perfectly okay.

OK for what? For preventing a response (assuming that the message
gets to a recipient)?  Again, that's not the subject of the draft.

it  
doesn't really matter whether the MTA rejects the address because the 
local-part doesn't exist or because the domain has an invalid syntax, 
as long as it fails.

Are you discussing transport, or the message format? The draft deals
with the latter.

Two problems:
1. the only things suitable for "whatever" have problems as
   detailed separately;

see above.  I think you're trying too hard to find problems with 
alternatives to your proposal while failing to recognize the problems 
with yours.

Not at all. We discussed syntax options months ago, and your
earlier message prompted a detailed analysis of issues (it was
clear that an empty literal wasn't permitted by 822, but I needed
to double-check 2822 and the references cited in 822 and 2822).

So far as I can see, there is no perfectly clean solution to the
issues covered in the draft.  However, simply making the From
field optional -- while that may present some issues -- seems to
have the least negative impact of the options considered (including
syntax changes and extension, which present rather more serious
backwards compatibility issues than making the field optional).

A syntax change to permit "From: <>" or named group syntax "From:
phrase: (possibly empty) list;" would have the desired semantics,
but appears to have more serious backwards compatibility issues
than making the field optional; we have both identified those forms
as syntax errors (under rules in effect for most of the history
of electronic mail; they were briefly quasi-legal in the presence
of a Sender field under RFC 733 (1977-1982) [*], but were illegal
before RFC 733 and have again been illegal since RFC 822).  If there
seems to be a strong consensus that flip-flopping yet again on the
legality of such syntax is desirable, I don't mind revising the
draft to indicate that as the proposed amendment to 822/2822, but
I think it warrants rather close examination regarding the impact
on existing UAs, MSAs, and gateways.  "<>" is legal in a Return-Path
field, and named groups are legal in recipient address fields (but
not From or Sender), so presumably there would not be insuperable
difficulties for new designs; those constructs need to be parseable
in their respective fields, so could be made parseable in the From
field by developers.  Likewise, any still-used RFC 733-compliant
parsers can still handle the syntax.  However, the concern is
about designs from the past 22 years, where the From field has not
permitted those constructs.  The question is whether the nature
and quantity of problems that would be caused by (re-)introducing
such syntax is of greater or lesser magnitude than those resulting
from making the From field optional.  I don't have a definitive
answer, and as you noted, "[t]here's no way we can expect to
survey all of that software"; making the field optional for the
Internet Message Format seems to have relatively limited impact.
Changing the field syntax has rather far-reaching implications;
HTTP (RFC 2616 section 14.22) also has a From field based on RFC
822, as does SIP (SIP, RFC 3261, has some provision for anonymity,
but it is the sort of hack that the draft under discussion seeks
to provide a reasonable alternative to for the Internet Message
Format).  Any amendment to RFC 822 syntax would affect those
protocols as well, and to date neither the HTTP nor SIP communities
have been involved in this discussion (precisely because the
discussion has been focused on the Internet Message Format with
a proposed amendment to that format that does not impinge upon
those separate protocols).  I am unsure of the implications of
such syntax changes for those protocols; however both HTTP and
SIP are relative newcomers to the Internet protocols and deal
primarily with ephemeral connections -- it is highly unlikely that
there are HTTP or SIP implementations that were designed to handle
the richer RFC 733 syntax.

In addition to the effect on HTTP and SIP, a syntax change would
also necessitate an amendment to the RFC 2821 gateway requirement
discussed earlier today.

To date I haven't heard any comments specifically regarding any
implications for HTTP and/or SIP (probably for the reasons indicated
above), and I would be uncomfortable proposing a syntax change
without input from the HTTP and SIP communities.  So if you are
or know of an HTTP or SIP expert, I'd appreciate comments about
the implications of syntax changes to the From field, either on
the ietf-822 list or off-list.

2. peeking at the local-part is a layering violation unless
   the domain is yours.

for MTAs and UAs, yes.  not for humans.

If there's something there indicating either a domain or a local-part
within a domain, it fails to provide the desired semantics, viz.
nothing (i.e. no mailbox).  That itself is not a showstopper, but in
conjunction with the requirements for gateways, it presents some
obstacles.

------
* RFC 733 syntax excerpts and analysis:

originator-fields =
               (  "From"     ":" mailbox    ; Single author
                 ["Reply-To" ":" #address] )
            /  (  "From"     ":" 1#address  ; Multiple authors &
                  "Sender"   ":" mailbox    ;  may have non-mach-
                 ["Reply-To" ":" #address] );  ine addresses

address     =  host-phrase                  ; Machine mailbox
            / ( [phrase] "<" #address ">")  ; Individual / List
            / ( [phrase] ":" #address ";")  ; Group
            /  quoted-string                ; Arbitrary text
            / (":" ( "Include"              ; File, w/ addr list
                   / "Postal"               ; (U.S.) Postal addr
                   /  atom )                ; Extended data type
               ":" address)

mailbox     =  host-phrase /  (phrase mach-id)

mach-id     =  "<" host-phrase ">"          ; Contents must never
                                            ;  be modified!

host-phrase =  phrase  host-indicator       ; Basic address

host-indicator =  1*( ("at" / "@") node )   ; Right-most node is
                                            ;  at top of network
                                            ;  hierarchy; left-
                                            ;  most must be host

node        =  word / 1*DIGIT               ; Official host or
                                            ;  network name or
                                            ;  decimal address
----------
mailbox is no good because it requires at least a phrase, "@" (or "at"), and
a node; mailbox is required for the single-author version From field.
#address means zero or more "address" constructs in a comma-separated
list.
address itself has as one option [phrase] "<" #address ">"
therefore with a zero count, a legal address is [phrase] "<" ">"
That is allowable as the "Multiple authors" From field case if a
Sender field with a mailbox is provided.
Another address option is [phrase] ":" #address ";" which permitted
an empty (possibly un-)named group (also nested lists and groups).
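
The two zero-count forms derived above can be recognized with a short
sketch like the following.  It is deliberately simplified: a phrase is
approximated as one or more whitespace-separated runs of non-special
characters (no quoted strings or comments), and the names ATOM, PHRASE
and empty_form are invented for this example.  It only checks the shape
of the field body; it does not check for the Sender field that RFC 733
required alongside these forms.

```python
import re

# Assumed approximation of an atom: no whitespace or special characters.
ATOM = r'[^\s<>:;,@()"]+'
PHRASE = rf'{ATOM}(?:\s+{ATOM})*'

# [phrase] "<" ">"  : address list with a zero count
EMPTY_ANGLE = re.compile(rf'\s*(?:{PHRASE}\s*)?<\s*>\s*')
# [phrase] ":" ";"  : group with a zero count
EMPTY_GROUP = re.compile(rf'\s*(?:{PHRASE}\s*)?:\s*;\s*')

def empty_form(value: str):
    """Return which empty form the field body matches, or None."""
    if EMPTY_ANGLE.fullmatch(value):
        return "empty list"
    if EMPTY_GROUP.fullmatch(value):
        return "empty group"
    return None

for body in ("<>", "Anonymous <>", "Anonymous:;", ":;", "John Doe <jdoe@host>"):
    print(repr(body), "->", empty_form(body))
```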