Re: How to handle a lot of character set Content-types

Bob Smart writes, in part....

The users might like to get a new UA that
understands that, or they might like to have the message pre-processed
so that they can keep using their existing UA. There are various styles
in which this might be done and you might like to consider which if any
are acceptable.


I think much confusion is caused by attempts to project the OSI/X.400 
model and terminology onto the Internet SMTP environment, which was 
designed around a very different model--different in its layering and 
different in how far it presumes to reach into the local environment.

The original Internet/SMTP model:
  user-mail-writing-program -> sending-UA               (1)
   ->                                                   (2)
  receiving-server -> user-mail-reading-program         (3)

   Note no sending MTA and no receiving UA.  And no joining together of 
the user-mail-reader/writer and the UA, either, which is an important 
reality in how we think about these things and how users think about 
them, even if that interface is, if you will, at the subroutine level.

The current one:
  user-mail-writer -> sending-UA -> sending-MTA          (1)
  ->   { relay-MTA | gateway-MTA .... } ->               (2)
  receiving-MTA -> receiving-UA -> user-mail-reader      (3)

{  Aside: there are several reasons for making the UA/mail-reader 
distinction.  One of them is that it is extremely useful in thinking 
about the authority and behavior of things that look like individual 
mailboxes but are really mailing lists.  If you think about them *as* 
individual mailboxes under the control/authority of a UA which is, in 
turn, acting as agent for a list administrator who could be sitting 
there typing "forward" or "resend", you avoid a lot of tedious 
discussions about what "else" you will let an MTA do if you let the MTA 
do "that".  Others involve having one functional interface and several 
user interfaces. This is why models are important.  But, IMHO, it
doesn't belong on this list, at least not now, unless it actively
intrudes on the network transport or format of mail.  } 

Now we have very carefully avoided, up to this point, writing any 
specifications at all about what happens intra-(1) or intra-(3).  
Partially that has been a matter of how Internet protocols have always 
been written.  But it has also been one of the things that has permitted 
the transition between those two models without massive disruption.  
Until something moves into the network (2), the RFCs are silent and as 
soon as something moves out of the network, they become silent again.

And that--quite independent of anything in OSI--has led to two implicit 
doctrines which amount to maximum transparency in the network (2), and a 
"what you do for (or to) your users is between you and them" theory 
about (1) and (3).  It is also the case that, since we don't specify the 
intra-host relationships, there are probably a dozen alternative models 
for (1) and (3) that are functionally equivalent as far as the network 
(and RFCs) are concerned.  For example, I've found it useful in some 
contexts (both explanation and implementation) to talk about
  receiving-MTA -> mail-delivery-agent -> receiving-UA -> ...

These distinctions make it locally possible to be clear about who is 
doing what to whom, to better model access of system administrators to 
mail functions and text, and so on.  But, from the standpoint of the 
RFCs, all of this is just guidance:  what happens intra-host may be 
incredibly stupid, but it cannot attain the status of "broken" until it 
crosses into the network.

I think it is as important to preserve the principle of "RFCs, keep your 
requirements off what happens inside my host" as it is to preserve 
principles about--and be extremely careful about--the 
transitions/transformations that are permitted to occur in the network 
transport process ((2) above).

{ Example: many Internet hosts have discovered that users prefer to 
type "remote-user@foo.bitnet" to typing, e.g., 
"remote-user%foo.bitnet@gatewaydomain".  Many do the rewrite 
automatically.  Some of us consider it stupid to give in to this because 
it has nasty effects in terms of what users think their addresses are, 
ability to preserve addressing modes as users switch hosts, etc.  But 
that is just a matter of opinion and personal bias (religion, if you 
prefer) as long as all the rewrites work correctly and no message 
escapes into the network that contains either a source or destination 
address in a form that makes ".bitnet" look like a top-level domain.  
If it *does* escape, then the 
mail system on the sending host is broken and needs to be fixed-- 
whether by "correcting" user behavior or "correcting" the rewrite 
environment is, again, a local problem. }
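
For illustration only, here is a minimal Python sketch of the kind of 
local rewrite described in the example above, together with a check 
that nothing with ".bitnet" as an apparent top-level domain gets out of 
the host.  The gateway name "gatewaydomain.example" and both function 
names are hypothetical; real hosts do this in their mailer 
configuration, and nothing here is prescribed by any RFC.

```python
# Hypothetical gateway name; the example above only says "gatewaydomain".
GATEWAY = "gatewaydomain.example"


def has_bitnet_tld(address: str) -> bool:
    """Return True if the address shows "bitnet" as its top-level domain."""
    _, at, domain = address.rpartition("@")
    if not at:
        return False  # no domain part at all
    labels = domain.lower().rstrip(".").split(".")
    return labels[-1] == "bitnet"


def rewrite_for_network(address: str, gateway: str = GATEWAY) -> str:
    """Turn user@host.bitnet into user%host.bitnet@gateway.

    Addresses that do not end in ".bitnet" are returned unchanged.
    """
    if not has_bitnet_tld(address):
        return address
    local, _, domain = address.rpartition("@")
    return f"{local}%{domain.rstrip('.')}@{gateway}"


if __name__ == "__main__":
    typed = "remote-user@foo.bitnet"      # what the user prefers to type
    sent = rewrite_for_network(typed)     # what is allowed onto the network
    print(sent)
    # The local form must never escape the host unrewritten.
    assert not has_bitnet_tld(sent)
```

Whether the rewrite happens in the mail-writing program, the UA, or the 
MTA is exactly the intra-host question the RFCs leave alone; only the 
final check concerns what crosses into (2).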

While, IMHO, the questions of intra-host relationships are extremely 
interesting, the implication of the above is that raising them on this 
list just creates confusion.  My jumping on the issue of rewriting 
sender addresses a few weeks ago was, to some degree, a manifestation of 
that confusion: I thought the discussion was about relays, and responded 
on that basis.  For the final destination MTA: mostly your problem.  The 
qualifications "to some degree" and "mostly" are deliberate: it would be 
possible, in the above model, to write a rule that said "senders MUST 
send only validated addresses, receivers SHOULD verify that those 
addresses are valid and should reject messages that don't contain 
valid sender information".  But, regardless of the technical and moral 
problems with such a rule, the point is that we don't have one.

So, please... we really have enough real problems in mail formats and 
mail transport; let's take these local-host issues and speculations 
elsewhere.
    --john