ietf-822

SVr4 mail and RFC-XXXX

1991-05-26 21:21:09
I haven't been watching this list for too long (it's overwhelming how much
traffic there is on it). Let me describe how the problem has been attacked
in System V Release 4 (SVr4) and beyond. For those of you not familiar with
SVr4 mail, it is a fully 8-bit transparent, content independent mail system.
Any 8-bit message can be transmitted using mail, including arbitrary data
files. (I know of people who regularly mail spreadsheets and even a.out
executables around without any problem.) Most of the description below will
actually deal with the mail system which comes in the release beyond SVr4,
known as System V Release 4.0 Enhanced Security (SVr4ES). That mail system
has evolved only slightly beyond what is in SVr4.

In addition to describing SVr4 mail, I'll also talk briefly about mail as
handled by AT&T Mail, a commercial service which has been handling content
transparent mail for five to six years now, followed by comments on the
proposed RFC-XXXX's.

In summary, the items to be addressed in this mail message are:

    o   SVr4/SVr4ES rmail (the Transport Agent)
    o   SVr4/SVr4ES mail and mailx (the Mail User Agents)
    o   SVr4/SVr4ES SMTP support (one of several transport protocol
        Delivery Agents)
    o   AT&T Mail
    o   RFC-XXXX

----------------------------------------------------------------
SVr4/SVr4ES rmail (the Transport Agent)

The program /usr/bin/rmail can be considered as simply a transport agent,
which in turn converses with transport protocol delivery agents to do its
work. All rmail does is:

    o   do address conversions and validations
    o   invoke transport delivery agents to deliver mail
    o   deliver mail locally

Examples of transport delivery agents are uux and smtpqer. Uux will deliver
mail to a remote system known to the UUCP system. Smtpqer will deliver mail
to a remote system known to speak the SMTP protocol.

The mail message that rmail delivers consists of the following:

        A series of UNIX From and >From headers
        A series of RFC-822 style "Name: value" headers
        A blank line
        The message body

One of the RFC-822 headers which rmail always puts out is the
Content-Length: header. This header has a single number following it, which
is the length in bytes of the message body.

The body in turn can then be any content whatsoever. It doesn't matter to
rmail what the body contains; rmail will pass it through intact. (In fact,
the RFC-822 style headers can also contain any character except the NUL
byte.)
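
The layout above can be read back with very little code. The following is a
minimal sketch in Python, not the rmail source: the function name
parse_message is hypothetical, folded (continuation) header lines are
ignored, and newline-terminated lines are assumed. It shows why
Content-Length: matters: the body is taken by byte count, so it may contain
blank lines, NUL bytes, or anything else.

```python
def parse_message(raw: bytes):
    """Split a message into (from_lines, headers, body).

    Hypothetical helper, not part of SVr4. The body is cut by the
    Content-Length: value when that header is present.
    """
    head, sep, rest = raw.partition(b"\n\n")
    if not sep:
        raise ValueError("no blank line separating headers from body")

    from_lines, headers = [], []
    for line in head.split(b"\n"):
        if line.startswith((b"From ", b">From ")):
            # UNIX From / >From lines come before the RFC-822 style headers.
            from_lines.append(line)
        else:
            name, _, value = line.partition(b":")
            # Header values stay as bytes: they may hold any non-NUL byte.
            headers.append((name.strip().decode("ascii"), value.strip()))

    length = None
    for name, value in headers:
        if name.lower() == "content-length":
            length = int(value)

    # Without a Content-Length: header, fall back to "everything that follows".
    body = rest if length is None else rest[:length]
    return from_lines, headers, body
```

Because the body is delimited by length rather than by scanning for a
terminator, nothing inside it needs to be escaped or encoded.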

If one doesn't already exist, rmail will also add in a Content-Type: header
which will have one of three values:

        Text            the message body consists only of USASCII
                        printable bytes
        Generic-Text    the message body consists of locale-specific
                        printable bytes
        Binary          the message body contains bytes not considered
                        printable

(SVr4ES has all three. SVr4 only used Text and Binary.)

Note that I use the term "locale" above. System V uses the ANSI C concept of
locales embodied in what is known as Extended UNIX Code (EUC), which
permits non-ASCII characters to be represented by using multibyte characters.
Actually, the use of EUC is irrelevant, as knowledge of printability of
everything is embodied in the ANSI C tables associated with the current
locale. The rmail command just uses the <wchar.h> iswprint() function to
determine if a multibyte character is printable or not. Iswprint() in turn
is initialized with tables corresponding to the current locale. Different
people on the same system can be using different locales and iswprint()
always returns true or false according to their current locale.
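
The three-way classification can be sketched as follows. This is an
illustration in Python under stated assumptions, not what rmail does
internally: rmail asks iswprint() about each multibyte character, whereas
this sketch stands in a named codec for the locale (the euc_jp default is
only an example) and uses str.isprintable() as the printability test. The
function name classify_body and the decision to allow newline and tab in
text are also assumptions.

```python
def classify_body(body: bytes, locale_codec: str = "euc_jp") -> str:
    """Return "Text", "Generic-Text" or "Binary" for a message body.

    Hypothetical helper. locale_codec stands in for the current locale.
    """
    # Assumption: newline and tab are acceptable in text even though
    # they are not "printable" characters in the strict sense.
    allowed = "\n\t"

    def printable(text: str) -> bool:
        return all(ch.isprintable() or ch in allowed for ch in text)

    # Text: nothing but printable US-ASCII.
    try:
        if printable(body.decode("ascii")):
            return "Text"
    except UnicodeDecodeError:
        pass

    # Generic-Text: printable when interpreted in the locale's codeset.
    try:
        if printable(body.decode(locale_codec)):
            return "Generic-Text"
    except UnicodeDecodeError:
        pass

    # Anything else is treated as binary.
    return "Binary"
```

The same body can come out as Generic-Text under one codec and Binary under
another, which mirrors the point that printability depends on the locale in
effect.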

When a mail message is determined to be Generic-Text, rmail also puts out an
Encoding-Type: header with the contents "euc/locale=...". This passes on to
the receiving end information indicating the locale that the message was
created in. Other information could conceivably also be placed on
Encoding-Type: headers. The Encoding-Type: header is certainly not wedded to
EUC.

The transport protocol delivery agents are given the messages for delivery;
it is up to each agent whether or not to accept the message. If all
transport delivery agents reject the message and it can't be delivered
locally, then the message will be sent back to the originator along with
information from the last transport delivery agent to look at it as to why
it was rejected.
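
The accept/reject flow can be sketched like this. It is a simplified model
in Python, not the rmail implementation: the names dispatch, Agent and
deliver_locally are hypothetical, and the order in which remote agents and
local delivery are tried is an assumption, since only the final outcome is
described above.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical agent interface: return None to accept the message,
# or a string explaining why it was rejected.
Agent = Callable[[bytes, str], Optional[str]]


def dispatch(message: bytes,
             recipient: str,
             agents: List[Tuple[str, Agent]],
             deliver_locally: Callable[[bytes, str], bool],
             ) -> Tuple[str, Optional[str]]:
    """Offer the message to each agent in turn; bounce if nobody takes it."""
    last_reason = None
    for name, agent in agents:
        reason = agent(message, recipient)
        if reason is None:
            return ("accepted by " + name, None)
        # Remember only the most recent rejection, as described above.
        last_reason = reason

    if deliver_locally(message, recipient):
        return ("delivered locally", None)

    # Nobody accepted it: return it with the last agent's explanation.
    return ("returned to originator", last_reason)
```

Note that the dispatcher never inspects the message itself; each agent
decides on its own terms whether it can carry the content.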

Note that the KISS (Keep It Simple S*) principle applies, and does quite
well. The transport layer knows as little as possible about the message. In
turn, it passes on to the receiving end what information it can about how
the message was constructed.

----------------------------------------------------------------
SVr4/SVr4ES mail and mailx (the Mail User Agents)

The /usr/bin/mail and /usr/bin/mailx programs are Mail User Agents whose
responsibility is to present a mail message in an appropriate manner. Their
first task upon seeing a mail message is to determine what type it is: text,
generic-text or binary. They do this by doing their own scan of the message
body; they do not believe anything that the headers (other than
content-length) say. They do, however, note what locale is listed in the
Encoding-Type: header, if there is one.

When a mail message is to be printed, there are several possibilities:

    o   the message is 7-bit ASCII text
    o   the message can be considered text according to the current locale
            o   and the locales match
            o   but the locales differ
    o   the message is binary

If the message is 7-bit ASCII text, or if the message is text according to
the current locale and the locales match, it is printed. If it could be
printed according to the current locale, but the locales don't match, a
message to that effect is printed and the user is then given a choice as to
whether or not to print the message. The user can skip printing the message,
or possibly even change locales to match. If the message is binary, then a
different message is printed.
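
The decision just described is small enough to write down directly. This is
a sketch in Python with hypothetical names (display_action and its return
strings), not code from mail or mailx. Here kind is the result of the user
agent's own scan of the body, message_locale is whatever the Encoding-Type:
header listed (None if absent), and treating a missing locale as a mismatch
is an assumption.

```python
from typing import Optional


def display_action(kind: str,
                   message_locale: Optional[str],
                   user_locale: str) -> str:
    """Decide what the user agent does when asked to print a message."""
    if kind == "Text":
        # 7-bit ASCII text is always printed.
        return "print"
    if kind == "Generic-Text":
        if message_locale == user_locale:
            return "print"
        # Printable here, but created in another locale: let the user
        # choose to print, skip, or change locales.
        return "warn-and-ask"
    # Binary content gets its own notice instead of being printed.
    return "binary-notice"
```

The inputs are all properties of the receiving user's environment plus one
hint carried in the headers; nothing in the transport path takes part in
the decision.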

In all cases, it is the Mail User Agent which deals with how the message is
printed, not the transport agent, transport media or transport delivery
agent. Also, whether a message is printable is based strictly on the
receiving user's environment, with suitable information passed along to help
the receiver to decide if they are in the proper environment to interpret
the message.

(In SVr4, rmail doesn't look at the locale information. It just looks for
text or binary messages.)

Once again, KISS applies, and does quite well.

----------------------------------------------------------------
SVr4/SVr4ES SMTP support

So how does the SVr4 SMTP system deal with these binary messages? Simple: it
doesn't. If the message body matches what can be transported legally
according to the appropriate RFC's, then the SMTP transport delivery agent
(smtpqer) will accept the message for transport. If the message isn't legal
text, smtpqer won't accept it. Header transformations may also be performed
as necessary to make the message RFC-822 compliant. In other words, nothing
will escape onto the Internet that the Internet can't handle.
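
A gatekeeping check of this kind might look like the following. It is a
sketch in Python of two constraints taken from RFC 821 (7-bit data, and text
lines of at most 1000 characters including the CRLF), not a description of
what smtpqer actually tests. The function name smtp_rejection_reason and
the wording of the reasons are hypothetical.

```python
from typing import Optional

# RFC 821 allows 1000 characters per text line including the <CRLF>,
# which leaves 998 for the line's own content.
MAX_LINE = 998


def smtp_rejection_reason(body: bytes) -> Optional[str]:
    """Return None if the body looks safe for SMTP, else a reason string."""
    if any(byte > 127 for byte in body):
        return "body contains 8-bit data"
    for line in body.split(b"\n"):
        if len(line.rstrip(b"\r")) > MAX_LINE:
            return "body contains a line longer than SMTP allows"
    return None
```

An agent built this way refuses what it cannot carry instead of trying to
transform it, which leaves the message intact for any other agent that can.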

If smtpqer is the last transport delivery agent to reject the message, the
message sent back by rmail will indicate that the message contained binary
content.

----------------------------------------------------------------
AT&T Mail

So what does AT&T Mail (and AT&T's PMX product line of mail products) have
to do with 8-bit mail? The answer is that it does just about everything
described above, but in addition knows about such things as multiple-part
mail messages. When there are multiple body parts present, Content-Type is
set to "MultiPart".

Within the body, there are further "headers" for each part which include
minimally Content-Type: and Content-Length: headers for the part. The header
section is then followed by the part's body, which will be exactly the given
length long. Other headers give additional optional information, such as the
encryption method and a description.

For example, the following message contains a multipart message consisting
of a text portion, an encrypted portion, and a spreadsheet.

        From: ...
        To: ...
        Cc: ...
        Other-Headers: ...
        Content-Type: Multipart
        Content-Length: 7208

        Content-Type: Text
        Content-Name: /etc/group
        Content-Abstract: group
        Content-Length: 814

        ...body: the contents of /etc/group, 814 bytes long
        Encrypted: crypt
        Content-Type: Text
        Content-Length: 15

        ...body: an encrypted text message 15 bytes long
        Content-Type: Binary
        Content-Name: /home/xyz/1stquarter
        Content-Abstract: 123 spreadsheet
        Content-Length: 6116

        ...body: the spreadsheet image for /home/xyz/1stquarter, 6116 bytes long
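
Splitting such a body into its parts needs no boundary markers, only the
per-part lengths. The following is a minimal sketch in Python, not AT&T Mail
code: the function name split_parts is hypothetical, and it assumes that
each part's headers end with a blank line and that the next part's headers
begin immediately after the previous part's body, as the layout above
suggests.

```python
def split_parts(body: bytes):
    """Split a multipart body into a list of (headers, part_body) pairs.

    Hypothetical helper. Header names are lower-cased; values stay bytes.
    """
    parts, pos = [], 0
    while pos < len(body):
        end = body.find(b"\n\n", pos)
        if end == -1:
            raise ValueError("part headers not terminated by a blank line")

        headers = {}
        for line in body[pos:end].split(b"\n"):
            name, _, value = line.partition(b":")
            headers[name.strip().decode("ascii").lower()] = value.strip()

        # Each part must say how long its own body is.
        length = int(headers["content-length"])
        start = end + 2
        if start + length > len(body):
            raise ValueError("part body is shorter than its Content-Length")

        parts.append((headers, body[start:start + length]))
        pos = start + length
    return parts
```

Since each part is taken by byte count, a binary part may contain blank
lines or text that looks like headers without confusing the parser.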

Numerous other options are also supported, which are somewhat outside of the
scope of these RFC's.

----------------------------------------------------------------
RFC-XXXX

So how does this information affect the proposed RFC-XXXX's (both the
extended RFC-822 and SMTP proposals)? I know that SVr4ES isn't available
from any vendors yet, but SVr4 mail has been around for at least 2 years.
AT&T Mail has been handling thousands of binary messages daily for 5-6
years. The technology being used is not new; it is simple and has been
proven through actual use.

I think it would be a shame if mail on the Internet weren't as capable as
mail between SVr4ES systems, and weren't as capable as mail on at least one
commercial mail system. It would also be a shame if the modifications being
proposed for the Internet weren't at least somewhat compatible with SVr4ES
mail. 

I think the proposals for the current RFC-XXXX's will bring the Internet up
to the 1980's as far as mail goes. However, they will do NOTHING for the mail
of the 1990's and 2000's, which will require multiple parts, mixed binary and
text, and multiple types and attachments mixed together. How do you plan on
sending mail with an embedded voice and video image? Those types of messages
are being sent right now; not being able to handle them is silly. Such
messages certainly won't flow across the Internet with the current RFC-XXXX
proposals. 

                                        Tony Hansen
                            hansen(_at_)pegasus(_dot_)att(_dot_)com,
                            tony(_at_)attmail(_dot_)com
                                att!pegasus!hansen, attmail!tony

