I would like to submit the following as a new informational internet
draft to highlight some of the pragmatic problems found in the
internet with RFC822 based mail.
I am not too sure which working group this is most suitable under at present.
Any suggestions are welcome.
Julian.
Network Working Group Julian Onions
Request for Comments: DRAFT Nexor Ltd
February 17, 1995
How to be a Bad EMail Citizen
1. Status of this Memo
This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts.
Internet Drafts are valid for a maximum of six months and may be
updated, replaced, or obsoleted by other documents at any time. (The
file 1id-abstracts.txt on nic.ddn.mil describes the current status of
each Internet Draft.) It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a "work in
progress".
This draft is known as draft-onions-822-mailproblems-00.
2. Abstract
The internet consists of many hosts and many implementations of each
protocol suite. There are no formal tests or approval mechanisms
associated with membership of the internet, and therefore there are
very varied levels of conformance to the various standards. This
document intends to describe some of the common problems, mistakes
and errors that are made in electronic mail. Most of them are easily
avoidable, and some guidance on what to do in each case is given
here. Some of these guidelines are pragmatic, some are mandated by
other standards, and others are religious.
3. Introduction
There are various documents around the internet that define the way
mail should behave, what is mandatory, what is optional and what is
forbidden. Adherence to these standards across implementations is at
best patchy, and with no overseeing body the only enforcement of the
standards is peer pressure and the possible loss of service.
Onions Expires Aug 30, 1995 [Page 1]
INTERNET DRAFT How to be a Bad EMail Citizen February 17, 1995
4. Scope
This document restricts itself to the standards defined in RFC-821
(SMTP), RFC-822, RFC-1123 (Host Requirements), RFC-1521 (MIME) and
RFC-1651 (SMTP Extensions). Currently other documents are not
considered.
5. Issues concerning SMTP
5.1. The RSET Command
RFC-821 is not specific about exactly what the RSET verb resets.
This has apparently not been a problem in the past because of the
simplicity of the protocol. The publication of extensions to
the SMTP protocol, with additional commands and state information,
makes a more precise definition desirable. The definition provided
should not constrain any existing RFC-821 implementation since it is
consistent with both the current practice and the only two plausible
interpretations.
RSET is to be interpreted by SMTP servers as clearing state
information present in a session. In particular, it eliminates the
effect of any prior MAIL commands, any DATA, and any delivery
addresses. It resets the server's state to "not a mail transaction".
This implies it is in the state after the HELO and before the MAIL
verb.
RSET has been interpreted by some SMTP servers as requiring that a
new HELO command be sent after RSET is acknowledged. Other servers
assume that the previous HELO is not reset. Servers SHOULD accept a
HELO command subsequent to RSET without special comment, overriding a
previous one if necessary. Servers MUST NOT require a HELO command
after a RSET.
The description above summarizes the current situation with SMTP
implementations based on a series of experiments. No implementation
has been identified that rejects a second HELO, but it would not be
surprising to find one.
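
The RSET behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the class name SmtpSession and its attributes are hypothetical and are not taken from RFC-821 or from any particular server.

```python
class SmtpSession:
    """Hypothetical per-connection state kept by an SMTP server."""

    def __init__(self):
        self.helo_name = None   # set by HELO, survives RSET
        self.mail_from = None   # set by MAIL
        self.recipients = []    # appended to by RCPT
        self.data = None        # message text collected after DATA

    def helo(self, name):
        # A HELO is accepted at any time and overrides a previous one.
        self.helo_name = name
        return "250 OK"

    def rset(self):
        # Clear the mail transaction only. The HELO identity is kept,
        # so the client is not required to send a new HELO afterwards.
        self.mail_from = None
        self.recipients = []
        self.data = None
        return "250 OK"

    def in_transaction(self):
        # "Not a mail transaction" means no MAIL has been accepted yet.
        return self.mail_from is not None
```

After rset() the session is in the same state as directly after HELO and before MAIL, and a later helo() simply replaces the stored name.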
5.2. Duplication of single state verbs.
Whilst some of the SMTP state-inducing verbs may be repeated an
arbitrary number of times (such as RCPT for multi-destination), other
verbs (such as MAIL) may only be issued once per transaction. If a
second occurrence of such a verb is detected, a server MAY
either accept it, overriding earlier information, or reject it as
an out-of-sequence command with a "503 bad sequence of commands"
code. A client sending more than one of these commands within a mail
transaction MUST be prepared to send a RSET and start over, or to
send QUIT and abandon the session, if 503 is received in this case.
Clients SHOULD, if possible, behave in a way that avoids this
situation.
The issues above do not arise in the normal case of multiple
successful message transmissions in the same session, since each
successful message completion (i.e., server receipt of DATA, the
message, CR LF . CR LF, and then sending a positive completion reply)
results in terminating a mail transaction. Clients SHOULD NOT send
RSET after receipt of a 250 response after DATA and the message;
servers MUST reset their states after sending that 250 response and
MUST NOT require clients to send RSET before the next MAIL FROM
command.
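
A minimal sketch of the server side of this section follows. The class name Transaction, the strict flag and the reply texts other than the quoted 503 are assumptions made for illustration, not the behaviour of any specific implementation.

```python
class Transaction:
    """Hypothetical server-side state for one mail transaction."""

    def __init__(self, strict=True):
        self.strict = strict    # True: reject a second MAIL with 503
        self.sender = None
        self.recipients = []

    def mail(self, sender):
        if self.sender is not None and self.strict:
            return "503 bad sequence of commands"
        # A lenient server lets the later MAIL override the earlier one.
        self.sender = sender
        return "250 OK"

    def rcpt(self, recipient):
        if self.sender is None:
            return "503 bad sequence of commands"
        self.recipients.append(recipient)   # RCPT may repeat freely
        return "250 OK"

    def end_of_data(self):
        # After CR LF . CR LF the server replies 250 and resets itself,
        # so the client need not send RSET before the next MAIL.
        self.sender = None
        self.recipients = []
        return "250 OK"
```

A client that receives the 503 from mail() would send RSET and start over, or send QUIT; a well-behaved client avoids the second MAIL in the first place.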
5.3. Behavior with unrecognized verbs.
While it is not quite explicit, RFC-821 appears to expect that, if a
verb is not recognized by the receiver, it will reject the command
with a "permanent error", 5yz, code, presumably 500 (Syntax error).
Similarly, it appears to specify that, if the sender receives such a
code, it must either abandon the mail message (sending QUIT or RSET,
presumably) or do something else involving the same or a different
verb; it may not simply ignore the 5yz error code and pretend it was
a 2yz (or 354) code. This specification depends on that behavioral
model.
Consistent with RFC-821, we expect that existing SMTP servers will
reply with a return code of 500 (Syntax error) when any unfamiliar
verb is received.
The material above should probably have made it into RFC-1123, but
some of the issues -- particularly the fact that anyone could ever
have believed that anything else (such as simply ignoring 5yz codes)
was permitted--have emerged only in the process of this
investigation. Nonetheless, this clarification is believed to be
consistent with existing usage and implementations of SMTP.
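
The behavioural model above can be sketched from both ends of the connection. The verb list KNOWN_VERBS and the function names are illustrative assumptions; a real server would recognise whatever verbs it implements.

```python
# Illustrative subset of verbs; an assumption, not a complete list.
KNOWN_VERBS = {"HELO", "MAIL", "RCPT", "DATA", "RSET", "NOOP", "QUIT"}


def server_reply(line):
    """Return a 500 reply for any verb the server does not recognise."""
    parts = line.split()
    verb = parts[0].upper() if parts else ""
    if verb not in KNOWN_VERBS:
        return "500 Syntax error, command unrecognized"
    return "250 OK"   # placeholder for real per-verb handling


def client_may_proceed(reply):
    """A client may carry on only after a 2yz or a 354 reply."""
    code = reply[:3]
    return code.startswith("2") or code == "354"
```

A client for which client_may_proceed() returns False has to abandon the message (QUIT or RSET) or try something else; it may not treat the 5yz reply as success.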
5.4. Behaviour with eight-bit data
RFC-821 together with RFC-822 is unambiguous in this respect. Unless
an extension to RFC-821 is in force for the mail transaction, eight-
bit data may not be sent. Period.
This point just needs emphasising. It is present in the original
documents, but not spelled out.
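
As a small sketch of the check a sender could apply before transmission, assuming the message is held as a byte string: the function names and the eight_bit_extension flag are hypothetical.

```python
def has_eight_bit(data: bytes) -> bool:
    """Return True if any octet in the message has its high bit set."""
    return any(octet > 0x7F for octet in data)


def may_send(data: bytes, eight_bit_extension: bool = False) -> bool:
    """Eight-bit data is only acceptable if an extension is in force."""
    return eight_bit_extension or not has_eight_bit(data)
```

A message for which may_send() is False needs to be encoded into seven bits or refused before it reaches the wire.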
5.5. Error reports with eight-bit data
Some implementations will return the original message as part of a
delivery report. Care needs to be taken when the reason
for failure was that eight-bit data was present. Otherwise it is
possible to construct an illegal eight-bit message as an error report
to an eight-bit message.
As error reports and messages cannot be easily distinguished in
RFC-821, all messages (including error messages) appear as standard
messages, and therefore need to be correct RFC-822 messages.
5.6. Rejection of SMTP connections due to DNS failure.
There are a number of SMTP implementations that either reject, or can be
configured to reject, SMTP connections if the calling host is not
registered in the DNS. This is seen by some as a breaking of the
spirit of RFC-1123, and by others as a useful get-out-of-jail card.
Regardless of whether this is a good idea or a bad one, the fact
remains this is practiced by some sites. Implementors are therefore
encouraged to use backup MX routing when a connection
succeeds but is dropped before any data is received.
This topic has been debated a number of times on the Internet with
both sides sticking to their views. There is no sense in continuing
to try and standardise this point. What a site will do with any
internet connection from any host eventually comes down to what the
administrator at that site decides. If they don't want to talk to a
given set of hosts, that may be their loss. With the increasing
emphasis on security though, the fact that a site advertises an MX or
A record in the DNS does not imply it will talk to all callers.
5.7. EHLO commands
There are one or two servers that respond badly to EHLO commands.
That is they either set themselves into inconsistent states, or else
drop the connection at once. The RFC is fairly clear that unknown
commands should be rejected but otherwise ignored.
A resilient client MAY detect that the EHLO caused the connection to
drop and immediately retry the connection with a HELO verb in place.
Alternatively it can be treated as a bad connection and subsequent MX
records tried if available. However servers SHOULD NOT drop the
connection in response to an unknown verb.
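
The fallback described above might look like the following on the calling side. The connect callable and the command() method are hypothetical stand-ins for whatever transport layer an implementation uses; they are assumptions made for this sketch.

```python
def greet(connect, hostname):
    """Try EHLO first and fall back to HELO if the server objects.

    `connect` is a hypothetical callable returning an object whose
    `command(line)` method returns the reply text and raises
    ConnectionError if the peer drops the connection.
    """
    conn = connect()
    try:
        reply = conn.command("EHLO " + hostname)
    except ConnectionError:
        # The server hung up on EHLO: reconnect and use plain HELO.
        conn = connect()
        return conn, conn.command("HELO " + hostname)
    if reply.startswith("5"):
        # EHLO was rejected as unknown: send HELO on the same connection.
        reply = conn.command("HELO " + hostname)
    return conn, reply
```

The alternative mentioned in the text, treating the drop as a bad connection and moving on to the next MX record, would replace the reconnect in the except branch.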
6. RFC-822 Issues
6.1. Illegal format RFC-822 messages
Some implementations of RFC-821 check the message for adherence to
RFC-822 minimum requirements as the message is received. These are
that the message contains in the header a From field, a Date field
and a recipient field of some type. However, some user agents use
RFC-821 as a submission protocol and assume that messages will be
made legal RFC-822 as part of the submission process (as some MTAs
already do this). Implementations MAY therefore allow strictly
illegal RFC-822 messages as data and make them legal by addition of
new headers, or MAY reject the message as illegal data.
Some User Agents, particularly those on PCs, find it difficult to
determine an accurate time to provide a Date field, and therefore
leave it out. It is harmless enough to insert such a field when
acting as a submission channel, but inserting a Date midway through
a multi-hop delivery path is misleading and should be discouraged.
However, in practice it is difficult to distinguish the two modes in
which RFC-821 is used, so usually a blanket decision concerning all
transfers has to be made. What is really required is a submission
protocol tailored for this sort of behaviour that can take a partial
RFC-822 message and add the appropriate envelope bits.
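
A sketch of the minimum-requirements check, using the Python standard library header parser. Taking To, Cc and Bcc as the "recipient field of some type" is an assumption of this example, as is the function name.

```python
from email import message_from_string

# Assumed set of fields that count as "a recipient field of some type".
RECIPIENT_FIELDS = ("To", "Cc", "Bcc")


def missing_required_fields(text):
    """List the minimum header fields that are absent from a message."""
    msg = message_from_string(text)
    missing = [name for name in ("From", "Date") if msg[name] is None]
    if not any(msg[name] is not None for name in RECIPIENT_FIELDS):
        missing.append("recipient")
    return missing
```

An implementation acting as a submission channel could add the fields this reports; one acting as a relay could reject the message instead.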
6.2. Received Lines
The syntax of the Received: lines in RFC-822 messages is reasonably
straightforward. It requires as a minimum a date stamp following a
semi-colon. Unfortunately some implementations cannot seem to
generate this. This can cause problems when gatewaying to other
systems that also have trace fields. This is seen as a good way to
cause general confusion when tracking messages.
When gatewaying or examining these elements, the invalid elements
should either be discarded or else the current time inserted to make
them legal. The illegal Received: lines can be changed to be Orig-
Received: to ensure the relayed message is now legal.
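
One way a gateway might apply the renaming option is sketched below. It assumes header fields are held as (name, value) pairs and uses the lenient date parser from the Python standard library as the validity test, which is looser than the RFC-822 grammar.

```python
from email.utils import parsedate_tz


def received_is_valid(value):
    """True if the value ends with ';' followed by a parsable date."""
    _, sep, stamp = value.rpartition(";")
    stamp = stamp.strip()
    return bool(sep) and bool(stamp) and parsedate_tz(stamp) is not None


def relay_trace_fields(fields):
    """Rename invalid Received fields to Orig-Received.

    `fields` is a list of (name, value) pairs; others pass through.
    """
    out = []
    for name, value in fields:
        if name.lower() == "received" and not received_is_valid(value):
            name = "Orig-Received"
        out.append((name, value))
    return out
```

Discarding the invalid field, or appending the current time to it, are the other options named in the text and would replace the rename step.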
6.3. Date fields.
Date fields are usually fairly standard, although there are
implementations that strike out with new and novel formats. However,
when it comes to the area of time zones there is little limitation in
the imagination of implementors. Normally time zones should be
numeric as these are unambiguous. It should be down to the user agent
to display the Date in a ``pretty'' format.
Just say NO to pretty, arbitrary timezones! All UAs should generate
numeric offsets for timezones.
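
For illustration, a UA written in Python could produce a Date value with a numeric offset using the standard library. The wrapper name date_field and its refusal of naive datetimes are choices made for this sketch.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def date_field(moment):
    """Format a timezone-aware datetime with a numeric zone offset."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return format_datetime(moment)


# Example values; the dates themselves are arbitrary.
utc_stamp = date_field(datetime(1995, 1, 1, 2, 24, tzinfo=timezone.utc))
cet_stamp = date_field(
    datetime(1995, 1, 1, 3, 24, tzinfo=timezone(timedelta(hours=1))))
```

Both values end in a signed four-digit offset rather than a zone name, which leaves any "pretty" rendering to the displaying user agent.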
6.4. Resent- fields
RFC-822 allows the pseudo-forwarding of messages by amending the
header of a message to contain new recipients. This is done by adding
headers such as
Resent-To: abc@domain.name
Resent-Date: Sun, 1 Jan 1995 02:24 +0000
Resent-From: xyz@foo.bar
It is not clear in RFC-822 whether a complete set of headers is
required when resending a message. The standard would seem to imply
that it is, but no grammar is present which mandates it. Therefore
implementations vary on how to treat this type of message.
Strict implementations will on detection of a Resent- field, conclude
that this is a resent message, and therefore should be using the
Resent- versions of the fields as opposed to the standard forms. In
this case a message without a Resent-From, a Resent-Date and a
Resent- recipient field is illegal. It is assumed that the message
has been resent but with only a partially correct header.
Other implementations take the view that a Resent- field is a higher
weighted form of the original field. That is, a Resent-Date should be
used in preference to a Date field, but as long as Date, From and
Recipient fields are present, with or without the Resent- prefix, the
message is legal.
The first view treats the Resent- fields as a new overriding SET of headers,
the second as individual replacements for fields. Either case could
be argued, as the original text is unclear.
For pragmatic reasons, and because it seems closer to the intent of
RFC-822 in this case, the Resent- fields should be taken as a set.
However implementations SHOULD allow the individual fields. In
practice this sort of forwarding is not very common, but does arise
from time to time.
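
The recommendation above, preferring the set reading while tolerating individual fields, can be sketched as follows. The function name and the choice of To, Cc and Bcc as recipient fields are assumptions of this example.

```python
from email import message_from_string

RECIPIENTS = ("To", "Cc", "Bcc")   # assumed recipient field names


def effective_fields(text):
    """Pick the From, Date and recipient values to act upon."""
    msg = message_from_string(text)
    resent_rcpt = [v for n in RECIPIENTS
                   for v in msg.get_all("Resent-" + n, [])]
    complete_set = (msg["Resent-From"] is not None
                    and msg["Resent-Date"] is not None
                    and bool(resent_rcpt))
    if complete_set:
        # Preferred reading: the Resent- fields form one overriding set.
        return msg["Resent-From"], msg["Resent-Date"], resent_rcpt
    # Tolerated reading: each Resent- field replaces its counterpart.
    plain_rcpt = [v for n in RECIPIENTS for v in msg.get_all(n, [])]
    return (msg["Resent-From"] or msg["From"],
            msg["Resent-Date"] or msg["Date"],
            resent_rcpt or plain_rcpt)
```

A strict implementation would instead reject any message that has some Resent- fields but fails the complete_set test.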
7. MIME issues.
MIME since its inception has allowed implementations of MTAs and UAs
to further the cause of havoc and generally increase entropy. The
number of ways that it is possible to get this specification wrong is
truly astounding! In general an MTA can treat badly formatted MIME
as a text/plain format and punt the whole problem to the UA. The UA
will then take one of a number of views:
a) It will crash and burn.
b) It will complain the message is illegal and refuse to show it.
c) It won't care and show you the message, warts and all.
d) It will ignore the message, and you will never even know you
have received the message.
The best approach is to be able to flag an error and then revert to
action c) above. This may upset some naive mail users (who seem to be
predominantly upper management and therefore dangerous to upset!).
7.1. Badly formatted Content-Type: fields
Implementations have been known to produce lines of the form
MIME-version: 1.0
Content-Type: text
That is, a MIME type, without the mandatory subtype. This is illegal
as a MIME header and means the content may be subject to
misinterpretation.
In these cases the most pragmatic approach is to treat the message as
text/plain, regardless of what the Content-Type might indicate.
However, outright rejection of the message is also an option. (The
author feels a system that rejects every other such message may have
merits in forcing systems to be upgraded.)
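
The pragmatic fallback can be sketched as a small function over the field value. The function name is hypothetical, and the check is deliberately simpler than the full MIME grammar.

```python
def effective_content_type(value):
    """Return 'type/subtype' in lower case, or 'text/plain' if malformed."""
    # Drop any parameters, then look at the media type on its own.
    media = value.split(";", 1)[0].strip().lower()
    main, sep, sub = media.partition("/")
    if not (main and sep and sub) or "/" in sub:
        return "text/plain"
    return media
```

With this, a field value of just "text" is handled as text/plain, while a well-formed value is passed through unchanged apart from case.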
7.2. Multiple Content-Type fields
Messages may contain multiple Content-Type fields, sometimes
containing contradictory information. Where this happens this may
again cause contents to be misrepresented, or misprocessed. For
instance:
MIME-version: 1.0
Content-Type: multipart/mixed; boundary="---"
MIME-version: 1.0
Content-Type: text/plain
This should be handled as for the badly formatted Content-Type. If two Content-Type fields
are present, and contain the same information, that case MAY be
treated as just one Content-Type field.
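
A sketch of that rule using the Python standard library parser, which keeps duplicate fields. The function name is hypothetical; comparing values after whitespace normalisation only is an assumption about what "the same information" means.

```python
from email import message_from_string


def single_content_type(text):
    """Return the one usable Content-Type value, or 'text/plain'."""
    values = message_from_string(text).get_all("Content-Type", [])
    # Collapse runs of whitespace so that refolded copies compare equal.
    distinct = {" ".join(value.split()) for value in values}
    if len(distinct) == 1:
        return values[0]
    # No field at all, or contradictory fields: fall back to text/plain.
    return "text/plain"
```

The contradictory pair shown in the example above therefore ends up as text/plain, while two identical fields are treated as one.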
7.3. Badly structured multipart messages
Messages that contain fields such as
Content-Type: multipart/mixed
have some great potential for causing indigestion in mail systems.
The missing boundary string means that although the message is split
into multiple parts, there is no way a process can reconstruct the
message in general.
It is charitable to believe that these types of messages start out
with good intentions, but lose their boundary markers somewhere in
flight. Whilst an intelligent human can scan the body part and make
an educated guess at what the separator is, this is not generally
possible for a program.
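
A program can at least detect the condition, even if it cannot repair it. The sketch below uses the Python standard library parser; the function name is an assumption.

```python
from email import message_from_string


def multipart_without_boundary(text):
    """True if a message claims to be multipart but names no boundary."""
    msg = message_from_string(text)
    return (msg.get_content_maintype() == "multipart"
            and msg.get_boundary() is None)
```

A message flagged this way can then be shown as plain text or rejected, rather than having a separator guessed for it.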
7.4. Wrapped lines
Another interesting little problem is where a UA, or MTA has
helpfully wrapped the text of the field to improve readability. Some
interesting examples are presented here.
Content-Type: multipart/mixed; boundary="message
-separator"
Content-Type: multipart/mixed; boundary="abcdefghijklmno:
boundary:fixed01"
The first case is debatably correct input, although few MTA/UAs will
be able to reconstruct the correct separator. The second case is
illegal, ambiguous and awkward to treat well.
Why do people do this? The road to hell is paved with good
intentions. In both cases little should be done to try and
reconstruct the message without human help.
7.5. MIME prologue and Epilogue text
A number of systems and hand constructed messages put text into the
prologue and epilogue of MIME multipart messages. Whilst this is a
neat trick for allowing non-MIME UAs to inform the user why the
message appears as garbage, the prologue/epilogue does not really
exist as part of a message. Therefore when gatewaying or simply
processing such messages, these components may disappear.
Alternatively they may appear as new body parts after transformation.
Therefore whilst you can do it, don't be surprised if it fails to
appear at the other end.
8. Acknowledgements
This document represents a collection of the experiences and hard-won
battle scars from a large community of people. All implementors of
SMTP mail systems will have had some influence on this document.
In particular there are a number of points taken from the work done
in the SMTP extensions working group. This document is a summary of
some of the discussions, and other experiences. Some of this text is
taken from an earlier draft of the SMTP working group document.
9. Security Considerations
Security considerations are not discussed in this memo.
10. Editor's Address
Julian Onions <j.onions@nexor.co.uk>
Nexor Ltd.
PO Box 132,
Nottingham, England.