Internet draft draft-onions-822-mailproblems-00.txt

I would like to submit the following as a new informational Internet
Draft to highlight some of the pragmatic problems found on the
Internet with RFC-822 based mail.

I am not too sure which working group this is most suitable under at present.
Any suggestions are welcome.

Julian.







Network Working Group                                      Julian Onions
Request for Comments: DRAFT                                    Nexor Ltd
                                                       February 17, 1995


                     How to be a Bad EMail Citizen



1. Status of this Memo

   This document is an Internet  Draft.   Internet  Drafts  are  working
   documents  of  the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may  also  distribute
   working documents as Internet Drafts.

   Internet Drafts are valid for a maximum of  six  months  and  may  be
   updated, replaced, or obsoleted by other documents at any time.  (The
   file 1id-abstracts.txt on nic.ddn.mil describes the current status of
   each Internet Draft.) It is not appropriate to use Internet Drafts as
   reference material  or  to  cite  them  other  than  as  a  "work  in
   progress".

   This draft is known as draft-onions-822-mailproblems-00.

2. Abstract

   The internet consists of many hosts and many implementations of  each
   protocol  suite.  There  are  no  formal tests or approval mechanisms
   associated with membership of the internet, and therefore  there  are
   very  varied  levels  of  conformance  to the various standards. This
   document intends to describe some of the  common  problems,  mistakes
   and  errors that are made in electronic mail. Most of them are easily
   avoidable, and some guidance on what to do  in  each  case  is  given
   here.  Some  of  these guidelines are pragmatic, some are mandated by
   other standards, and others are religious.

3. Introduction

   There are various documents around the internet that define  the  way
   mail  should  behave, what is mandatory, what is optional and what is
   forbidden. Adherence to these standards across implementations is  at
   best patchy, and with no overseeing body the only enforcement of the
   standards is peer pressure and the possible loss of service.







Onions                    Expires Aug 30, 1995              [Page 1]





INTERNET DRAFT       How to be a Bad EMail Citizen     February 17, 1995


4. Scope

   This document restricts itself to the standards  defined  in  RFC-821
   (SMTP),  RFC-822,  RFC-1123  (Host Requirements), RFC-1521 (MIME) and
   RFC-1651  (SMTP  Extensions).  Currently  other  documents  are   not
   considered.

5. Issues concerning SMTP

5.1. The RSET Command

   RFC-821 is not specific about exactly  what  the  RSET  verb  resets.
   This  has  apparently  not  been a problem in the past because of the
   simplicity of the protocol.  The publication of extensions to the
   SMTP protocol, with additional commands and state information, makes
   a more precise definition desirable.  The definition provided here
   should not constrain any existing RFC-821 implementation, since it is
   consistent with both current practice and the only two plausible
   interpretations.

   RSET is to be interpreted by SMTP servers as clearing the state
   information present in a session.  In particular, it eliminates the
   effect of any prior MAIL FROM commands, any DATA, and any delivery
   addresses.  It resets the server's state to "not a mail transaction".
   This implies it is in the state after the HELO and before the MAIL
   verb.

   RSET has been interpreted by some SMTP servers as  requiring  that  a
   new  HELO  command be sent after RSET is acknowledged.  Other servers
   assume that the previous HELO is not reset.  Servers SHOULD accept  a
   HELO command subsequent to RSET without special comment, overriding a
   previous one if necessary.  Servers MUST NOT require a  HELO  command
   after a RSET.

   The description above summarizes the current situation with SMTP
   implementations, based on a series of experiments.  No implementation
   has been identified that rejects a second HELO, but it would not be
   surprising to find one.
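
   The RSET behaviour described in this section can be sketched as a
   small piece of server state.  The Python below is an illustration
   only; the class SessionState and its method names are hypothetical
   and are not taken from any existing implementation.

```python
class SessionState:
    """Hypothetical per-connection SMTP server state (illustration only)."""

    def __init__(self):
        self.helo_name = None   # set by HELO; survives RSET
        self.sender = None      # set by MAIL
        self.recipients = []    # set by RCPT
        self.data = None        # message text, if any

    def helo(self, name):
        # A HELO after RSET is accepted without comment and simply
        # overrides the previous one.
        self.helo_name = name
        return "250 OK"

    def rset(self):
        # Clear the mail transaction only; the HELO identity is kept,
        # so the client is never required to send HELO again.
        self.sender = None
        self.recipients = []
        self.data = None
        return "250 OK"

    def in_transaction(self):
        return self.sender is not None
```

   After rset() the object is back in the state "after the HELO and
   before the MAIL verb".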

5.2. Duplication of single state verbs.

   Whilst some of the SMTP state-inducing verbs may be repeated an
   arbitrary number of times (such as RCPT for multiple destinations),
   other verbs (such as MAIL) may only be issued once per transaction.
   If a second occurrence of such a verb is detected, a server MAY
   either accept it, overriding the earlier information, or reject it as
   an out-of-sequence command with a "503 bad sequence of commands"
   code.  A client sending more than one of these commands within a mail
   transaction  MUST  be  prepared  to send a RSET and start over, or to
   send QUIT and abandon the session, if 503 is received in  this  case.
   Clients  SHOULD,  if  possible,  behave  in  a  way  that avoids this
   situation.

   The issues above  do  not  arise  in  the  normal  case  of  multiple
   successful  message  transmissions  in  the  same session, since each
   successful message completion (i.e.,  server  receipt  of  DATA,  the
   message, CR LF . CR LF, and then sending a positive completion reply)
   results in terminating a mail transaction.  Clients SHOULD  NOT  send
   RSET  after  receipt  of  a  250 response after DATA and the message;
   servers MUST reset their states after sending that 250 response and
   MUST NOT require clients to send RSET before the next MAIL FROM
   command.
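
   The two permitted server behaviours for a repeated MAIL verb, and
   the implicit reset after a completed message, might look roughly as
   follows.  This is a sketch under stated assumptions: the function
   names, the allow_override flag and the use of a plain dictionary
   for the transaction state are all hypothetical.

```python
def handle_mail(state, sender, allow_override=False):
    """Process a MAIL command against a hypothetical transaction dict."""
    if state.get("sender") is not None and not allow_override:
        # Second MAIL within one transaction: reject as out of sequence.
        return "503 bad sequence of commands"
    # First MAIL, or a server that lets a later MAIL override.
    state["sender"] = sender
    return "250 OK"


def complete_message(state):
    """Called after the final CR LF . CR LF has been accepted."""
    # The transaction ends here; no RSET is needed from the client.
    state.clear()
    return "250 OK"
```

   A client that receives the 503 reply is then expected to send RSET
   and start over, or to send QUIT.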

5.3. Behavior with unrecognized verbs.

   While it is not quite explicit, RFC-821 appears to expect that, if  a
   verb  is  not  recognized by the receiver, it will reject the command
   with a "permanent error", 5yz, code, presumably 500  (Syntax  error).
   Similarly,  it appears to specify that, if the sender receives such a
   code, it must either abandon the mail message (sending QUIT or  RSET,
   presumably)  or  do  something else involving the same or a different
   verb; it may not simply ignore the 5yz error code and pretend it  was
   a  2yz  (or 354) code.  This specification depends on that behavioral
   model.

   Consistent with RFC-821, we expect that existing  SMTP  servers  will
   reply  with  a  return code of 500 (Syntax error) when any unfamiliar
   verb is received.

   The material above should probably have made it into RFC-1123, but
   some of the issues -- particularly the fact that anyone could ever
   have believed that anything else (such as simply ignoring 5yz codes)
   was permitted -- have emerged only in the process of this
   investigation.  Nonetheless, this clarification is believed to be
   consistent with existing usage and implementations of SMTP.
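
   As an illustration of the behavioural model in this section, the
   sketch below rejects unfamiliar verbs with a 500 code on the server
   side and classifies 5yz replies on the client side.  The verb list
   is a deliberately partial assumption and the function names are
   hypothetical.

```python
# Partial, illustrative list of verbs this hypothetical server knows.
KNOWN_VERBS = {"HELO", "MAIL", "RCPT", "DATA", "RSET", "NOOP", "QUIT"}


def server_reply(command_line):
    """Return a 500 reply for an unknown verb, or None if it is known.

    None means "pass the command to the normal handler", which is not
    shown here.
    """
    words = command_line.split()
    verb = words[0].upper() if words else ""
    if verb not in KNOWN_VERBS:
        return "500 Syntax error, command unrecognized"
    return None


def is_permanent_failure(reply):
    """True for 5yz replies, which a client must not treat as success."""
    return reply[:1] == "5"
```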

5.4. Behaviour with eight-bit data

   RFC-821 together with RFC-822 is unambiguous in this respect.  Unless
   an  extension to RFC-821 is in force for the mail transaction, eight-
   bit data may not be sent. Period.

   This point just needs emphasising. It  is  present  in  the  original
   documents, but not spelled out.
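
   A sender can check for eight-bit data before transmission with a
   test as simple as the following Python sketch.  The function name
   is hypothetical.

```python
def has_eight_bit(data: bytes) -> bool:
    """True if any octet has its high bit set (value 128 or above)."""
    return any(octet >= 0x80 for octet in data)
```

   What to do when the test succeeds (reject, encode, or negotiate an
   extension) is a separate decision that this sketch does not make.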

5.5. Error reports with eight-bit data

   Some implementations return the original message as part of a
   delivery report.  Care needs to be taken when the reason for failure
   was that eight-bit data was present.  Otherwise it is possible to
   construct an illegal eight-bit message as the error report for an
   eight-bit message.

   As error reports and messages cannot easily be distinguished in
   RFC-821, all messages (including error messages) appear as standard
   messages, and therefore need to be correct RFC-822 messages.
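
   One possible way to keep a delivery report legal is to make the
   returned copy of the original seven-bit clean before including it.
   The sketch below replaces each offending octet with "?".  This is
   only one assumed policy; returning the headers alone, or omitting
   the original entirely, are equally plausible choices.

```python
def seven_bit_copy(original: bytes) -> bytes:
    """Replace each octet above 127 with '?' so the copy is 7-bit clean."""
    return bytes(octet if octet < 0x80 else ord("?") for octet in original)
```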

5.6. Rejection of SMTP connections due to DNS failure.

   There are a number of SMTP implementations that either do, or can be
   configured to, reject SMTP connections if the calling host is not
   registered in the DNS.  This is seen by some as breaking the spirit
   of RFC-1123, and by others as a useful get-out-of-jail card.
   Regardless of whether this is a good idea or a bad one, the fact
   remains that it is practiced by some sites.  Implementors are
   therefore encouraged to fall back to backup MX routing when a
   connection succeeds but is dropped before any data is received.

   This topic has been debated a number of times on  the  Internet  with
   both  sides  sticking to their views. There is no sense in continuing
   to try and standardise this point. What  a  site  will  do  with  any
   internet  connection  from any host eventually comes down to what the
   administrator at that site decides. If they don't want to talk  to  a
   given  set  of  hosts,  that  may  be their loss. With the increasing
   emphasis on security though, the fact that a site advertises an MX or
   A record in the DNS does not imply it will talk to all callers.

5.7. EHLO commands

   There are one or two servers that respond badly to EHLO commands.
   That is, they either set themselves into inconsistent states or else
   drop the connection at once.  The RFC is fairly clear that unknown
   commands should be rejected but otherwise ignored.

   A resilient client MAY detect that the EHLO caused the connection to
   drop and immediately retry the connection using the HELO verb
   instead.  Alternatively, it can be treated as a bad connection and
   subsequent MX records tried if available.  However, servers SHOULD
   NOT drop the connection in response to an unknown verb.
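
   The fallback logic on the sending side can be sketched as below.
   The callable send is a hypothetical stand-in for the transport: it
   is assumed to transmit one command line and return the reply
   string, or to raise ConnectionError if the peer drops the
   connection.  The returned labels are likewise invented for this
   example.

```python
def greet(send, our_name):
    """Try EHLO first and fall back to HELO where that is possible."""
    try:
        reply = send("EHLO " + our_name)
    except ConnectionError:
        # The peer dropped the connection on EHLO.  The caller should
        # reconnect and use HELO, or move on to the next MX record.
        return "reconnect-with-helo"
    if reply.startswith("2"):
        return "extended"
    if reply.startswith("5"):
        # EHLO rejected as an unknown verb: fall back to plain HELO.
        if send("HELO " + our_name).startswith("2"):
            return "basic"
    return "failed"
```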

6. RFC-822 Issues

6.1. Illegal format RFC-822 messages

   Some implementations of RFC-821 check the message for adherence to
   the RFC-822 minimum requirements as the message is received.  These
   are that the header contains a From field, a Date field and a
   recipient field of some type.  However, some user agents use RFC-821
   as a submission protocol and assume that messages will be made legal
   RFC-822 as part of the submission process (as some MTAs already do).
   Implementations MAY therefore accept strictly illegal RFC-822
   messages as data and make them legal by the addition of new headers,
   or MAY reject the message as illegal data.

   Some User Agents, particularly those on PCs, find it difficult to
   determine an accurate time to provide in a Date field, and therefore
   leave it out.  It is harmless enough to insert such a field when
   acting as a submission channel, but inserting a Date midway through
   a multi-hop delivery path is misleading and should be discouraged.
   However, in practice it is difficult to distinguish the two modes in
   which RFC-821 is used, so usually a blanket decision concerning all
   transfers has to be made.  What is really required is a submission
   protocol tailored for this sort of behaviour that can take a partial
   RFC-822 message and add the appropriate envelope bits.
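
   The minimum-requirements check described in this section could be
   sketched with Python's standard email package as follows.  The
   function name is hypothetical, and treating To, Cc and Bcc as the
   recipient fields is an assumption of this example.

```python
from email import message_from_string

RECIPIENT_FIELDS = ("To", "Cc", "Bcc")


def missing_required_fields(text):
    """Return the names of the minimum RFC-822 fields that are absent."""
    msg = message_from_string(text)
    missing = [name for name in ("From", "Date") if msg[name] is None]
    if not any(msg[name] is not None for name in RECIPIENT_FIELDS):
        missing.append("recipient")
    return missing
```

   An implementation acting as a submission channel might add the
   missing fields; one acting as a relay might prefer to reject.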

6.2. Received Lines

   The syntax of the Received: lines in RFC-822 messages is reasonably
   straightforward.  It requires as a minimum a date stamp following a
   semi-colon.  Unfortunately, some implementations cannot seem to
   generate this.  This can cause problems when gatewaying to other
   systems that also have trace fields, and is a good way to cause
   general confusion when tracking messages.

   When gatewaying or examining these elements, the invalid elements
   should either be discarded or else have the current time inserted to
   make them legal.  The illegal Received: lines can be changed to
   Orig-Received: to ensure that the relayed message is legal.
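
   The renaming approach can be illustrated with the sketch below,
   which checks an already unfolded Received: field body for a
   parseable date stamp after the last semi-colon.  The function name
   is hypothetical, and using the last semi-colon is a simplification
   that ignores semi-colons inside comments.

```python
from email.utils import parsedate_tz


def fix_received(value):
    """Return (field_name, value) for one unfolded Received: body."""
    _, sep, stamp = value.rpartition(";")
    if sep and parsedate_tz(stamp.strip()) is not None:
        return ("Received", value)
    # No usable date stamp: keep the trace text under another name.
    return ("Orig-Received", value)
```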

6.3. Date fields.

   Date fields are usually fairly standard, although there are
   implementations that strike out with new and novel formats.  However,
   when it comes to time zones there is little limit to the imagination
   of implementors.  Normally time zones should be numeric, as these
   are unambiguous.  It should be down to the user agent to display the
   Date in a ``pretty'' format.

   Just say NO to pretty, arbitrary timezones! All UAs  should  generate
   numeric offsets for timezones.
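
   As an example of generating a numeric offset, Python's standard
   email.utils.format_datetime produces a Date value with a numeric
   zone when given a timezone-aware datetime.  The wrapper name
   date_field is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def date_field(when):
    """Format an aware datetime as a Date: value with a numeric zone."""
    if when.tzinfo is None:
        raise ValueError("an aware datetime is needed to state the offset")
    return format_datetime(when)
```

   For 02:24 UTC on 1 January 1995 this yields
   "Sun, 01 Jan 1995 02:24:00 +0000".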

6.4. Resent- fields

   RFC-822 allows the pseudo-forwarding  of  messages  by  amending  the
   header of a message to contain new recipients. This is done by adding
   headers such as

      Resent-To: abc@domain.name
      Resent-Date: Sun, 1 Jan 1995 02:24 +0000
      Resent-From: xyz@foo.bar

   It is not clear in RFC-822 whether a complete set of headers is
   required when resending a message.  The standard would seem to imply
   that it is, but no grammar is present which mandates it.  Therefore
   implementations vary on how to treat this type of message.

   Strict implementations will, on detection of a Resent- field,
   conclude that this is a resent message, and that it should therefore
   be using the Resent- versions of the fields as opposed to the
   standard forms.  In this case a message without a Resent-From, a
   Resent-Date and a Resent- recipient field is illegal.  It is assumed
   that the message has been resent but with only a partially correct
   header.

   Other implementations take the view that a Resent- field is a more
   highly weighted form of the original field.  That is, a Resent-Date
   should be used in preference to a Date field, but as long as Date,
   From and recipient fields are present, with or without the Resent-
   prefix, the message is legal.

   The first view treats the resent- as a new overriding SET of headers,
   the  second  as individual replacements for fields. Either case could
   be argued, as the original text is unclear.

   For pragmatic reasons, and because it seems closer to the  intent  of
   RFC-822  in  this  case, the Resent- fields should be taken as a set.
   However implementations  SHOULD  allow  the  individual  fields.   In
   practice  this  sort of forwarding is not very common, but does arise
   from time to time.
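
   The "set" interpretation can be illustrated with the following
   sketch, which reports whether a header carries a complete set of
   Resent- fields.  The function name and the three result labels are
   hypothetical, as is the choice of Resent-To, Resent-Cc and
   Resent-Bcc as the recipient fields.

```python
from email import message_from_string

RESENT_RECIPIENTS = ("Resent-To", "Resent-Cc", "Resent-Bcc")


def resent_status(text):
    """Classify a message as 'not-resent', 'complete-set' or 'partial'."""
    msg = message_from_string(text)
    if not any(name.lower().startswith("resent-") for name in msg.keys()):
        return "not-resent"
    has_recipient = any(msg[name] is not None for name in RESENT_RECIPIENTS)
    if msg["Resent-From"] and msg["Resent-Date"] and has_recipient:
        return "complete-set"
    # Illegal under the strict view.  The lenient view accepts it as
    # long as the ordinary From, Date and recipient fields are present.
    return "partial"
```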

7. MIME issues.

   MIME since its inception has allowed implementations of MTAs and  UAs
   to  further  the  cause  of havoc and generally increase entropy. The
   number of ways that it is possible to get this specification wrong is
   truly astounding!  In general an MTA can treat badly formatted MIME
   as a text/plain format and punt the whole problem to the UA.  The  UA
   will then take one of a number of views:

   a)   It will crash and burn.

   b)   It will complain the message is illegal and refuse to show it.

   c)   It won't care and show you the message, warts and all.

   d)   It will ignore the message, and you will  never  even  know  you
        have received the message.

   The best approach is to be able to flag an error and then  revert  to
   action c) above. This may upset some naive mail users (who seem to be
   predominantly upper management and therefore dangerous to upset!).

7.1. Badly formatted Content-Type: fields

   Implementations have been known to produce lines of the form

      MIME-version: 1.0
      Content-Type: text

   That is, a MIME type, without the mandatory subtype. This is  illegal
   as   a   MIME  header  and  means  the  content  may  be  subject  to
   misinterpretation.

   In these cases the most pragmatic approach is to treat the message
   as text/plain, regardless of what the Content-Type might indicate.
   However, outright rejection of the message is also an option.  (The
   author feels a system that rejects every other such message may have
   merits in forcing systems to be upgraded.)
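
   The pragmatic fallback can be sketched as follows.  The function
   name is hypothetical, and the check is deliberately minimal: it
   only confirms that a non-empty type and subtype are separated by a
   single slash.

```python
def effective_type(content_type_value):
    """Return 'type/subtype', or text/plain when the value is malformed."""
    main = content_type_value.split(";", 1)[0].strip().lower()
    parts = main.split("/")
    if len(parts) != 2 or not parts[0] or not parts[1]:
        return "text/plain"
    return main
```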

7.2. Multiple Content-Type fields

   Messages may contain multiple Content-Type fields, sometimes
   containing contradictory information.  When this happens the
   contents may again be misrepresented or misprocessed.  For
   instance:

      MIME-version: 1.0
      Content-Type: multipart/mixed; boundary="---"
      MIME-version: 1.0
      Content-Type: text/plain


   Such messages should be handled as for the badly formatted
   Content-Type field above.  If two Content-Type fields are present
   and contain the same information, that case MAY be treated as just
   one Content-Type field.
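
   Detecting the duplicate case might look like the sketch below.  The
   function name is hypothetical, and "the same information" is
   approximated here by comparing the field values after collapsing
   white space, which is an assumption of this example.

```python
from email import message_from_string


def single_content_type(text):
    """Return the agreed Content-Type value, or None if absent or clashing."""
    values = message_from_string(text).get_all("Content-Type", [])
    distinct = {" ".join(value.split()) for value in values}
    if len(distinct) == 1:
        return values[0]
    return None  # caller may fall back to text/plain or reject
```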

7.3. Badly structured multipart messages

   Messages that contain fields such as

      Content-Type: multipart/mixed

   have some great potential for causing indigestion  in  mail  systems.
   The  missing boundary string means that although the message is split
   into multiple parts, there is no way a process  can  reconstruct  the
   message in general.

   It is charitable to believe that these types of message start out
   with good intentions, but lose their boundary markers somewhere in
   flight.  Whilst an intelligent human can scan the body part and make
   an educated guess at what the separator is, this is not generally
   possible for a program.
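
   A program can at least detect the missing boundary before trying to
   split the message.  The sketch below uses Python's standard email
   package; the function name is hypothetical.

```python
from email import message_from_string


def boundary_ok(text):
    """False only for a multipart message with no usable boundary."""
    msg = message_from_string(text)
    if msg.get_content_maintype() != "multipart":
        return True  # not multipart, so no boundary is needed
    return bool(msg.get_boundary())
```

   When this returns False, the options are the same as for any other
   malformed Content-Type: treat the body as text/plain or reject it.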

7.4. Wrapped lines

   Another interesting little problem arises where a UA or MTA has
   helpfully wrapped the text of the field to improve readability.  Some
   interesting examples are presented here.

      Content-Type: multipart/mixed; boundary="message
           -separator"
      Content-Type: multipart/mixed; boundary="abcdefghijklmno:
      boundary:fixed01"

   The first case is debatably correct input, although few MTAs or UAs
   will be able to reconstruct the correct separator.  The second case
   is illegal, ambiguous and awkward to treat well.

   Why do people do this?  The road to hell is paved with good
   intentions.  In both cases little should be done to try and
   reconstruct the message without human help.
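
   To see why the first case is hard to recover, consider what RFC-822
   unfolding does to it.  The sketch below removes each line break
   that is followed by white space and leaves the white space itself
   in place; the function name is hypothetical.

```python
import re


def unfold(raw_field):
    """Unfold a header field: drop each line break that precedes
    linear white space, keeping the white space itself."""
    return re.sub(r"\r?\n(?=[ \t])", "", raw_field)
```

   Applied to the first example, the quoted boundary comes out with
   the indentation embedded in it, so it is unlikely to match the
   separator lines actually present in the body.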

7.5. MIME prologue and epilogue text

   A number of systems and hand-constructed messages put text into the
   prologue (called the preamble in RFC-1521) and epilogue of MIME
   multipart messages.  Whilst this is a neat trick for allowing
   non-MIME UAs to inform the user why the message appears as garbage,
   the prologue and epilogue do not really exist as part of a message.
   Therefore when gatewaying or simply processing such messages, these
   components may disappear.  Alternatively they may appear as new body
   parts after transformation.  So whilst you can do it, don't be
   surprised if the text fails to appear at the other end.

8. Acknowledgements

   This document represents a collection of the experiences and hard-won
   battle  scars  from  a large community of people. All implementors of
   SMTP mail systems will have had some influence on this document.

   In particular there are a number of points taken from the  work  done
   in  the  smtp extensions working group. This document is a summary of
   some of the discussions, and other experiences. Some of this text  is
   taken from an earlier draft of the SMTP working group document.

9. Security Considerations

   Security considerations are not discussed in this memo.

10. Editor's Address

   Julian Onions <j.onions@nexor.co.uk>
   Nexor Ltd.
   PO Box 132,
   Nottingham, England.
