Re: RFC *821: failure versus error




--On Friday, 02 September, 2005 09:15 -0700 Claus Assmann
<ietf-smtp(_at_)esmtp(_dot_)org> wrote:

What is the difference between "failure" and "error" replies
with respect to RFC *821?

821 explicitly uses E: and F: replies in
4.3.  SEQUENCING OF COMMANDS AND REPLIES

1869 has two subsections about "failure" versus "error":
4.4.  Failure response
4.5.  Error responses from extended servers

2821 uses only E: in
4.3.2 Command-Reply Sequences
...
   success, and "E" for error.  Since some servers may generate other


I am reconstructing this from a fading set of memories, but, if
those memories are correct, Jon tried to make a distinction in
821, partially left over from its predecessors.  I believe the
intent was to distinguish between:

        "error": the client did something wrong that the server
        detected.  Syntax errors in commands, commands out of
        sequence, invalid commands, and so on would be "errors".
        
        "failure": the client behaved according to the
        specifications, but the server was unable to deliver the
        mail.  "Mailbox not found" and all of the temporary
        conditions such as "mailbox full" were, in that sense,
        "failures".
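
To make that reading concrete, here is a minimal Python sketch.  It
assumes the S:/F:/E: lists for MAIL and RCPT as I believe they appear
in Section 4.3 of 821 (they should be checked against the RFC before
anyone relies on them); the names SEQUENCING and classify_reply are
purely illustrative and come from no specification.

```python
# Reply-code lists for two commands, following the S:/F:/E: layout of
# RFC 821 Section 4.3.  Assumption: transcribed for illustration only;
# verify against the RFC text.
SEQUENCING = {
    "MAIL": {"S": {250}, "F": {552, 451, 452}, "E": {500, 501, 421}},
    "RCPT": {
        "S": {250, 251},
        "F": {550, 551, 552, 553, 450, 451, 452},
        "E": {500, 501, 503, 421},
    },
}

LABELS = {"S": "success", "F": "failure", "E": "error"}


def classify_reply(command, code):
    """Return 'success', 'failure', 'error', or 'unlisted' for a reply code."""
    for key, codes in SEQUENCING[command.upper()].items():
        if code in codes:
            return LABELS[key]
    return "unlisted"
```

Under these tables, a 550 reply to RCPT ("mailbox not found") comes
out as a "failure", while a 503 (bad sequence of commands) comes out
as an "error", which matches the intent described above.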

By the time we got started on 2821, it was no longer clear that
the distinction was useful.  I vaguely remember a conversation
with Jon on the subject, but can't recall what we concluded.
That might imply that we talked about talking about it and never
got around to it, or that we didn't reach any firm conclusion.
In any event, I think we (myself and DRUMS) concluded that the
distinction should be removed in 2821 (see below).  

I can't remember whether any distinction was intended in 1869.
But I suspect it was not: the creation of RFC 1425 involved a
fairly late rewrite of some earlier text to reduce its apparent
complexity.  The terminology of the predecessor versions was
probably inconsistent without intending any specific
distinction.  What 1425->1869 did do was to introduce a new set
of "error" cases.  With 821, sending a command not listed in the
spec, or using syntax not specified there, was clearly an
"error", with no need for context (at least absent prior,
out-of-band, agreement between the client and server).  1425
introduced context and more state, in the sense that some
commands and syntax became valid iff the corresponding keyword
appeared in the EHLO response list.
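
A rough Python sketch of that added state, under stated assumptions:
the reply lines, the host name, and the function names are all
hypothetical, and the parsing is deliberately simplistic (it assumes
a well-formed multiline 250 reply).

```python
def ehlo_keywords(response_lines):
    """Extract extension keywords from the lines of a 250 EHLO reply.

    The first line carries the server greeting, so it is skipped.
    """
    keywords = set()
    for line in response_lines[1:]:
        # Each line looks like "250-KEYWORD params" or "250 KEYWORD params".
        text = line[4:].strip()
        if text:
            keywords.add(text.split()[0].upper())
    return keywords


def is_valid_in_session(keyword, offered):
    """An extended command or parameter is valid only if its keyword
    was offered in the EHLO response for this session."""
    return keyword.upper() in offered


# Hypothetical EHLO reply, for illustration only.
reply = [
    "250-mail.example.org Hello",
    "250-SIZE 10240000",
    "250-8BITMIME",
    "250 PIPELINING",
]
offered = ehlo_keywords(reply)
```

With this reply, a SIZE parameter on MAIL would be valid, while
anything tied to a keyword that was not offered would be an "error"
in the 1425 sense, even though the same syntax could be fine against
a different server.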

Unfortunately, 2821 is a patchwork of earlier text (from 821,
974, 1123, and 1869) with some new text and explanations.  Not
only did the WG (and I) lack energy for a complete rewrite, but
many of us feared that such a rewrite would introduce
unintentional and hard-to-find changes.   But that patchwork
effort, unsurprisingly, didn't result in completely consistent
terminology.   As far as I'm concerned, 2821bis is our
opportunity to smooth out whatever rough edges, especially those
of terminology, people can identify and think are worth
fixing.

So, given this question, it seems to me that this should be
rationalized.  It would be good if this list could reach a
conclusion as to whether I should try to find and remove the
remaining uses of "failure" or whether it would be worth it to
try to reestablish the original distinction.  Of course, the
latter would require agreeing on what that distinction was: the
fact that Valdis reached a different conclusion than the one
above indicates to me that agreement might not be trivial.   

Another possibility would be for me to simply insert a sentence,
presumably in the Terminology section, indicating that the two
terms are equivalent.  That would require almost no effort and
would involve no risk of inadvertently creating even more
confusion.  But it would increase the degree to which 2821
requires careful and complete reading, and there is probably too
much of that already.

but still refers to "failure" and "error" replies in the text.
For example, it even explicitly distinguishes between those
two:

4.1.1.1  Extended HELLO (EHLO) or HELLO (HELO)
...
   will give a successful response, a failure response, or an
   error response.  If the SMTP server, in violation of this
   specification,

which is probably a leftover from 1869.


Most likely.  That text also appears in 1425 at the beginning of
Section 4 -- modulo some inserted text and renumbering, 1425 and
1869 are almost identical.   In 1425, the distinction is pulled
out into two subsections: 4.2 for failure responses and 4.3 for
error responses.  Section 4 of RFC 1425 is exclusively about
EHLO.  The "failure" case of Section 4.2 is consistent with my
explanation above: it occurs if the server gets an EHLO command
it recognizes as valid but, for some reason, can't return a list
of service extensions.  But the "error" cases of Section 4.3 are
not completely consistent with that explanation, since they
include both syntax errors and a series of things, like pending
connection shutdowns, that the client could not predict.  

In 1425 and 1869, the specified client responses to the failure
and error cases are identical, reinforcing my impression that
the distinction is hardly worth making.  One of my co-authors
might remember why we tried to make it; I don't.

If there is some (functional?) difference between "failure" and
"error" replies then this should be explained. Otherwise it
seems to be better to use only one type in the text for
consistency (either only "failure" or only "error") and to
avoid possible confusion.


There does not appear to be any functional difference any more,
if ever there was.  As indicated above, comments about how to
best fix this would be welcome -- something should clearly be
done.

    john