Be conservative in what you do, be liberal in what you accept from others



I'm going to take this opportunity to rant about design, specification
and implementation philosophies.  It's a long rant, but I think it is
important. 


In <200403232332(_dot_)49748(_dot_)tim(_at_)schmerg(_dot_)com> Tim Meadowcroft 
<tim(_at_)schmerg(_dot_)com> writes:

The quote and philosophy goes back to TCP (RFC 793 - "be conservative in what 
you do, be liberal in what you accept from others") [...]

[list of interesting and useful links snipped]


Another good RFC to read on this subject is RFC1122 "Requirements for
Internet Hosts -- Communication Layers", in particular section 1.2.

See: http://www.ietf.org/rfc/rfc1122.txt?number=1122

The line that makes sense (to me) is that _specifications_ should be
as strict as possible (but without limiting future extensions), but
_implementations_ should be a little more forgiving.


This is my understanding also, but it is important to note what "more
forgiving" means: you should expect out-of-spec input, and your system
should not croak or give invalid results when it arrives.  Other
systems should be allowed to send you invalid data (liberal accept),
and you should not break or do bad things in response (conservative
do).


At this point, let me quote section 1.2.2 of RFC1122.  (The other
sections in 1.2 are worth reading too, so do still visit the above
link.)

      1.2.2  Robustness Principle

         At every layer of the protocols, there is a general rule whose
         application can lead to enormous benefits in robustness and
         interoperability [IP:1]:

                "Be liberal in what you accept, and
                 conservative in what you send"

         Software should be written to deal with every conceivable
         error, no matter how unlikely; sooner or later a packet will
         come in with that particular combination of errors and
         attributes, and unless the software is prepared, chaos can
         ensue.  In general, it is best to assume that the network is
         filled with malevolent entities that will send in packets
         designed to have the worst possible effect.  This assumption
         will lead to suitable protective design, although the most
         serious problems in the Internet have been caused by
         unenvisaged mechanisms triggered by low-probability events;
         mere human malice would never have taken so devious a course!

         Adaptability to change must be designed into all levels of
         Internet host software.  As a simple example, consider a
         protocol specification that contains an enumeration of values
         for a particular header field -- e.g., a type field, a port
         number, or an error code; this enumeration must be assumed to
         be incomplete.  Thus, if a protocol specification defines four
         possible error codes, the software must not break when a fifth
         code shows up.  An undefined code might be logged (see below),
         but it must not cause a failure.

         The second part of the principle is almost as important:
         software on other hosts may contain deficiencies that make it
         unwise to exploit legal but obscure protocol features.  It is
         unwise to stray far from the obvious and simple, lest untoward
         effects result elsewhere.  A corollary of this is "watch out
         for misbehaving hosts"; host software should be prepared, not
         just to survive other misbehaving hosts, but also to cooperate
         to limit the amount of disruption such hosts can cause to the
         shared communication facility.
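
As a rough illustration of the enumeration example in the quoted
section, here is a minimal Python sketch.  The four code names and the
describe() helper are made up for this example; they do not come from
any real protocol.

```python
import logging
from enum import IntEnum

class ErrorCode(IntEnum):
    # Hypothetical codes; the "spec" for this sketch defines exactly four.
    BAD_HEADER = 1
    TIMEOUT = 2
    REFUSED = 3
    TOO_LARGE = 4

def describe(code):
    """Map a received code to a name without failing on undefined values."""
    try:
        return ErrorCode(code).name
    except ValueError:
        # An undefined code is logged, but it must not cause a failure.
        logging.warning("undefined error code %r", code)
        return "UNDEFINED"
```

When a fifth code shows up, describe(5) logs a warning and returns
"UNDEFINED" instead of raising.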


One of the links that Tim Meadowcroft posted talked about the
differences between HTML, where very sloppy input is allowed and
browsers generally do something reasonably useful with it, and JVM
classes, where strict input is required and an invalid class is
rejected.

The difference between these two environments, in part, is the
ramifications of what the program does.

If you feed bad HTML to a browser, the browser could throw up its
hands and say "bad HTML", or it could try to render the page anyway.
Usually the web page won't look exactly like what the HTML author
intended, sometimes the results will be very ugly but still readable,
and every once in a while, the output will be totally unusable.

If you feed bad classes to a JVM, the JVM could throw up its hands
and say "bad class", or it could try to execute the class anyway.
Usually the program won't do exactly what the Java programmer
intended, sometimes the results will be bogus, and every once in a
while, the program will cause a security violation.

In both the HTML and Java cases, the programs are liberal in what they
accept in that they do not croak on bad input.  In both cases, the
programs try to be conservative in what they do.  In the case of a
web browser, the conservative thing is to try to continue on; in the
case of Java, the conservative thing is to immediately reject the
class.


Another important philosophy of robust software design is that the
earlier an error can be detected, the better.  A corollary is that it
is better to trigger an error consistently than to have it show up
only (seemingly) at random.  Another corollary is that if you fail,
you shouldn't fail silently.


OK, back to SPF: its design, specifications, and implementations.

Giving bogus results from SPF can cause important email to be lost.
This means that we must be extremely careful to not give bogus
results.  We should always try to be failsafe.  Fortunately, SPF has
an easy way to be failsafe: it can return "unknown".


Let's consider the idea of checking the HELO string.  Is it better to
always check the HELO string, or is it better to check the HELO string
only when we have "MAIL FROM:<>"?

I say that it is better to always check the HELO string because it
gives earlier and more consistent errors.  Most messages will not be
bounces (MAIL FROM:<>), so if the HELO string would fail, it will only
fail (seemingly) randomly.  Moreover, when the MAIL FROM is null,
there is no way to send back a bounce to the original sender about the
error.  Yes, a log message might show up somewhere, but that is
unlikely to be noticed.

Now, there is the separate issue about whether SPF should be checking
the HELO string at all.  But, if SPF does check it, SPF should always
check it.
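
To make the two policies concrete, here is a small Python sketch.  The
function name, the (kind, value) tuples, and the convention that a null
MAIL FROM is passed as an empty string are all assumptions of mine for
the example, not part of any SPF implementation.

```python
def identities_to_check(helo, mail_from, always_check_helo=True):
    """Return the (kind, value) identities an SPF check would be run on."""
    identities = []
    # A null reverse-path (MAIL FROM:<>) is represented here as "".
    if always_check_helo or mail_from == "":
        identities.append(("helo", helo))
    if mail_from:
        identities.append(("mailfrom", mail_from))
    return identities
```

With always_check_helo=False, a bad HELO string is only ever examined
on bounces, which is exactly the (seemingly) random failure described
above.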


Now let's consider the case of the following SPF record:
        v=spf1 a mx mx;smtp.%{d} -all

This SPF record contains a syntax error: someone put a semicolon
where a colon is required.  Either that, or it is an unknown mechanism
with an invalid character.  Now, if an SPF implementation just
evaluates the SPF record from left to right, stopping when it has an
answer, much of the time it will stop after either the "a" or the
"mx", but sometimes it will find the syntax error.  This SPF record
will never cause a result of "fail", even though it is clearly the
intent of the domain owner that most IP addresses should be denied.

I say it is better to always detect the syntax error, and give
consistent results of "unknown", even if the "a" or "mx" mechanisms
would succeed.  
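
A minimal Python sketch of "check the whole record first" might look
like the following.  The grammar here is deliberately simplified and
the regular expressions, function names, and the evaluate callback are
my own assumptions; a real implementation would follow the SPF
specification's grammar.

```python
import re

# Simplified, hypothetical grammar: a mechanism is an optional qualifier,
# a known name, and an optional ":" or "/" argument; a modifier is name=value.
MECHANISM = re.compile(r"^[+\-~?]?(all|a|mx|ptr|ip4|ip6|include|exists)([:/]\S+)?$")
MODIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_.\-]*=\S*$")

def syntax_errors(record):
    """Return the terms that are neither a mechanism nor a modifier."""
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return [record]
    return [t for t in terms[1:]
            if not MECHANISM.match(t) and not MODIFIER.match(t)]

def check(record, evaluate):
    """Validate every term up front; only then evaluate left to right."""
    if syntax_errors(record):
        # Same answer for every client IP, even ones "a" or "mx" would match.
        return "unknown"
    return evaluate(record)
```

For the record above, syntax_errors() reports the single term
"mx;smtp.%{d}", so check() returns "unknown" no matter which IP address
is being tested.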



Now let's consider the case of this SPF record:
        v=spf1 a mx mx:smtp.%{d} -all explanation=spf.%{d}

Here, the domain owner used the wrong modifier, but otherwise it is
syntactically correct.  Since modifiers do not change the result of
the SPF check, an invalid or unknown modifier is much safer to
overlook.  The result might not be the explanation string that the
author wanted, but the results will likely still be useful.  This is
more like the web browser case than the JVM case.
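
Here is a rough Python sketch of that more lenient treatment.  The set
of known modifier names and the split_modifiers() helper are
assumptions for illustration only; check the current SPF specification
for the real list.

```python
KNOWN_MODIFIERS = {"redirect", "exp"}  # assumed names for this sketch

def split_modifiers(record):
    """Separate name=value modifiers from the other terms.

    Unknown modifiers are reported as warnings and otherwise ignored,
    so they never change the result of the check.
    """
    mechanisms, modifiers, warnings = [], {}, []
    for term in record.split()[1:]:      # skip the leading "v=spf1"
        name, sep, value = term.partition("=")
        if sep and name.isalnum():
            if name in KNOWN_MODIFIERS:
                modifiers[name] = value
            else:
                warnings.append("ignoring unknown modifier: " + name)
        else:
            mechanisms.append(term)
    return mechanisms, modifiers, warnings
```

On the record above, the four mechanisms come back untouched and
"explanation" only produces a warning.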


Finally, let's consider this SPF record:
        v=spf1 a mx:com -all

Now, maybe someday the com TLD will have an SPF record, and there
certainly isn't anything syntactically wrong with this record, but
most likely this is not what the author intended.  Both Wechsler's PHP
implementation of SPF on infinitepenguins.net and libspf-alt (v0.3)
will detect this and give a warning, but the SPF check will still
return either pass or fail.
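
I don't know exactly how those two implementations detect this, but a
check along these lines can be sketched in Python.  The function name,
the list of mechanisms that take a domain, and the "single label"
heuristic are my own assumptions.

```python
DOMAIN_MECHANISMS = {"a", "mx", "include", "exists", "ptr"}  # assumed list

def suspicious_domains(record):
    """Warn about mechanisms whose domain argument is a single label."""
    warnings = []
    for term in record.split()[1:]:      # skip the leading "v=spf1"
        name, sep, arg = term.partition(":")
        if not sep or name.lstrip("+-~?") not in DOMAIN_MECHANISMS:
            continue
        domain = arg.split("/")[0].rstrip(".")   # drop any CIDR length
        # Macros such as %{d} expand later, so leave those alone.
        if domain and "." not in domain and "%" not in domain:
            warnings.append(term + ": domain looks like a bare TLD")
    return warnings
```

The record is still evaluated normally; the warning is only a hint to
the domain owner that "mx:com" is probably a mistake.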



I realize that there is a lot of room for reasonable people to
disagree about what is "the conservative thing to do".  I also realize
that it takes a lot more work to strictly check the input for any and
all errors and then decide what the right thing to do is if an error
is found.  However, with email being so important and the need for it to
be reliable, I think we should go the JVM route rather than the web
browser route.



-wayne
