
Re: A proposal for a scientific approach to this question [was Re: I'm struggling with 2219 language again]

2013-01-09 02:46:34
John Day wrote:

For any formal proofing system worth its dime, this can be 100% ruled out,
since the proofing system will emit a 100% bugfree implementation of the
spec in the programming language of your choice as a result/byproduct of the
formal proofing process.

C'mon. You don't really believe that do you?

Which one in particular?

The KIV tool, which a student showed and explained to me in 1996 while
doing his master's thesis with it on a module of our software, can output
source code at the end of its processing.  That code is a guaranteed
bug-free implementation of the spec.  (Which MUST NOT be confused
with the given spec providing the desired behaviour and being free
of unintended (and unthought-of) behaviour.)

(While the tool's name appears to originate from the University of
 Karlsruhe (->Karlsruhe Interactive Verifier), and that student back
 then was from Karlsruhe, the current homepage of the tool appears
 to be at the University of Augsburg:

  http://www.informatik.uni-augsburg.de/lehrstuehle/swt/se/kiv/
)


Ever see a compiler with bugs?

Compiler bugs are comparatively rare; many of them are problems
of the optimizer (and therefore avoidable) and many happen only in
obscure situations that can be avoided by a defensive programming
style.  Code generators often produce "defensive style code".
Using a robust and mature programming language, such as C89,
may also help to avoid the compiler problems of fancy or complex
languages or of fancy features.

Look at the long list of defects of modern CPUs.  Apps programmers
that use a compiler rarely have to know (let alone take into account)
those CPU defects because most compiler code generators use only
a subset of CPU features anyway.



Who verified the verifier?  How do you know the verifier is bug free?

Now you must be kidding.

Are you suggesting that anyone using the programming language Prolog
would be gambling and could never expect deterministic results?


With a decent amount of expertise and the right mindset,
even mere mortal humans can produce code with a very low error rate.
IIRC, Donald E. Knuth didn't need a formal specification and a
formal proofing tool to achieve a low error rate for TeX and Metafont.

Having started assembler programming at the age of 12 myself, I could
easily understand why DEK used Assembler for his algorithm code examples
in TAOCP when I first encountered these books at age 21.



As I indicated before, I have been working on this problem since we 
discovered Telnet wasn't as good as we thought it was.  For data 
transfer protocols, it is relatively straightforward and can be 
considered solved for anyone who wants to bother.  The problem is 
most don't.

There is a perverse conditioning in our education system and in
business life that rewards mediocrity (cheating, improvising, being
superficial), which is IMO one of the underlying causes of
  http://en.wikipedia.org/wiki/Dunning_Kruger
and impairs many implementors' motivation to watch out and compensate for
defects in the spec and to recognize and respect the design limits of a spec,
i.e. to avoid underspecified protocol features, or to request official
clarification and adjust the code accordingly before shipping
implementations of underspecified protocol features.



The real problem is the so-called application protocols, where dealing
with the different semantics of objects in different systems makes the
problem very hard and very subtle.  The representation of the 
semantics is always of what you think its seminal properties are. 
This is not always obvious.


I believe that the real difficulty is designing and discussing
at the abstract level, then performing formally correct transitions
from the abstract spec level to the concrete specification proposal level
and comparing implementation options at the spec level ... and getting
everyone who wants to participate to understand the abstraction
and how to perform transitions to and from concrete spec proposals.


I'm experiencing that difficulty myself every once in a while.
In 1995/1996 I had adopted a notion/bias that traditional GSS-API
channel bindings (as defined in GSS-APIv1 rfc1508/1509 and GSS-APIv2
rfc2743/2744) were useless, and it took me a few weeks of discussion
to really understand what Nico was trying to accomplish with
cryptographic channel bindings such as tls-unique, their prerequisites,
their real properties and their benefits.



But maybe you're not really thinking of a defect in the implementation
(commonly called "bug") but rather a "gap" in the specification that
leads to unintended or undesired behaviour of a 100% bug-free
implementation of your spec.

One person's gap is another person's bug.  What may be obvious to one 
as something that must occur may not be so to the other.

There is a significant difference between a bug in an implementation
and an unintended consequence due to a "gap" in the specification.

The SSLv3 protocol and its IETF successor TLS contain a somewhat
underspecified protocol option called renegotiation, and some apps folks
had the not-very-bright idea to abuse that protocol option, making
flawed assumptions about protocol properties that did not
actually exist.

Had those apps folks, rather than shortcutting on flawed assumptions,
simply collected the descriptions of the renegotiation properties that
are sprinkled throughout the TLS spec and put them into a single list,
and then thought about it for an hour, then the problem (the absence of
the property that they wanted to rely on) should have become obvious to them.

It became obvious to me as soon as I had put together that list and
thought about the consequences during the discussion of TLS channel bindings.



The human mind has an amazing ability to convince us that how we see 
the world is the way others do.


The Dunning-Kruger effect, "routine", "habits" are all variations
of training a neural network and improving its pattern recognition
over time.  The strength of neural networks, recognizing even
variations of known patterns, is also their weakness: a bias/preference
to recognize stuff as something that is already known.
Fixing training errors and trying to add similar but distinct patterns
very late usually results in a significant (re-)training effort --
and this is where the constant conditioning for cheating, improvising
and superficialness frequently kicks in.

"If all you have is a hammer, everything looks like a nail."

Technically, humans are capable of overcoming the Dunning-Kruger effect
and becoming aware of the limits/boundaries of their knowledge, but
many enjoy a happy life without ever trying to.



Having been on the Net when there were 15-20 very different machine
architectures, I can assure you that implementation strategies can
differ wildly.

I'm well aware that the amount of bad code out there outnumbers
the amount of good code.  While I don't think that differing implementation
approaches imply causality for interop problems, there certainly is
the problem of ad-hoc/poor/dangerous/error-prone implementation design
that causes plenty of bugs, interop problems and unexpected
behaviours, none of which result from a conscious decision,
getting shipped into the installed base.

Good specifications are sufficiently clear and precise to enable
smooth interop, yet sufficiently abstract to provide a high level
of implementation freedom as well as future extensibility.

What is much more of a problem is the common failure to recognize
and respect the limits of a design/specification, and the failure to
compensate for spec defects that are relatively easy to recognize.
Relatively easy to recognize for a careful implementer, that is,
not necessarily for a mere reviewer.



Another problem is maintenance. Protocols change. Having to maintain a
formal specification is commonly at least an order of magnitude more
effort than maintaining a prose description.

Definitely.  Discussing protocols at the level of a formal specification
language would be quite challenging (typically impossible in an open
forum; I believe it would even be difficult in a small and well-trained
design team).

Which means there will always be a danger of "lost in translation."

The danger is real.  If there is a lack of decent leadership,
the resulting spec (defects) may sometimes pose a significant
challenge to implementations & interop.

For consumers of the spec outside of the SDO, the option to
ask for a clarification/fix might be non-obvious -- and regularly
is non-trivial for someone who has not previously participated
in the particular SDO (or any SDO at all) -- which is actually
quite common.  There are easily two orders of magnitude more implementors
of specs distributed by the IETF than there are individuals who have
participated in the IETF over the past 30 years.
 



 And, as has been mentioned before, I'd like to emphasize that the IETF
 experience and principle is that, if you want interoperation,
 compliance testing is useless.

Ouch.  I believe this message is misleading or wrong.

Compliance testing is VERY important, rather than useless.

Compliance testing would actually be perfectly sufficient iff the spec
was formally proven to be free of conflicts and ambiguities among
specified protocol elements -- plus a significant effort spent on
ensuring there were no "gaps" in the specification.

Do you realize the cost of this amount of testing? (Not saying it 
isn't a good idea, just the probability of convincing a development 
manager to do it is pretty slim.)  ;-)

I fail to follow.  Compliance testing does not necessarily require
code or a test tool.  Often, compliance will be as simple as actually
_reading_and_following_ the spec on how to process some PDU element.
And compliance testing may be as simple as a code review of less than
an hour, cross-checking a handful of MUSTs and MUST NOTs.

Or reading what to do with a protocol option/feature one's own
implementation does not recognize, and then _doing_what_the_spec_says_
rather than what a number of implementations regularly do:
abort with a fatal error.
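
As a purely illustrative sketch of my own (not from any real codebase,
with hypothetical names), "doing what the spec says" for an unrecognized
TLS extension simply means skipping over it instead of sending a fatal
alert:

   #include <stddef.h>

   /* Sketch: walk the extensions block of a ClientHello and skip
    * extension types we do not recognize; only a malformed length
    * is treated as an error. */
   static int parse_extensions(const unsigned char *p, size_t len)
   {
       while (len >= 4) {
           unsigned int ext_type = ((unsigned int)p[0] << 8) | p[1];
           size_t       ext_len  = ((size_t)p[2] << 8) | p[3];
           p += 4; len -= 4;
           if (ext_len > len)
               return -1;          /* truncated/malformed PDU: fatal */
           if (ext_type == 0) {
               /* server_name: dispatch to the SNI handler (omitted) */
           }
           /* any other ext_type: silently ignored, as the spec requires */
           p += ext_len; len -= ext_len;
       }
       return (len == 0) ? 0 : -1;
   }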

Example: the optional TLS protocol extension "Server Name Indication",
first specified here:  http://tools.ietf.org/html/rfc3546#section-3.1


   "HostName" contains the fully qualified DNS hostname of the server,
   as understood by the client. 

Doing what the spec says with respect to this requirement:

   If the server only needs to match the HostName against names containing
   exclusively ASCII characters, it MUST compare ASCII names case-
   insensitively.

If that is not sufficiently clear for an average implementor to realize
that a case-sensitive comparison (strcmp, strncmp) will be non-compliant
with the spec and that a case-insensitive comparison (strcasecmp, strncasecmp,
_stricmp, _strnicmp) will have to be used instead, then we might as well
give up and close down the IETF SDO.
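
In concrete terms (a minimal sketch of mine, with a hypothetical function
name; strcasecmp is the POSIX call, _stricmp its Windows counterpart), the
compliant comparison looks like this:

   #include <strings.h>  /* strcasecmp() on POSIX; use _stricmp() on Windows */

   /* Sketch: match the HostName received in the SNI extension against a
    * configured ASCII server name.  RFC 3546, section 3.1 requires the
    * comparison to be case-insensitive for ASCII names. */
   static int sni_hostname_matches(const char *sni_name, const char *cfg_name)
   {
       return strcasecmp(sni_name, cfg_name) == 0;   /* compliant     */
    /* return strcmp(sni_name, cfg_name) == 0; */    /* non-compliant */
   }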


According to http://en.wikipedia.org/wiki/Server_Name_Indication#Support
support for this protocol option was originally designed in 2004,
and added into the main codeline of OpenSSL 0.9.8 in 2007.

However, OpenSSL-1.0.1c today still does a case-sensitive comparison...



 The way to interoperation is interoperation testing between
 implementations and, to a first approximation, the more and the
 earlier you do interoperation testing, the better.

The #1 reason for interop problems, and the road-block to the evolution of
protocols, is the wide-spread lack of compliance testing (on the flawed
assumption that it is useless) and the focus on black-box interop testing
of the intersection of two subsets of protocol features.

The problem with interop testing is that it really doesn't provide
much of a clue, and the results can _not_ be extrapolated to features
and areas that were not interop tested (typically the other 90% of the
specification, many optional features and most of the extensibility).


The near-complete breakage of TLS protocol version negotiation is a
result of narrow-minded interop testing in combination with a complete
lack of compliance testing.

Because the breakage of TLS protocol version negotiation within the
installed base was known in 2009, an explicit reminder was added
to rfc5746 that processing TLS protocol version numbers correctly
in the TLS handshake is vitally important and has always been a
requirement of the TLS protocol spec.

   TLS servers implementing this specification MUST ignore any unknown
   extensions offered by the client and they MUST accept version numbers
   higher than their highest version number and negotiate the highest
   common version.  These two requirements reiterate preexisting
   requirements in RFC 5246 and are merely stated here in the interest
   of forward compatibility.
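
As a minimal sketch of my own (hypothetical names and assumed version
constants for a server implementing TLS 1.0 through 1.2), the quoted rule
boils down to clamping a higher client_version rather than rejecting it:

   /* Sketch: server-side version negotiation.  A client_version higher
    * than our highest supported version is NOT an error; we answer with
    * the highest version we support (the highest common version). */
   #define MY_HIGHEST_VERSION 0x0303   /* assumed: TLS 1.2 */
   #define MY_LOWEST_VERSION  0x0301   /* assumed: TLS 1.0 */

   static int negotiate_version(unsigned int client_version,
                                unsigned int *negotiated)
   {
       if (client_version < MY_LOWEST_VERSION)
           return -1;                           /* no common version */
       *negotiated = (client_version > MY_HIGHEST_VERSION)
                         ? MY_HIGHEST_VERSION   /* client is newer: clamp */
                         : client_version;
       return 0;
   }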

The necessity for the TLS renegotiation info extension had become
painfully obvious after it had been realized that the two TLS connection
states involved in a TLS renegotiation had been completely independent,
enabling an MitM attacker to inject/prefix data of his choice at the
beginning of a TLS session, and the document is very explicit about which
precise state information of the previous/enclosing session is carried
over into the successor TLS session and how.

Curiously, one vendor happened to actually break the TLS protocol
version negotiation much worse than it already was, by checking
the client_version in the premaster secret of the renegotiation
ClientKeyExchange message not against the client_version from the
ClientHello of the very same TLS handshake, but instead against the
client_version of the initial handshake, whose painfully independent
state had created the necessity for the rfc5746 fix in the first
place...
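
For illustration only (a sketch with hypothetical structure and function
names, not the vendor's code), the compliant check ties the premaster
secret version to the ClientHello of the current handshake:

   /* Sketch: verify the client_version embedded in an RSA premaster
    * secret.  Per RFC 5246, section 7.4.7.1, it must equal the version
    * sent in the ClientHello of the *same* handshake, even when that
    * handshake is a renegotiation. */
   struct handshake_state {
       unsigned int client_hello_version; /* from THIS handshake's ClientHello */
   };

   static int check_pms_version(const struct handshake_state *hs,
                                const unsigned char pms[48])
   {
       unsigned int pms_version = ((unsigned int)pms[0] << 8) | pms[1];
       /* On mismatch, RFC 5246 recommends continuing with a randomly
        * generated premaster secret rather than aborting the handshake. */
       return (pms_version == hs->client_hello_version) ? 0 : -1;
   }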

Simply following what the spec says during implementation would have
avoided the problem.  A code review of the TLS protocol version
negotiation, which the RFC explicitly reminds implementors about,
would have caught that breakage.  Interop testing the protocol version
negotiation in a meaningful fashion, even against earlier TLS
implementations from the very same vendor, would have clearly shown
the breakage prior to shipment.  As it turns out, the vendor failed
on all three counts, and this _new_ breakage got shipped...



If done right, pure compliance testing can go a long way to providing
interop.  The only area where real interop testing is necessary is
those protocol areas where the spec contains conflicts/inconsistencies
that implementors didn't notice or ignored, or where there are
widespread inconsistencies of implementations with the specification,
often created by careless premature shipping of a defective implementation
and the factual creation of an installed base that is too large to ignore.

That last sentence is a doozy.  You are saying that interop testing 
will be required with bad implementations with sufficiently large 
installed base that you have to live with it?


It seems to be a recurring event that one or more vendors create
an installed base that is clearly non-compliant with the original
spec, but too large to ignore, and as a result, everyone else adapts
to that behaviour for the sake of interoperability with that
installed base.

e.g.  http://tools.ietf.org/html/rfc4178#appendix-C
      http://tools.ietf.org/html/rfc5929#section-8

There may be others, but I'm actively tracking only very few
topics in the IETF, so these two are the ones I know off the top
of my head.


-Martin
