
Re: Appeal from Phillip Hallam-Baker on the publication of RFC 7049 on the Standards Track

2014-02-20 09:40:57
On Thu, Feb 20, 2014 at 2:22 AM, Eliot Lear <lear@cisco.com> wrote:

<no hat>
On 2/20/14, 2:28 AM, Mark Nottingham wrote:
On 20 Feb 2014, at 11:37 am, Phillip Hallam-Baker <hallam@gmail.com> wrote:

My main concern is the process question. I really don't care whether
CBOR is a PROPOSED STANDARD or whatever. What I do care about is if I am
told that I have to use it because that is the IETF standard for binary
encoding. And what I care most about is the risk that this approach of 'its
our ball and only we will decide who gets to play' is going to be repeated.
I have to agree with Phillip on this point, and I hope the answer is
uncontroversial -- that just by virtue of something being an IETF standard, we
don't start requiring people to use it whenever their use case is vaguely
similar.

When something is a standard, it means you need to use it in the way
specified; it doesn't mean you have to choose to use it, even in other
standards.

Yeah, we're off the rails here, and it's becoming a bad habit.  People
seem to like playing "what if" games about how bad things can get if
everyone loses their heads.  WGs and spec developers should always use
what makes sense (standard or no).  Rough consensus and running code,
thank you very much.


Oddly enough, compact formats for protocol encodings are one kind of
specification that never needs to be implemented to have value.

Back in the XKMS vs SCVP days a journalist tried to compare the two specs.
Since all they understood was the message size, that was the basis for
comparison. An SCVP message is maybe 1KB and an XKMS message might be as
much as 3KB (I am guessing here).

Given that both easily fit into an IP packet and they involve public key
cryptography, the size of the message would be completely irrelevant even
if the two protocols were equivalent (which they are not). But the story
was taken up and used as FUD.

If we had had a compact encoding for XML, we could easily have defeated the
FUD by pointing out that people who care can use the efficient encoding option
and the 'problem' goes away.


The issue can also come up at the design stage. Earlier this morning
someone proposed the following as an example of a JSON microformat:

microformat GPSLocation;
   // A GPSLocation is a pair of comma separated floating-point
   // numbers representing longitude and latitude.
   // e.g. "location": "0.0,51.5"

But this is not using JSON encoding at all; it is a string containing two
decimal fractions separated by a comma. The JSON encoding would be:

"location" : {"X":0.0,"Y":51.5}

The justification for the microformat is, of course, byte shaving. And of
course the artificial example is unrepresentative. In the typical case the
comparison would be:

"210.012345,51.52232" vs {"X" : 210.012345,"Y" : 51.52232}

The difference is 3 control bytes (two quotes and a comma) versus 9 for the
tagged version (two braces, four quotes, two colons and a comma), which isn't
actually a lot of overhead for a tagged text format.
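
A quick sanity check of those counts (whitespace stripped from the tagged
form; Python again):

    micro = '"210.012345,51.52232"'
    tagged = '{"X":210.012345,"Y":51.52232}'
    # 21 vs 29 bytes total: 3 control bytes against 9 control bytes,
    # plus the 2 tag characters X and Y.
    print(len(micro), len(tagged))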


But imagine we have an efficient binary encoding that can represent each of
those numbers in 5 bytes rather than 10. The efficient binary encoding is then
considerably more compact than the hand-coded text microformat.
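
CBOR, for example, encodes a single-precision float as one initial byte plus
four payload bytes. A hand-rolled sketch of that layout (the float32
conversion loses some precision, but the point here is size):

    import struct

    # CBOR initial byte 0xFA (major type 7, additional information 26)
    # introduces a big-endian IEEE 754 single-precision float.
    def cbor_float32(x: float) -> bytes:
        return b"\xfa" + struct.pack(">f", x)

    pair = cbor_float32(210.012345) + cbor_float32(51.52232)
    print(len(pair))  # 10 bytes for both numbers, versus 18 characters of text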

Having a compact binary notation around provides a tool that can be swung
when someone is messing about with a microformat that is only going to cause
grief. If a specification uses JSON encoding then it should use JSON encoding,
not JSON plus a bunch of poorly thought out ad hoc hacks that seemed like a
good idea at the time.

When people reach for regular expressions I reach for the sick bag.

Regular expressions are a very powerful tool that can be used to create more
complexity in fewer lines of code than any other language I know (including
APL). They are like using GOTOs. Every non-trivial program without exception
contains GOTOs once compiled, but in a structured programming language the
programmer does not code them directly or need to.

So one of the challenges in using XML or JSON or any other regular data
encoding in a standards effort is to block attempts to sneak in other
encodings by way of 'microformats' that always look like such a good idea
until they have to be coded.


Having a binary encoding for the data encoding at hand allows the
microformats to be swatted away. Doing better than a text encoding is pretty
easy; doing better than a well-designed binary encoding is actually hard.


-- 
Website: http://hallambaker.com/