Re: Blog: YANG Really Takes Off in the Industry

Please, it is not a question of whether YANG is a good or a bad data
modeling language. Whether it is applicable to a problem depends on
what that problem is.

There are two sorts of (useful) schema language for two different problems.

If your problem is modeling existing data formats then you need an
expressive schema language that allows the irregularities and
inconsistencies of the existing formats to be described. XML Schema is
horribly complex because one of the constraints was that it should be
a superset of SGML DTDs to allow automatic translation from DTDs to
Schema.

I have schemas that describe DNS and RFC822 headers and the like and
while they are fairly straightforward, they are not 'clean'. There are
six different ways to encode headers in RFC822 style and DNS RRs are
much more complicated.

So if you are trying to model IP headers or the like you are going to
want an expressive language.


If what you want to do is to define an application protocol that does
not need to be backwards compatible with a legacy system then you want
as little flexibility as possible, because the point of standards
writing is to take decisions that don't matter out of the equation.

This is why ASN.1 is so unfit for purpose. Despite the claim to be
'abstract', the data model is full of design choices that only exist
to micromanage the wire format: the distinction between sets and
lists (SET versus SEQUENCE), implicit versus explicit tagging, and so
on.
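
To make the tagging point concrete, here is a minimal Python sketch
that hand-encodes the integer 5 in DER three ways: untagged, with a
context tag [0] IMPLICIT, and with [0] EXPLICIT. The helper names are
mine, and the sketch assumes short-form lengths and integers in the
range 0 to 127 only.

```python
def der_tlv(tag: int, content: bytes) -> bytes:
    """Build a DER tag-length-value triple (short-form length only)."""
    if len(content) > 127:
        raise ValueError("sketch only supports short-form lengths")
    return bytes([tag, len(content)]) + content


def der_small_int(value: int, tag: int = 0x02) -> bytes:
    """Encode an integer in 0..127 as a one-byte INTEGER body under `tag`."""
    if not 0 <= value <= 127:
        raise ValueError("sketch only supports integers in 0..127")
    return der_tlv(tag, bytes([value]))


# Same abstract value, three different wire encodings.
plain = der_small_int(5)                    # universal INTEGER tag 0x02
implicit = der_small_int(5, tag=0x80)       # [0] IMPLICIT replaces the tag
explicit = der_tlv(0xA0, der_small_int(5))  # [0] EXPLICIT wraps the INTEGER

for name, encoded in (("plain", plain), ("implicit", implicit), ("explicit", explicit)):
    print(name, encoded.hex())
```

None of the three differs in what the application sees; the choice
only changes the bytes on the wire.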

At the end of the day, every message in an application protocol is
going to map to an API call or return somewhere in the application
implementation. We can make that mapping complicated or we can make it
simple.

If we have a protocol TimeService with a method for finding the time:

// Signature only, so declared as an interface rather than a class.
interface TimeService {
    Status GetTime (out DateTime Now);
}

We can have a schema language that allows fifty different encodings
for the DateTime object or we can pick one and stick to it. If we are
using a JSON encoding which is text, we would probably map DateTime
onto a string and use ISO 8601 or RFC 3339 to describe the encoding.

If we were dealing with legacy we might have to deal with other forms.
But the best schema language for writing new apps would not provide
that flexibility.

{"GetTimeResult": {"Now": "2014-01-02T13:10:00Z"}}
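
As a rough sketch of what "pick one and stick to it" looks like in
code, the following Python fragment writes and reads a message of the
shape shown above using a single fixed RFC 3339 form (UTC, whole
seconds, 'Z' suffix). The function names are hypothetical, and
restricting the format to exactly this profile is my assumption, not
something the protocol above specifies.

```python
import json
from datetime import datetime, timezone

# The one and only timestamp form this sketch accepts or emits.
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def encode_get_time_result(now: datetime) -> str:
    """Serialize a timezone-aware datetime as a GetTimeResult message."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    stamp = now.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    return json.dumps({"GetTimeResult": {"Now": stamp}})


def decode_get_time_result(text: str) -> datetime:
    """Parse a GetTimeResult message; any other timestamp form is rejected."""
    stamp = json.loads(text)["GetTimeResult"]["Now"]
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


message = encode_get_time_result(datetime(2014, 1, 2, 13, 10, 0, tzinfo=timezone.utc))
print(message)
```

Because there is only one permitted form, the decoder is a single
strptime call and there is nothing for implementers to negotiate.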

Now the reason I chose this particular example is that time is
actually a case where encoding does matter because embedded devices
might not know about the leap seconds which some idiots have managed
to visit on us:

// Signature only, so declared as an interface rather than a class.
interface TimeService {
    Status GetTime (out DateTime Now, out Int64? ElapsedSeconds);
}

Here we do have a schema issue since JSON integers can be of any size
while using bignum arithmetic in an API has a severe impact on
performance and convenience. And in this case we will have bigger
problems long before the Int64 time rollover occurs.
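
The size issue is easy to see in a language whose JSON parser returns
arbitrary-precision integers. The sketch below, in Python, accepts a
JSON integer only if it fits a signed 64-bit field; the function name
is hypothetical and treating the field as signed is an assumption
taken from the Int64 in the signature above.

```python
import json

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def parse_int64(text: str) -> int:
    """Parse a JSON integer and reject anything outside the Int64 range."""
    value = json.loads(text)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("expected a JSON integer")
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError("integer does not fit in 64 bits")
    return value
```

Without a schema that states the width, every receiver has to guess
whether such a check belongs in its parser.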

Now since the C# syntax for a nullable type (Int64?) doesn't make the
message signature obvious, I actually use my own private schema
language:

Message GetTimeResult
    DateTime Now
        Required
    Integer ElapsedSeconds
        Bits 64


YANG is a lot more powerful and flexible than my system. And that is
why I would not use it for the type of applications I work on. I do
not want to have the option of encoding the data in more than one way
because in a new application that is 'bikeshedding'.

If your problem is dealing with legacy systems and requires modeling
of stateful interactions then you need that flexibility.


In summary, there is a place for YANG but it is certainly not the only
schema language that is useful. There is also a place for XML Schema
but that place is not describing Web Service protocols. There is also
a place for simpler schema formats that allow a direct mapping of API
calls to messages. And there are some schema languages that should
never be used at all.