
RE: Best practice for data encoding?

2006-06-06 12:43:31

From: Jeffrey Hutzelman [mailto:jhutz@cmu.edu]

> To be pedantic, ASN.1 is what its name says it is - a notation.
> The properties you go on to describe are those of BER; other
> encodings have other properties.  For example, DER adds constraints
> such that there are no longer multiple ways to encode the same
> thing.  Besides simplifying implementations,

Hate to bust your bubble here, but DER encoding is vastly more complex than any
other encoding. It is certainly not simpler than BER.

The reason for this is that in DER each chunk of data is encoded using the
definite-length encoding, in which every data structure is preceded by a length
descriptor. In addition to being much more troublesome to decode than a simple
end-of-structure marker such as ), }, or </>, it is considerably more complex
to code because the length descriptor is itself a variable-length integer.

The upshot of this is that it is impossible to write a single-pass, LR(1)-style
encoder for DER. In order to encode a structure you have to recursively size
each substructure before the first byte of the enclosing structure can be
emitted.
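To make both points concrete, here is a minimal sketch in Python (hypothetical
code, not any particular ASN.1 toolkit): the length descriptor is itself
variable-length, and the header of an enclosing SEQUENCE cannot be emitted
until every child has been fully encoded and sized.

    # Hypothetical sketch of DER definite-length encoding; it only
    # illustrates the two points above and is not a real ASN.1 library.

    def encode_length(n: int) -> bytes:
        # The length descriptor is itself a variable-length integer:
        # short form (one octet) below 128, long form (count + octets) above.
        if n < 0x80:
            return bytes([n])
        octets = n.to_bytes((n.bit_length() + 7) // 8, "big")
        return bytes([0x80 | len(octets)]) + octets

    def tlv(tag: int, content: bytes) -> bytes:
        # The content must already be complete before the header is written.
        return bytes([tag]) + encode_length(len(content)) + content

    def sequence(*children: bytes) -> bytes:
        # Each child is recursively encoded (and therefore sized) first; only
        # then can the enclosing SEQUENCE's tag and length be emitted.
        return tlv(0x30, b"".join(children))

    # SEQUENCE { INTEGER 5, OCTET STRING "hi" }
    print(sequence(tlv(0x02, b"\x05"), tlv(0x04, b"hi")).hex())
    # -> 300702010504026869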


> this also makes it possible to compare cryptographic hashes of
> DER-encoded data; X.509 and Kerberos both take advantage of this
> property.
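The property being claimed can be illustrated with a small example (again
hypothetical Python, not a real ASN.1 toolkit): BER allows the same INTEGER to
be serialized with either a minimal or an unnecessarily long length descriptor,
so byte-for-byte comparison and hashing only become meaningful once DER's
canonical rules pin the encoding down to a single form.

    # Two legal BER encodings of INTEGER 5; only the first is valid DER,
    # because DER requires the minimal length form.
    import hashlib

    ber_short = bytes([0x02, 0x01, 0x05])        # 02 01 05
    ber_long = bytes([0x02, 0x81, 0x01, 0x05])   # 02 81 01 05 (long-form length)

    # Same abstract value, different bytes, therefore different hashes.
    print(hashlib.sha256(ber_short).hexdigest() ==
          hashlib.sha256(ber_long).hexdigest())  # False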

I am not aware of any X.509 system that relies on this property. If there is
such a system, it is certainly not making use of the ability to reduce a
DER-encoded structure to X.500 data and reassemble it. Until recently, almost
none of the PKIX applications did this properly.

X.509 certs are exchanged as opaque binary blobs by all rational applications. 
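In practice that means something like the following sketch (the file name is
hypothetical): the raw DER bytes are stored, hashed, and compared exactly as
received, with no decode/re-encode round trip of the underlying X.500 data.

    # Hypothetical sketch of "opaque blob" handling: fingerprint the exact
    # DER bytes as received, without ever re-encoding the certificate.
    import hashlib

    with open("server-cert.der", "rb") as f:     # hypothetical file name
        der_blob = f.read()

    print(hashlib.sha256(der_blob).hexdigest())  # certificate fingerprint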

>> Then there are MACRO definitions, VALUE specifications, and an even more
>> complex definition of extension capabilities. In short, ASN.1 is vastly
>> more complex than the average TLV encoding. The higher rate of errors is
>> thus not entirely surprising.

> There certainly is a rich set of features (read: complexity) in both the
> ASN.1 syntax and its commonly-used encodings.  However, I don't think
> that's the real source of the problem.  There seem to be a lot of ad-hoc
> ASN.1 decoders out there that people have written as part of some other
> protocol, instead of using an off-the-shelf compiler/encoder/decoder;

That's because most of the off-the-shelf compiler/encoders have historically
been trash.

Where do you think all the bungled DER implementations came from?

> I also suspect that a number of the problems found have nothing to do with
> decoding ASN.1 specifically, and would have come up had other approaches
> been used.  For example, several of the problems cited earlier were buffer
> overflows found in code written well before the true impact of that
> problem was well understood.

Before the 1960s? I very much doubt it.

