[note: I'm not on this list, and I probably don't care enough about OPES
to follow discussion on the list. (I have read this thread in the
list archives, but I don't have the broader context). I have done
some thinking about presentation encodings in protocols, and someone who
knew of that work forwarded this to me and suggested that I follow up.]
Regarding binary vs. text encoding: There are really three important
aspects to this question -
1. the way records are delimited
2. the ability to cope with a transmission channel that isn't
transparent.
3. the ability of humans to read the protocol without specialized tools.
Consideration #2: text is really useful if you're trying to run a
protocol over some antiquated communications channel (like SMTP or
BITNET in the case of MIME) that is transparent to text but not to
arbitrary binary values. It's not clear how useful it is in the context
of a new protocol.
Consideration #3: text is really useful if the human-readable
protocol elements are ASCII-only, and you're running the protocol over a
bare TCP stream. OTOH if you're trying to support I18N (which you pretty
much have to do these days), or if you want to run over an encrypted
channel, it's not clear how useful it is for the protocol elements to be
ASCII. But I don't want to discount this too much - being able to
diagnose protocol problems without specialized tools really can be
useful.
This leaves consideration #1 - how records are delimited. You have
this consideration regardless of whether you use binary or text, and
it's pretty much the same question either way. Actually I'd say that
how you delimit records is the fundamental question, not whether you
use text or binary. You basically have two choices: length counts or
end-of-record delimiters.
End-of-record delimiters are attractive in that you don't have to know
the length of a record in advance before you start writing it - but they
do have some disadvantages: you don't know the length of a record before
you start reading it either, and if you're going to want the ability
to transmit arbitrary octet values within a record then you need some
kind of quoting mechanism, which introduces more complexity. Once you
have that quoting mechanism you can't use ordinary printf statements (or
whatever) to emit protocol bits.
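To make the quoting point concrete, here is a rough sketch in C of what
emitting a delimiter-framed record might look like (the escaping rules
here are invented purely for illustration, not a proposal):

    #include <stdio.h>

    /* hypothetical sketch: write one record framed by a LF delimiter,
       quoting the delimiter and the escape character so that arbitrary
       octets can appear in the payload.  the reader needs the matching
       un-quoting logic, and a plain printf() no longer suffices. */
    static void put_record_delimited(FILE *out,
                                     const unsigned char *buf, size_t len)
    {
        size_t i;
        for (i = 0; i < len; i++) {
            if (buf[i] == '\n') {
                fputc('\\', out); fputc('n', out);   /* quote the delimiter */
            } else if (buf[i] == '\\') {
                fputc('\\', out); fputc('\\', out);  /* quote the escape char */
            } else {
                fputc(buf[i], out);
            }
        }
        fputc('\n', out);                            /* end-of-record */
    }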
Length counts make transparency easy, but might be unattractive if some
records will be so large that you don't want to buffer the whole record
before transmitting any of it.
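For comparison, a minimal length-counted framing sketch (again a
made-up example, with a 4-octet length header):

    #include <stdio.h>
    #include <stdint.h>

    /* hypothetical sketch: length-counted framing.  transparency comes
       for free (any octet may appear in the payload), but the complete
       length must be known before the first byte is written. */
    static int put_record_counted(FILE *out,
                                  const unsigned char *buf, uint32_t len)
    {
        unsigned char hdr[4];
        hdr[0] = (unsigned char)(len >> 24);   /* length, network byte order */
        hdr[1] = (unsigned char)(len >> 16);
        hdr[2] = (unsigned char)(len >> 8);
        hdr[3] = (unsigned char)(len);
        if (fwrite(hdr, 1, 4, out) != 4)
            return -1;
        if (fwrite(buf, 1, len, out) != len)
            return -1;
        return 0;
    }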
If you try to have both a length count and an end-of-record delimiter,
you immediately need to ask what happens when the two get out-of-sync.
Which one wins? Does one win in one implementation and another one win
in a different implementation? (and do you want inconsistent behavior
across different implementations?) Does a single broken length count
result in failure to parse subsequent records? Is the additional
complexity required to maintain both worth the benefit?
other issues:
Typing
How many data types for protocol elements do you need? Do you want to
coerce everything into "text", or do you want to allow binary integers
also? Do you need multiple sizes of integers? Unsigned and signed?
Floating point? Special types for things like dates?
(in the case of 822 messages, the vast majority of the attributes are
strings, so expecting everything to be text on the wire is not a huge
problem. that might or might not be the case for your protocol.)
The fewer types your presentation layer supports, the more you'll need
to explicitly convert between native types and protocol types. OTOH the
more types your presentation layer supports the greater the potential
for silly states and type mismatches. (what happens when you need to
store UTF-8 into a string that's supposed to be ASCII?)
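As a small illustration of the conversion cost: if the only wire type
is text, every numeric attribute needs an explicit, checked conversion
in both directions, and each conversion is a place where silly states
can creep in (sketch only, names invented):

    #include <errno.h>
    #include <stdlib.h>

    /* hypothetical sketch: convert a text-encoded protocol attribute to
       a native long, rejecting junk and out-of-range values rather than
       silently coercing them. */
    static int attr_to_long(const char *text, long *out)
    {
        char *end;
        errno = 0;
        *out = strtol(text, &end, 10);
        if (errno == ERANGE || end == text || *end != '\0')
            return -1;     /* not a whole, in-range number */
        return 0;
    }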
Regularity
It's really useful if the decoder (encoder) doesn't need specific
knowledge of the particular protocol elements it's reading (writing).
This is one of the big problems with rfc[2]822/MIME/HTTP/etc headers -
the field delimiters are uniform, but each field has a potentially
different syntax with different delimiters, different rules for
transparency, etc. (The 822 header structure was adequate to describe
simple plain-text messages, but it's not so good for complex message
structures with lots of descriptive metadata.)
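A quick sketch of what regularity buys (the ';'-separated element
syntax here is invented just for illustration): with one uniform
syntax, a single routine can walk the elements of any field, instead of
needing a separate parser per field:

    #include <string.h>

    /* hypothetical sketch: iterate over the elements of a field that
       uses a uniform ';'-separated syntax, without knowing which field
       it is.  modifies the buffer in place. */
    static char *next_element(char **cursor)
    {
        char *start = *cursor, *sep;
        if (start == NULL || *start == '\0')
            return NULL;                 /* no more elements */
        sep = strchr(start, ';');
        if (sep != NULL) {
            *sep = '\0';                 /* terminate this element */
            *cursor = sep + 1;           /* advance past the separator */
        } else {
            *cursor = start + strlen(start);
        }
        return start;
    }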
Extensibility
Sometimes it's really useful if you can add additional protocol elements
to a record (say to extend a protocol) without resulting in an
incompatible record structure. (822 headers are extensible in that you
can add new fields without changing the meaning of existing fields;
however, it's hard to add new data elements within a field.)
Opacity
If some of your protocol engines need to pass data from one peer to
another without examining it themselves, it's useful if the protocol can
treat that chunk of data as "opaque" - merely copying it from one peer
to the other without decoding and re-encoding it (and potentially
changing its representation). Also, if an inner protocol element is
malformed, it's useful if this doesn't break parsing of the outer
protocol element.
Similarly, if you have protocol elements that are going to be subjected
to digital signatures and/or integrity checks, it's useful if the
application can treat those protocol elements as 'opaque' for the
purpose of signing/verification and not always have to deal with them in
decoded form. (this has been difficult in 822, since there's no clear
distinction between things that are changeable in transit and things that
are not)
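A rough sketch of what opaque handling looks like in practice
(length-counted framing assumed, purely for illustration): the
intermediary copies the encoded octets verbatim, so anything signed
over them still verifies downstream, and a malformed interior doesn't
have to be understood here:

    #include <stdio.h>
    #include <stdint.h>

    /* hypothetical sketch: relay a length-counted element without
       decoding it, so its octets arrive unchanged at the next peer. */
    static int relay_opaque(FILE *in, FILE *out, uint32_t len)
    {
        unsigned char buf[4096];
        while (len > 0) {
            size_t n = len < sizeof buf ? (size_t)len : sizeof buf;
            if (fread(buf, 1, n, in) != n)
                return -1;
            if (fwrite(buf, 1, n, out) != n)
                return -1;
            len -= (uint32_t)n;
        }
        return 0;
    }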
Mapping between internal and external representation
It is useful if there is a good impedance match between internal
(in-memory) and external (on the wire) representation of data elements.
For instance, if the presentation layer supports arbitrary-length
integers, this is not easily handled by programming languages that
assume fixed-length integers. Or the programming language may insist
that character strings be in Unicode (so that comparisons with string
and character constants work) while the presentation layer doesn't
specify a charset.
Also, any time there is a need to map complex data structures from (to)
a format where variable-length data elements are located by sequential
scanning (e.g. 822 headers, XDR, BER, etc.) to (from) a format where
variable-length data elements are located by following pointers (typical
in-memory representation), there can be a number of efficiency losses.
There is also what I would call "reblocking inefficiency" - if you have
to copy or transform data from one layer in order to use it in another
layer, that slows things down. An example would be having to copy
multiple lines of an 822 header into a string representing a single
field, then parse that field into individual sub-fields, then decode
individual sub-fields (like an encoded-word or a
domain name encoded per IDN), etc.
Familiarity and mindshare
Any new bit of technology imposes a learning curve, and many people
naturally prefer starting work immediately with familiar tools to
learning new ones. (I'm certainly guilty: I still do much of my
programming in C; the computational linear algebra people I work with
still do lots of theirs in FORTRAN.)
822/MIME/HTTP headers are familiar, but they are also fairly irregular.
I have written a lot of C code to handle them - routines to parse
dates, address lists (with comments), content-type fields,
content-disposition fields, encoded-words, addresses, etc. IMHO, their
apparent simplicity is somewhat of an illusion. Another problem with
822 headers appearing so simple is that syntax errors in them are
fairly common.
---
From reading recent messages on the list there seems to be a bit of
support for using 822-style headers, presumably due to familiarity and
mindshare considerations. If the WG does decide to go this route I
encourage it to define a single syntax which is shared by all fields,
and which provides adequate nesting, etc. for your protocol's needs,
while leaving some room for extensibility. (Offhand I'd recommend
something resembling LISP expressions.)
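For instance (a made-up field, purely to show the shape of such a
syntax), every field might look something like

    X-Example: (cache (max-age 3600) (languages "en" "de"))

so that one parser handles every field, and new elements can nest
inside existing ones without breaking older readers.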
However, you may find, once you actually think through the whole
protocol, that the familiarity and mindshare benefit isn't as great as
you previously thought.
And if you want to consider a reasonably-complete non-text alternative,
you might take a look at BLOB:
http://www.cs.utk.edu/~moore/draft-moore-rescap-blob-02.txt
BLOB certainly wasn't designed for OPES, but it was designed as a
representation to store metadata associated with network-accessible
resources. And it tries to deal with several of the considerations
above. In particular, the amount of code required to encode or decode
BLOBs is quite small. The tradeoff is that programs that use BLOBs
manipulate a generic data structure rather than one which is specific to
the PDU in use. Still, I've written a lot of code to deal with 822
message headers, and I find BLOBs much easier to deal with.
Thanks for reading,
Keith