Re: (out of the blue) OCP header encoding issues



On Tue, 20 May 2003, Keith Moore wrote:

the first thing I notice is that there's no provision for structure
within a 'value'.  my guess is, sooner or later, you'll need it. if
rfc 822 had had a uniform way of representing lists of things in
message headers, (especially if those 'things' could themselves be
lists), we probably would not have ended up with such a baroque
assortment of header field syntaxes today.


I agree that we need to add value "types" of some sort. On the other
hand, our current needs are limited to integers and strings, which are
already supported by the current syntax. It is possible, though not
yet known, that the current syntax will support every value type we
will need. In other words, all our types may turn out to be simple
atoms.

As for lists of values and such, we will need to decide whether

        named-parameter = name ":" SP value

needs to be replaced with more flexible

        named-parameter = name ":" 1*(SP value)

or whether repeated parameter names (for value "lists") and different
parameter names (for value "structures") should be used instead.

I don't think the Hollerith constants help much :)  if they're
short, you probably don't benefit much from the count; if they're
potentially long (say more than 1000 bytes), the whole use of record
terminators needs to be re-thought.


The idea behind our use of Hollerith constants or NetStrings in
parameters is to avoid octet stuffing. Record delimiters do not help
if your parameter contains a record delimiter -- you have to use
backslashes or other octet stuffing methods which are ugly and
expensive.
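
To make the point concrete, here is a minimal Python sketch. It
assumes the djb-style netstring layout (decimal length, ":", payload,
","), which is not necessarily what OCP will end up using, and the
function names are made up for this message.

```python
from typing import Tuple


def encode_netstring(payload: bytes) -> bytes:
    """Encode using the assumed layout: <decimal length> ":" <payload> ","."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Decode one netstring starting at pos.

    Returns (payload, position of the first octet after the netstring).
    The payload is sliced out by length and never scanned for delimiters.
    """
    colon = buf.index(b":", pos)
    digits = buf[pos:colon]
    if not digits.isdigit():
        raise ValueError("malformed octet counter")
    start = colon + 1
    end = start + int(digits)
    if buf[end:end + 1] != b",":
        raise ValueError("missing or misplaced netstring terminator")
    return buf[start:end], end + 1
```

A value such as b"line one\r\nline two" survives the round trip
unchanged, while splitting the same octets on CRLF would cut the
value in two unless it were escaped first.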

I do like the idea to have both named parameters and positional
parameters.  I've considered adding a similar feature to BLOB.


This was partially inspired by BEEP, I think. HTTP has a similar
feature (request and status lines), but it is less exposed/obvious.

it can be a pain, because you can no longer "just print" the text
and the delimiters, you now need to emit byte-counted text.


... which is equivalent to "just printing" two things, the text length
and the text contents, so I do not sense the pain you are talking
about. Octet stuffing is much more painful because you cannot print
text at all. Instead, you have to print individual characters,
escaping some of them. I guess we just disagree on this [subjective]
subject.
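
For what it is worth, a rough Python sketch of the two writers I have
in mind follows. The escape set (backslash, CR, LF) and the function
names are assumptions chosen for illustration, and the counted form
uses a djb-style trailing "," that OCP may or may not keep.

```python
import io

# Hypothetical escape table for the octet-stuffing variant.
ESCAPES = {0x5C: b"\\\\", 0x0D: b"\\r", 0x0A: b"\\n"}


def write_counted(out, text: bytes) -> None:
    """Emit the length, then the text as-is (plus a constant terminator)."""
    out.write(b"%d:" % len(text))
    out.write(text)
    out.write(b",")


def write_stuffed(out, text: bytes) -> None:
    """Emit the text octet by octet, escaping the special octets."""
    for octet in text:
        out.write(ESCAPES.get(octet, bytes([octet])))
```

The counted writer never looks inside the text; the stuffed writer
has to inspect every octet before it can emit anything.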

- but they do have some disadvantages: you don't know the length of
a record before you start reading it either,


This is usually not a problem for performance-sensitive protocols
because their implementations read using raw data buffers anyway.


depends on whether the data elements are smaller than the buffers.


It is not important whether the data element fits (because you do not
parse it; you can stop reading or chain I/O buffers and such to
accommodate large data if needed). What is important is that the octet
counter fits, and it does because it is less than 16 octets long.
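
A small Python sketch of what I mean, under the same assumed
netstring-like layout (decimal counter, ":", payload, ","). The
16-octet bound comes from the remark above; the function names are
hypothetical.

```python
MAX_COUNTER_OCTETS = 16  # assumed bound: the counter is shorter than this


def peek_length(buf: bytes):
    """Read only the octet counter at the start of buf.

    Returns (payload_length, payload_offset), or None when the counter
    itself has not fully arrived yet. The payload is never inspected.
    """
    colon = buf.find(b":", 0, MAX_COUNTER_OCTETS)
    if colon < 0:
        if len(buf) >= MAX_COUNTER_OCTETS:
            raise ValueError("octet counter too long")
        return None
    digits = buf[:colon]
    if not digits.isdigit():
        raise ValueError("malformed octet counter")
    return int(digits), colon + 1


def octets_missing(buf: bytes):
    """How many more octets are needed to complete the element, if known."""
    header = peek_length(buf)
    if header is None:
        return None
    length, offset = header
    # +1 accounts for the assumed trailing "," terminator.
    return max(0, offset + length + 1 - len(buf))
```

Once the counter is known, the reader can decide to stop reading,
chain more buffers, or pass the payload through without parsing it.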

you don't have the problem with individual atoms so much as with
aggregates, and I don't see how your proposal supports those. (for
that matter, I don't know whether OPES needs them.)


The current syntax can support aggregates by adding or repeating named
parameters:

        s-a: first-atomic-component-of-s
        s-b: second-atomic-component-of-s

or

        l: first-list-element-of-l
        l: second-list-element-of-l

This is probably OK for occasional use, but I agree that we may need
more "direct" support if we find ourselves using the above hack too
often. As I mentioned above, it would be trivial to extend the
named-parameter syntax to support these:

        s: first-atomic-component-of-s second-atomic-component-of-s

        l: first-list-element-of-l second-list-element-of-l
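
As a rough illustration, the Python sketch below collects both forms
into the same structure. It assumes each parameter sits on its own
line and that values are plain atoms without embedded spaces (no
quoting, no netstrings); the function and its multi_value flag are
made up for this message.

```python
from collections import defaultdict


def parse_named_parameters(lines, multi_value=False):
    """Collect named parameters into {name: [value, ...]}.

    With multi_value=False each line follows name ":" SP value and
    repeated names build up a list. With multi_value=True each line
    follows name ":" 1*(SP value) and may carry several values.
    """
    params = defaultdict(list)
    for line in lines:
        name, sep, rest = line.partition(": ")
        if not sep or not name or not rest:
            raise ValueError("not a named parameter: %r" % (line,))
        values = rest.split(" ") if multi_value else [rest]
        params[name].extend(values)
    return dict(params)
```

With repeated names, ["l: a", "l: b"] yields {"l": ["a", "b"]}; with
the extended syntax, ["l: a b"] yields the same result when
multi_value is set.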

Similarly, if you have protocol elements that are going to be
subjected to digital signatures and/or integrity checks, it's useful
if the application can treat those protocol elements as 'opaque' for
the purpose of signing/verification and not always have to deal with
them in decoded form.


Good point! Signing payloads should be OK. I think we do not have any
variability in the header syntax, except that a value can be quoted
even if it does not need to be.


can the order of fields be varied?  is there ever a need to group
several fields together and treat them as an aggregate?


Good point again! :-) Named-parameter order is not fixed and extension
parameters can be inserted, adding more variability to OCP "headers".
We will need to address this in a digital signature context. Added to
the to-do list.

be aware that having things "look like" MIME means that people will
treat them like MIME, and expect to be able to use MIME headers from
other protocols, wrap long lines like MIME does, add comments, use
encoded-words, etc.  there's a camel attached to that nose.


True. We already have some warnings against blind MIME usage.
Providing reference implementations and testing services may help a
bit as well. I am sure that no matter what we do, there will be
violations, but I would rather have clear violations of simple syntax
than hidden incompatibilities of MIME implementations.


Thank you,

Alex.