RE: HTTP/OCP protocol binding



On Mon, 2 Jun 2003, Martin Stecher wrote:

First, I am not so sure about the "all" chunks. Do we really need
them? While having multiple options and being able to negotiate
things is great, especially for OCP core, I think that the
application protocol bindings should define a single way to
vector the data to the callout server. If both OPES processor and
callout server agree on HTTP/OCP, they have to understand at least
the basics of that protocol and so be able to distinguish between
headers and bodies. If actors in the OPES scenario cannot find a
matching protocol binding that both support, there could/should be a
File/OCP and Raw-TCP/OCP that may sometimes work as a fallback.


Good point. The reason to keep "all" chunks could be OPES processor
performance. In many cases, it may be more efficient for the processor
to send data as it received it (raw), even though it has parsed it
internally. Nevertheless, I agree that "all" chunks should be removed
from the HTTP binding until we have a better HTTP-specific
justification for them.

On the other hand, I think we should define HTTP binding in such a way
that it would be possible to use data (OCP payload) encodings from other
bindings. It should be possible to negotiate data encoding after
negotiating HTTP binding (HTTP binding comes with a default encoding, of
course). This will make it simple to use "raw" and other useful
encodings for HTTP adaptations. We should try to make data encoding (and
other things) reusable across application bindings.

Second, is it necessary that the application protocol chunks
encoding is another abstraction layer on top of the data-have
messages? Why do we not just introduce a named parameter
(Content-Type or Payload-Type) for data-have and use this core
message type to send the chunks? Chunk length is then without
additional cost and no extra encoding needs to be parsed.


It is possible to move the data type field to OCP. Doing so will limit
you to one chunk per OCP message, but that is probably not such a big
deal if we keep data-have messages efficient. It certainly simplifies
the protocol so we should try to do that, I guess.

The protocol binding will then define which payload types exist,
whether there is a given order and whether multiple data-have
messages per payload type are allowed.


I suggest that we keep the App-Message-Part parameter HTTP-specific for
now. Not all application protocols will have different "parts" of
messages to be concerned about or will be able to describe those parts
with a simple parameter value.

It probably makes no sense to handle a 1xx as its own transaction.
It could be another response header, but we will need to explain this
clearly in the draft; otherwise we will get interop problems because
many implementations will probably overlook this point.


Treating 1xx as headers is a kludge that may break services, especially
those that mingle with headers. Moreover, you would still need some way
to distinguish 1xx and "normal" headers unless you want the callout
server to rely on response lines as separators. Finally, some callout
servers might _generate_ 1xx responses and, hence, may prefer a more
"direct" approach (but response generation is a much bigger problem, of
course).

Overall, I suspect we should treat a 1xx response as the application
message it is, just like a 200 OK response, but we can postpone this
debate.

Regarding the transfer encoding:

Seems that we agree that OPES processor and callout server will
negotiate which is supported; identity MUST be supported. OPES
processor MAY/SHOULD have some pre-processing to remove a transfer
encoding that it supports but the callout server doesn't.


Yes.
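
To make the agreed rule concrete, here is a rough sketch of how a
processor might pick the transfer encoding to use towards the callout
server. The function name and its arguments are made up for
illustration; nothing here is draft syntax. It assumes a single
transfer coding per message.

```python
IDENTITY = "identity"

def choose_transfer_encoding(message_coding, server_supported, processor_can_decode):
    """Pick the transfer coding used when sending to the callout server.

    Hypothetical helper: all names are illustrative, not OCP syntax.
    """
    # identity MUST be supported, so it is always acceptable to the server.
    acceptable = set(server_supported) | {IDENTITY}
    if message_coding in acceptable:
        # The callout server understands the coding: forward unchanged.
        return message_coding
    if message_coding in processor_can_decode:
        # Pre-process: the processor removes the coding and sends identity.
        return IDENTITY
    raise ValueError("no usable transfer encoding for " + message_coding)
```

For example, a chunked message headed for an identity-only callout
server would be decoded by the processor first.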

Regarding content length and persistency:

We agree that the OPES processor is responsible for handling HTTP
persistency and that it is impossible to have a protocol which is so
robust that it overcomes all buggy implementations.


Yes.

But I think that we can reduce the problem and enforce the OPES
processor responsibility by defining:

 - the OPES processor MUST ignore the Content-Length header returned
   in the response header chunk from the callout server


This seems too restrictive to me. If, for example, the same company
makes the OPES processor and the callout server, they would want to
avoid extra work and would prefer to honor the Content-Length header
generated by the callout server they trust as much as the processor.

Perhaps you mean that the sizep parameter should be used in this case,
and the Content-Length header should still be ignored.

 - a callout server SHOULD indicate the application message length
   via the sizep parameter in the start-app-message if it is known at
   that time.


Why not a MUST? If something so important is known, why not report it?

 - the OPES processor SHOULD honor the sizep parameter and use it to
   adjust the Content-Length header;

OK.

otherwise it MUST remove the Content-Length header; it MAY then
switch to chunked transfer encoding if supported by the client for
being able to keep the HTTP connection alive.


Or it can try to buffer the entire response first to calculate the
true Content-Length value.
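
Putting the options above together, the processor-side logic could
look roughly like the sketch below. The function and argument names
are my own invention; headers are modeled as (name, value) pairs, and
the returned flag says whether the HTTP connection can stay
persistent.

```python
def sync_content_length(headers, sizep=None, body=None, client_accepts_chunked=False):
    """Return (new_headers, keep_alive) for an adapted HTTP response.

    Hypothetical sketch: sizep is the length reported by the callout
    server, body is the fully buffered response (if the processor
    chose to buffer), both optional.
    """
    # Never trust the Content-Length that came back from the callout server.
    out = [(n, v) for n, v in headers if n.lower() != "content-length"]
    if sizep is not None:
        # Honor sizep and regenerate the header from it.
        out.append(("Content-Length", str(sizep)))
        return out, True
    if body is not None:
        # The entire response was buffered, so the true length is known.
        out.append(("Content-Length", str(len(body))))
        return out, True
    if client_accepts_chunked:
        # Length unknown: chunked encoding keeps the connection alive.
        out.append(("Transfer-Encoding", "chunked"))
        return out, True
    # Length unknown and no chunking: end of body is signaled by closing.
    return out, False
```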

With this approach many callout services that only concentrate on
the HTTP body can ignore the header and use data-as-is rather than
parsing the header, finding and adjusting the Content-Length header,
reassembling the header and sending it back.


True, but it seems we are building a very specific/rigid/low-level
solution to a part of the problem where a more general and flexible
solution is possible. What you describe in the above paragraph hints at
the true problem at hand. Some (many?) services have these
characteristics:

    - they need to know message "headers" (processor must send them)
    - they modify "content" or "body"
    - their modifications bring original "header" out of sync
    - they would prefer not to sync message "headers"
    - we think it may be safer for OPES processor to sync the headers

What we may need is a mechanism for the callout server to tell OPES
processor to [re]compute the adapted headers while still providing
original ones:

    * data-i-have
      I tell you what to send;
      I believe the data I have is accurate
      (this is the current data-have message)

    * data-you-have
      I tell you what to send, but I modified nothing, so just
      use your own copy you told me you have;
      I believe the data I have is accurate
      (this is the current data-as-is message)

    * data-you-check
      I tell you what to send;
      I believe the data I have is not accurate enough so please
      check and recompute everything you can as needed

An application binding can use the above mechanism in addition to sizep
and other features to require OPES processor to recompute "headers" when
a callout server sends them via a data-you-check OCP message.
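
As a rough illustration of the three message kinds, the processor-side
dispatch might look like this. The enum, the function, and the
recompute callback are all hypothetical; only data-have and data-as-is
exist today, and data-you-check is just the idea floated above.

```python
from enum import Enum

class DataMsg(Enum):
    I_HAVE = "data-i-have"        # current data-have
    YOU_HAVE = "data-you-have"    # current data-as-is
    YOU_CHECK = "data-you-check"  # proposed, not in any draft

def resolve_headers(kind, original, adapted, recompute):
    """Decide which headers the OPES processor forwards.

    original:  the processor's own copy of the headers
    adapted:   headers received from the callout server (may be None)
    recompute: binding-specific function that re-syncs headers
    """
    if kind is DataMsg.YOU_HAVE:
        # Callout server modified nothing: use the processor's copy.
        return original
    if kind is DataMsg.I_HAVE:
        # Callout server vouches for what it sent: forward as-is.
        return adapted
    if kind is DataMsg.YOU_CHECK:
        # Callout server does not vouch for accuracy: processor re-syncs.
        return recompute(adapted)
    raise ValueError("unknown message kind: %r" % (kind,))
```

An HTTP binding could, for instance, plug in a recompute function that
drops or regenerates Content-Length.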

The above "content length" thoughts are very unpolished. We need to work
more on this...


Thank you,

Alex.