Re: HTTP/OCP protocol binding

I think it is time to start thinking about application protocol
bindings. As we agreed, HTTP first.

This will help to validate OCP Core and to detect potential
problems. In addition, we can then soon start developing
prototypes that help verify whether the protocol works.


I agree and will start putting an HTTP binding draft together unless
somebody else wants to take the lead. However, I suspect that we may
need a more general "Common OCP data encodings" draft (see below) as
well. Or, should common encodings become a part of OCP Core?


I prefer to keep the number of documents someone needs to implement
the protocol to a minimum.

Here are some first questions regarding HTTP/OCP:

- Transactions for HTTP requests/responses
What do OCP transactions look like, and how do they differ, when they
are used at activation points 1-2 and 3-4, i.e. OCP transactions for
HTTP requests and responses; or REQMOD vs. RESPMOD for us ICAP guys?


OCP Core transactions will look identical regardless of the activation
points because activation points are application-specific things
beyond the scope of OCP Core. If the activation point is important, it
has to be passed as metadata or as an [HTTP- or cache-specific]
extension OCP message/field. The HTTP binding draft will have to
document how this is done, I guess. An extension field to the
app-message-start message seems most appropriate to me:

      app-message-start ...
      ...
      Activation-Point: request

or
      app-message-start ...
      ...
      Activation-Point: response

What do you think?


The activation point itself is not important (in most cases at least),
while it is of course important whether the message encapsulates only
an HTTP request or an HTTP response with the original request as meta
information.

- HTTP meta data

Will HTTP headers be simply the payload of a meta-have message? Is
the first line special? Will it be coded into named parameters of
meta-have messages? What about the empty line between HTTP header
and body? Does it belong to the meta data?


I hope there will be no meta-have messages (see my posting titled "OCP
metadata == data").


I forgot about that one. Thanks for the reminder.

I would use some very simple encoding to pass HTTP
messages in OCP payload. Something along these lines:

      <chunk-type> <chunk-length> <chunk-data>

where "chunk-type" can be either of
      "headers":  HTTP headers including the first line and the last
                  CRLF terminator
      "trailers": HTTP trailers including the last CRLF terminator
      "body":     raw HTTP message payload
      "all":      raw HTTP message
and one OCP payload may contain several chunks in the above format.
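
To make the proposed chunk format concrete, here is a minimal sketch of
an encoder and decoder. Several details are my assumptions and not part
of the proposal above: the three fields are separated by a single space,
the length is a decimal byte count, and chunks follow each other with no
extra delimiter. The names encode_chunk and decode_chunks are
hypothetical.

```python
from typing import List, Tuple

# Chunk types taken from the proposal above.
CHUNK_TYPES = {"headers", "trailers", "body", "all"}


def encode_chunk(chunk_type: str, data: bytes) -> bytes:
    """Serialize one chunk as b'<type> <length> <data>' (assumed layout)."""
    if chunk_type not in CHUNK_TYPES:
        raise ValueError("unknown chunk type: %r" % chunk_type)
    prefix = "%s %d " % (chunk_type, len(data))
    return prefix.encode("ascii") + data


def decode_chunks(payload: bytes) -> List[Tuple[str, bytes]]:
    """Split one OCP payload into (chunk_type, chunk_data) pairs."""
    chunks = []
    pos = 0
    while pos < len(payload):
        # The type and the length each end at the next space.
        type_end = payload.index(b" ", pos)
        chunk_type = payload[pos:type_end].decode("ascii")
        if chunk_type not in CHUNK_TYPES:
            raise ValueError("unknown chunk type: %r" % chunk_type)
        len_end = payload.index(b" ", type_end + 1)
        length = int(payload[type_end + 1:len_end])
        # The data is taken by length, so it may contain spaces or CRLF.
        data_start = len_end + 1
        data = payload[data_start:data_start + length]
        if len(data) != length:
            raise ValueError("truncated chunk")
        chunks.append((chunk_type, data))
        pos = data_start + length
    return chunks
```

Because the data is length-prefixed, a "headers" chunk can carry its
CRLF terminators unchanged and a "body" chunk can carry arbitrary bytes.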


For an HTTP response we need the original HTTP request as additional
meta data. That needs to be added somehow.

[...]

- Message length and transfer encoding

How do we handle an HTTP message body in chunked transfer encoding?
Remove the encoding before sending via OCP?


I think this should be left to implementations to decide. The
recipient MUST handle all valid HTTP encodings, but it is up to the
sender how to pre-process the message. Recall that, from the OCP point
of view, any kind of preprocessing is out of scope.


While preprocessing is not in OCP scope, it should be OK to define
some requirements for application messages that may make some
preprocessing necessary for certain types of application message.
What I mean is: I think we could specify "only identity transfer
encoding in this version of the HTTP protocol binding of OCP" and so
force OPES processors to do some preprocessing for HTTP messages that
have a different transfer encoding.

I am not saying that we should make such a rule. I am not 100% sure at
the moment. Could it be an option that is negotiated, defining
which transfer encodings the callout server supports?


OCP agents can negotiate more specific requirements, I guess. For
example, they can negotiate that chunked encoding is always used or
that identity encoding is used.

What about the Content-Length header? Who is responsible for
adding/changing/removing it?


The sender of the header. For example:

      - If a callout server sends "headers" or "all" chunks
        back to the OPES processor, then that callout server
        must ensure that all headers it sends match the body it
        sends. That includes adjusting or removing original
        Content-Length as needed. The OPES processor may
        further adapt the body and Content-Length, of course.

      - If an OPES processor sends just the "body" chunk to
        the callout server, then the OPES processor is responsible
        for matching the headers with the adapted body. This
        mode lets us implement services that are not HTTP-dependent.
        Note that data encoding will have to be negotiated in this
        case.
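
As an illustration of "the sender of the header makes the headers match
the body", here is a minimal sketch of the adjustment step. It assumes
the headers arrive as one raw block in the "headers" chunk form (first
line, header fields, final CRLF terminator) and that no header field is
folded across lines. The function name sync_content_length is
hypothetical.

```python
from typing import Optional


def sync_content_length(headers: bytes, body_length: Optional[int]) -> bytes:
    """Return the header block with Content-Length matching the body.

    body_length=None means the final size is not known, so any existing
    Content-Length field is removed instead of being rewritten.
    """
    # Drop the trailing CRLF terminators, then split into lines.
    lines = headers.rstrip(b"\r\n").split(b"\r\n")
    start_line, fields = lines[0], lines[1:]

    # Remove every existing Content-Length field (name is case-insensitive).
    kept = [f for f in fields if not f.lower().startswith(b"content-length:")]

    if body_length is not None:
        kept.append(b"Content-Length: " + str(body_length).encode("ascii"))

    # Reassemble, restoring the empty line that ends the header block.
    return b"\r\n".join([start_line] + kept) + b"\r\n\r\n"
```

Whichever agent sends "headers" or "all" would run something like this
just before sending; an agent that only ever sees "body" chunks never
needs it.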

Combining asynchronous OCP data handling with persistent HTTP/1.0
connections is not easy.


I do not see any complications. Note that we are not adapting
connections, only messages. Can you give an example where OCP and HTTP
persistency make things difficult?


What I saw with ICAP implementations:

- Some ICAP servers forget to adjust the Content-Length header although
they change the body length. ICAP clients have a hard time then.

- Often ICAP servers know that they are going to change something but
do not yet know, when sending the HTTP response header back, how big
the body will be. So they need to delete the Content-Length header.
If this is an HTTP/1.0 message, the ICAP client needs to close the
connection, or it checks whether it can turn the message into an
HTTP/1.1 message and add chunked transfer encoding.
But most ICAP clients don't do this, so ICAP servers started to
do the HTTP/1.1 transformation and add the chunked encoding themselves.
This is a big problem if the ICAP client does not support this because
it is not really HTTP/1.1 compliant.

I wonder whether we could make things clearer and easier by defining
who is responsible for what.
What about adding a parameter to the app-message-start message that the
OPES callout server returns, carrying the message size (content-length)
if it is already known at that time?
The OPES processor should be the only one responsible for the real
HTTP message exchange with the HTTP client, and it can select whether
to close the connection, to keep it open, to change the Content-Length
header or to add a transfer encoding. The OPES processor does not need
to trust the Content-Length header returned, only the value of the new
app-message-start parameter, which, if present, signals that the
callout server knows what it is doing.
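
A rough sketch of the decision the OPES processor would then make on
its own. Everything here is illustrative: the declared_size argument
stands for the suggested app-message-start parameter, and the names
Framing and choose_framing are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Framing(NamedTuple):
    content_length: Optional[int]  # value to put in Content-Length, if any
    chunked: bool                  # add chunked transfer encoding
    close_after: bool              # end the message by closing the connection


def choose_framing(client_version: Tuple[int, int],
                   declared_size: Optional[int]) -> Framing:
    """Pick how to delimit the adapted message towards the HTTP client.

    declared_size is the size announced by the callout server, or None
    if the callout server did not know it when the message started.
    """
    if declared_size is not None:
        # Size is known: a plain Content-Length works for any HTTP version
        # and does not force the connection to close.
        return Framing(declared_size, False, False)
    if client_version >= (1, 1):
        # Size unknown, but an HTTP/1.1 client must accept chunked encoding.
        return Framing(None, True, False)
    # Size unknown and an HTTP/1.0 client: only closing the connection
    # can mark the end of the body.
    return Framing(None, False, True)
```

The Content-Length header inside the returned HTTP message plays no
part in this decision, which is the point of the extra parameter.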

Regards
Martin