ietf-openproxy

RE: transfer- and content-encoding

2003-10-10 14:19:33

I ran into some questions while documenting negotiation of
transfer-encoding and content-encoding. Both are planned to be named
parameters in the feature structure of NO and NR messages.

Example:

   NO ({"38:http://iana.org/opes/ocp/HTTP/response";
   Transfer-Encodings: (chunked)
   Content-Encodings: (gzip, compress)
   })
   SG: 5;
   NR {"38:http://iana.org/opes/ocp/HTTP/response";
   Content-Encodings: (compress)
   }
   SG: 5;

So, what does this mean?

The OPES processor advertises its capability to handle chunked
transfer encoding. So, it knows how to remove this encoding, i.e.
how to preprocess the data for the callout server in order to
transfer the data without transfer encoding (with identity
encoding). Because transfer encoding is hop-by-hop, we can assume
that an OPES processor can do this preprocessing for every
transfer-encoding it supports; the data will not arrive in any
transfer-encoding that is unknown to the OPES processor. This
negotiation is also a hint for the callout server, because it MAY
introduce a transfer encoding that is handled and advertised by the
OPES processor, but nothing else. That all seems to work.

Is the following summary accurate?

Yes.


Transfer-Encoding list sent by OCP agent:
      Advertises encoding and decoding capabilities
      Accepts and generates only listed encodings
      Encodings listed earlier are preferred.
      Defaults to (identity)

Note that HTTP/1.1 default is (identity, chunked).

Yes.
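
A minimal sketch of how an agent might apply the summary above when
choosing an encoding to generate. The function name and the
can_generate argument are hypothetical and not part of any OCP draft;
only the default of (identity) and the "earlier is preferred" rule
come from the summary.

```python
def choose_transfer_encoding(peer_list, can_generate):
    """Pick the encoding to generate for a peer.

    peer_list    -- the peer's Transfer-Encodings list, or None if absent
    can_generate -- set of encodings this agent can produce (assumption)
    """
    # An absent list defaults to (identity).
    advertised = ["identity"] if peer_list is None else list(peer_list)
    # Encodings listed earlier are preferred, so take the first match.
    for encoding in advertised:
        if encoding in can_generate:
            return encoding
    # Nothing in common: the peer accepts only listed encodings.
    return None
```

With a peer list of (chunked, identity) and an agent that can generate
both, the sketch picks chunked; with no list at all it falls back to
identity.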


Also, how do we handle a situation where multiple transfer encodings
are applied?

If data is encoded with transfer encoding X and then again with encoding Y,
then the following must happen:

If the other agent advertises both X and Y, then data can be sent with
double encoding.
If the other agent advertises only X, then the message needs to be
Y-decoded, staying X-encoded.
If the other agent advertises only Y, then the message needs to be decoded
completely; re-encoding to Y is possible.
If the other agent advertises neither X nor Y, then the message needs to be
decoded completely.
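
The four cases above can be sketched as one rule: keep the innermost
encodings for as long as each one is advertised, and strip everything
outside the first unadvertised one. This generalization to more than
two layers is an assumption, and the function and variable names are
hypothetical.

```python
def plan_transfer_encodings(applied, advertised):
    """Decide which transfer encodings may stay on a message.

    applied    -- encodings in order of application, innermost first,
                  e.g. ["X", "Y"] means X was applied, then Y
    advertised -- set of encodings the other agent advertises

    Returns (kept, may_reapply): the encodings left in place, and the
    stripped encodings that could be re-applied after decoding.
    """
    kept = []
    for encoding in applied:
        if encoding not in advertised:
            break  # this layer and everything outside it must come off
        kept.append(encoding)
    removed = applied[len(kept):]
    # A stripped encoding that the peer advertises may be re-applied.
    may_reapply = [e for e in removed if e in advertised]
    return kept, may_reapply
```

For applied = ["X", "Y"], advertising only Y gives ([], ["Y"]): decode
completely, then optionally re-encode to Y.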


Regarding content encoding:

How about this simpler version:

      A callout server MAY send a Content-Encodings list to
      indicate its preferences in content encodings. Encodings
      listed first are preferred to other encodings. An OPES
      processor MAY use any content encoding when sending
      application messages to a callout server.

      If an OCP agent receives an application message that it
      cannot handle due to specific content encoding, the usual
      transaction termination rules apply.


Fine with me. It also means that the OPES processor does not use
the Content-Encodings header at all.
We still need to allow the callout service to handle the data
(for example by replacing it with an error message or returning it
unchanged) even if it does not support that encoding, rather than
terminating the transaction.


An alternative is to make early termination by processor
possible:

      A callout server MAY send a Content-Encodings list to
      indicate its requirements for supported content
      encodings. Encodings listed first are preferred to other
      encodings. A special "*" identifier stands for any
      encoding. Empty list indicates that the callout server is
      not capable of handling any content encodings and, hence,
      does not want to receive any content. Absent parameter
      defaults to ("*").

      An OPES processor MUST use a content encoding supported
      by the callout server or MUST terminate (or not initiate)
      the transaction that uses unsupported content encoding
      for application data.

      If an OCP agent receives an application message that it
      cannot handle due to specific content encoding, the usual
      transaction termination rules apply.
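
A minimal sketch of the processor-side check this alternative would
require, assuming the list has already been parsed into Python
strings. The function name is hypothetical; the "*" wildcard, the
empty-list meaning, and the ("*") default are taken from the proposed
text.

```python
def processor_may_send(content_encoding, callout_list=None):
    """Return True if the processor may send data in content_encoding.

    callout_list -- the callout server's Content-Encodings list,
                    or None if the parameter is absent
    """
    # An absent parameter defaults to ("*").
    accepted = ["*"] if callout_list is None else list(callout_list)
    # "*" stands for any encoding; an empty list accepts nothing.
    return "*" in accepted or content_encoding in accepted
```

A False result corresponds to the processor terminating, or not
initiating, the transaction.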

Which one will work better?

The first one.
A typical callout service only acts on certain media types.
Let's say it is a text translator. Any image will be returned unchanged.
Termination of the transaction seems inappropriate if an image
is in an encoding that the callout server does not support.

An opposite example is a virus scanner that wants to parse all files.
If it receives one in a content encoding that it does not understand, it
could always replace the content with an error message, which still seems
nicer to me than termination of the OCP transaction, which does not define
what is done with the HTTP message.
BTW: A callout service that does not support common content encodings
but insists on understanding the data of all transferred messages would
do better to adapt the HTTP request up front and strip unsupported
encodings from the Accept-Encoding header.
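
The two examples can be sketched as one handler under the first
option. All names here are hypothetical: acts_on decides whether the
service cares about a media type, decodable is the set of content
encodings it understands, transform does the actual work, and
error_body is the replacement a strict service (such as the virus
scanner) returns instead of passing data it cannot inspect.

```python
def handle(media_type, content_encoding, body, *,
           acts_on, decodable, transform, error_body=None):
    """Return the body the callout service sends back."""
    # Media types the service does not act on are returned unchanged,
    # whatever their content encoding.
    if not acts_on(media_type):
        return body
    # Unsupported encoding: replace with an error message if the
    # service insists on inspecting everything, else return unchanged.
    if content_encoding not in decodable:
        return error_body if error_body is not None else body
    return transform(body)
```

In neither case is the OCP transaction terminated because of the
content encoding.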



Also, how do we handle a situation where multiple content encodings
are applied?

Not an issue when following your first option. Or is there one that I do
not see?

Regards
Martin