ietf-openproxy

RE: draft-ietf-opes-ocp-core

2003-08-28 10:24:05


On Thu, 28 Aug 2003, MITTIG Karel FTRD/DMI/CAE wrote:

Not exactly: what I need is to be able to cache the OCP treatment
applied, which should be independent of the application protocol and
of the application's caching information. See the example below.

Thank you for providing a specific example. I think we are on the same
page as far as your requirements are concerned, but I still think your
needs can be addressed by existing HTTP cache controls. In fact, I
think we can do better than x%. Please see below.

Take an HTTP filtering service offered to two communities (for
example, a filtering service used by both higher and lower schools),
each passing through the OCP processor. The aim of the service is to
filter Internet access, but with a different level for each
community. Internet content can then be divided into three parts:

      - the content allowed or denied only for community 1 (say [E1])
      - the content allowed or denied only for community 2 (say [E2])
      - the content allowed or denied for both communities 1 and 2 (say [I])

There are 2 ways to treat the problem:
<snip>

      - The second one is to say that the service uses a policy to
decide which treatment to apply depending on the client. In this
case, there will be only one processor and one service, which is far
more interesting.

Normally, given that the processor can do HTTP caching, you will need
to call the service after the caching process (the "response
post-cache" vectoring point) to avoid the cache storing a modified
version that should only be sent to community 1 or 2. In this case
you don't have to modify the cache control of the responses, so it
works fine.

The problem with this solution is that you will query your service
for each incoming request (or rather each outgoing response). If
there are proxies in one community, you will save the corresponding
cached responses, but the gain is really hard to predict.

If you want to optimize, you can see that [I] content could be cached
by the processor. This part can represent a large share of the
queries, say x%, so allowing the processor to cache the corresponding
modified responses will save x% of the load on your service.

But now, if you put your service before the caching process of your
processor, the processor won't be able to distinguish between [E1]
and [E2], even if it is able to store the two versions of the
responses, because it doesn't (and shouldn't have to) know the
service policy. So the service has to tell the processor not to cache
this part. One way to do this, as you suggested, is to modify the
response using protocol parameters to mark it as not cacheable.

The drawback in this case is the impact on client applications. If
one community uses proxies, they won't be able to cache the responses
any more. You will then increase the service load for these responses
by an unknown amount (depending on the original TTL and the treatment
required).

So the only way I see to be sure to gain those x% (which for some
services could be around 80%) is to add a simple "is-cacheable" flag
to OCP messages (without needing the extended controls that
application protocols provide).

I believe HTTP already has a mechanism that supports the above
optimization. In fact, it allows you to cache all three types of
content! The mechanism is the Vary header. If the caching proxy
supports caching of responses with a Vary header, then OPES can use
it as follows:

        - On the pre-cache response side, OPES needs to add a
          Vary: X-Client-Category
          HTTP response header. This is a very "cheap"
          modification that may even be supported by the
          HTTP proxy itself.

        - On the request side, OPES needs to set a
          X-Client-Category: <little_kids|high_school|general>
          HTTP request header, with one of the three appropriate
          header values. This is a very "cheap" modification
          as well, because the filter needs to determine
          client category to perform filtering checks anyway!
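The keying rule behind this scheme can be sketched in a few lines. This is a hypothetical, minimal model of a Vary-aware cache, not any real proxy's implementation; the URL, header names, and category values follow the example in this thread:

```python
# Minimal sketch (assumption: a simplified Vary-aware HTTP cache, not
# the OCP/OPES processor itself) showing how Vary lets one cache hold
# all three content classes: the cache key includes the request header
# values named by the response's Vary header.

class VaryCache:
    def __init__(self):
        self.store = {}  # (url, varied header values) -> (vary list, body)

    def _key(self, url, request_headers, vary):
        # Cache key = URL plus the request header values listed in Vary.
        return (url, tuple(request_headers.get(h, "") for h in vary))

    def put(self, url, request_headers, response_headers, body):
        vary = tuple(h.strip()
                     for h in response_headers.get("Vary", "").split(",")
                     if h.strip())
        self.store[self._key(url, request_headers, vary)] = (vary, body)

    def get(self, url, request_headers):
        # A real cache remembers the Vary list per URL; here we just scan.
        for (u, varied), (vary, body) in self.store.items():
            if u == url and self._key(url, request_headers, vary) == (u, varied):
                return body
        return None

cache = VaryCache()
# OPES adds "Vary: X-Client-Category" pre-cache; the filter sets the
# request header per community, so each variant is stored separately.
cache.put("http://example.com/page",
          {"X-Client-Category": "high_school"},
          {"Vary": "X-Client-Category"},
          "filtered-for-high-school")
cache.put("http://example.com/page",
          {"X-Client-Category": "general"},
          {"Vary": "X-Client-Category"},
          "unfiltered")

print(cache.get("http://example.com/page",
                {"X-Client-Category": "high_school"}))  # filtered variant
```

Because the key includes the category value, the [E1], [E2], and [I] variants never collide, so no variant has to be marked uncacheable.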

With the above scheme in place, the cache can store all content for
all clients. So can downstream caches. Will this work the way you
want?

Note that you have to assume that there are no caches shared by both
communities in front of the cache in question. If there are shared
caches, neither your scheme nor Vary headers will work. Your scheme
will not work because it affects cacheability at the current HTTP hop
only. My scheme will not work because downstream requests will carry
no X-Client-Category header. The only solution is then to mark all
controlled responses as uncacheable using HTTP headers.
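The shared-cache failure mode follows directly from the keying rule. A small hypothetical sketch (same header name as above, simplified keying):

```python
# Sketch of why a cache shared by both communities, sitting on the
# client side of the OPES processor, defeats the Vary scheme: client
# requests carry no X-Client-Category header (only the OPES side
# injects it), so every community collapses to the same cache key and
# the first stored variant would be served to everyone.

def cache_key(url, request_headers, vary=("X-Client-Category",)):
    # Same keying rule as any Vary-aware cache: missing headers
    # contribute an empty value.
    return (url, tuple(request_headers.get(h, "") for h in vary))

key1 = cache_key("http://example.com/page", {})  # community 1 client
key2 = cache_key("http://example.com/page", {})  # community 2 client
print(key1 == key2)  # both communities map to one entry
```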

Did I miss anything?

Thanks,

Alex.
