RE: OPES protocol, pre-draft


Hi,
I always thought of the OPES processor as an L7 switch (as in Markus's view)
with limited storage capability. That limits all the storage capacity to the
ends of the network. A web proxy server might be considered as a Content
Provider (one of the ends of the OPES connection, as in the architecture
draft) rather than as an OPES processor.

As a first attempt, I vote that the protocol that this WG comes out with 
considers the OPES processor as a bare-bones L7 switch, without much storage 
capability!

-Srini

-----Original Message-----
From: ext Oskar Batuner [mailto:batuner(_at_)attbi(_dot_)com]
Sent: Tuesday, March 18, 2003 10:04 AM
To: Markus Hofmann; ietf-openproxy(_at_)imc(_dot_)org
Subject: RE: OPES protocol, pre-draft

Markus,

I think we may have different implementations in mind. I am
looking at a web proxy server (surrogate, I just do not like
this word) extended by OPES capabilities. For this model
storing all incoming data is not a problem - disks are
large and cheap, and storing data is very natural behavior
for a system built around a cache engine.

Correct me if I'm wrong, but it looks like you have
in mind something like a layer 7 switch. Such a device may have
better throughput but very limited storage capabilities.
The main differentiator is the ability to keep data on disk. Hybrid
devices are also possible, e.g. a solid state cache. A more interesting
hybrid is an L7 based OPES processor combined with a cache farm.

I suppose that buffering policy will depend mostly on the device
type. An OPES processor with disk will store all intermediate data
and use its caching capabilities to enhance overall performance.
An L7 switch based device will tend to be very conservative on
storage use and may need to exploit protocol capabilities for
copy control.

To support all these needs we may do several things:

1. Dynamic (per-message) control, like in current proposal.
2. Stateful protocol with storage policy negotiated at handshake.
3. Different levels of protocol implementation, with device capabilities
announced (but not negotiated) at handshake.

I think the protocol should support all 3 policies. This may significantly
simplify implementation of cache-based OPES processors.
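
The three mechanisms listed above could coexist roughly as in the sketch below. This is only an illustration, not part of any draft: the names Storage, Session and effective_policy are hypothetical, and the precedence order (announced capability limits everything, then a per-message control, then the negotiated policy) is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Storage(Enum):
    KEEP_COPY = "keep-copy"  # processor retains the original data
    DISCARD = "discard"      # processor drops data once it is forwarded


@dataclass
class Session:
    # Option 3: capability announced (not negotiated) at handshake.
    announced: Storage
    # Option 2: storage policy negotiated at handshake, if any.
    negotiated: Optional[Storage] = None


def effective_policy(session: Session,
                     per_message: Optional[Storage] = None) -> Storage:
    """Resolve the storage policy for a single message (assumed precedence)."""
    # A device that announced it cannot keep copies never keeps them.
    if session.announced is Storage.DISCARD:
        return Storage.DISCARD
    # Option 1: dynamic per-message control overrides the session default.
    if per_message is not None:
        return per_message
    if session.negotiated is not None:
        return session.negotiated
    return session.announced
```

With this ordering a disk-based processor can simply announce KEEP_COPY and ignore the rest, while an L7 switch based device announces DISCARD once and is never asked to buffer.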

BTW, a hybrid system with an L7 switch based OPES processor and a cache farm
points to an interesting twist: why not consider the attached web
cache as a special case of callout server? This may be just another
OPES scenario, with OPES rules used to create and dynamically
control caching policy. To support this scenario the callout protocol
should accommodate queries (transactions where the message body is
present only in the response) and caching requests (message body only
in the request), and the rules should support persistence (if the
request was not in cache, send a storage request on response) and
splitting (send the content server response to the end user without
waiting for the storage request to finish).
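
A minimal sketch of that cache-as-callout-server scenario follows, assuming an in-memory cache and a thread pool standing in for the callout connection. CacheCallout, handle_request and fetch_origin are hypothetical names for illustration only; nothing here is taken from a protocol draft.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional, Tuple


class CacheCallout:
    """Hypothetical callout server backed by a cache."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def query(self, url: str) -> Optional[bytes]:
        # Query: no body in the request, body only in the response.
        return self._store.get(url)

    def store(self, url: str, body: bytes) -> None:
        # Caching request: body only in the request, none in the response.
        self._store[url] = body


def handle_request(url: str,
                   cache: CacheCallout,
                   fetch_origin: Callable[[str], bytes],
                   pool: ThreadPoolExecutor) -> Tuple[bytes, Optional[Future]]:
    """Return the body for the end user plus any pending storage request."""
    cached = cache.query(url)
    if cached is not None:
        return cached, None
    body = fetch_origin(url)
    # Persistence: the earlier miss triggers a storage request on response.
    # Splitting: the storage request runs in the background, so the body
    # is returned without waiting for it to finish.
    pending = pool.submit(cache.store, url, body)
    return body, pending
```

The caller gets the content server response immediately; the returned Future is only there so that a caller who cares can check that the storage request completed.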

-----Original Message-----
From: owner-ietf-openproxy(_at_)mail(_dot_)imc(_dot_)org
[mailto:owner-ietf-openproxy(_at_)mail(_dot_)imc(_dot_)org]On Behalf Of 
Markus Hofmann
Sent: Monday, March 17, 2003 1:03 PM
To: ietf-openproxy(_at_)imc(_dot_)org
Subject: Re: OPES protocol, pre-draft

implementation that discards message chunk by chunk as soon as one
is sent to callout server does not look like a good idea. Even if call
starts before the complete message is received by the OPES processor
it should be able to assemble the complete message anyway.


This might require the OPES processor to possibly buffer huge amounts
of data, which might impact scalability of the approach.

Imagine a large number of users simultaneously downloading large files
via HTTP, and the OPES processor does a callout for virus scanning for
each of these files (in parallel). Since virus scanning involves long
delays in processing, quite some data would accumulate at the OPES
processor for temporary buffering...


Yes, but:

- virus scanning was used as an example of a scenario where storage on
the OPES processor is most beneficial - the virus scanner may give results
after processing a small part of the message, eliminating most of
the traffic;

- with the spiky nature of web traffic you cannot rely on the
scanner's ability to keep pace with the traffic.

2. Reliability. If the original message is discarded by OPES processor
before transaction with the callout processor is completed, callout
processor failure (including connection outage) forces OPES processor
to fail on this message, while preserved copy provides multiple
ways for recovery - default action on callout server unavailable,
repeat request to a different callout server. And the message
was already there, savings from premature discarding are very
small.


Absolutely right, and that's why our architecture and protocol should
allow an OPES processor to buffer the data. Then it's up to the OPES
processor to decide whether to do that or not.

3. Catch 22: if message processing is simple and anticipated
processing time is small - so are the savings from message discarding.
If anticipated processing time is significant - savings may be
bigger but so are the risks related to failure and needs for
reliable recovery.


Yup, see my comments above.

Also, for some transactions it might be OK to fail when the callout
server fails. Example: If a user requests mandatory virus scanning for
file download, it might be perfectly fine to reply with an error
message when the callout server running the virus scanning fails,
rather than sending back a non-scanned version from the buffer of the
OPES processor.


Yes, but a stored message provides the OPES processor with the ability
to recover after callout server failure (by sending the same request to
another server) transparently to other OPES flow participants.
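
A rough sketch of that recovery path, assuming the processor has kept the complete message and that a failed callout shows up as a ConnectionError. The function and parameter names are hypothetical. Leaving default unset corresponds to the mandatory-scanning case above, where replying with an error is the right outcome.

```python
from typing import Callable, Iterable, Optional

# A callout server is modelled as a function from message to processed message.
Callout = Callable[[bytes], bytes]


def process_with_failover(message: bytes,
                          servers: Iterable[Callout],
                          default: Optional[Callout] = None) -> bytes:
    """Try each callout server in turn, re-sending the retained copy."""
    last_error: Optional[Exception] = None
    for server in servers:
        try:
            # The stored message is sent again in full to the next server.
            return server(message)
        except ConnectionError as exc:
            last_error = exc
    if default is not None:
        # Default action when no callout server is available.
        return default(message)
    # No fallback configured (e.g. mandatory scanning): fail the message.
    raise RuntimeError("all callout servers failed") from last_error
```

Without the retained copy, the loop could not move on to a second server, because the first attempt would have consumed the only copy of the data.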

Oskar