RE: processing points (was: OPES issues)


See my comments inline below.

-----Original Message-----
From: Rajnish Pandey [mailto:Rajnish(_dot_)Pandey(_at_)sun(_dot_)com]
Sent: Wednesday, June 06, 2001 2:02 PM
To: ietf-openproxy(_at_)imc(_dot_)org
Subject: OPES issues



        Caching issues related to the OPES architecture.
        ------------------------------------------------

Caching should be done on the basis of the object along with services
(+ version). E.g.: url (uniform resource locator) + (service-version).


I like the idea of caching "url + service". But it might be an expensive
operation, since the service dimension could grow without bound pretty
quickly. So it is up to the cache vendor to choose a good engineering
compromise between caching all "url+service" and caching only "url".
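
To make the trade-off concrete, here is a minimal sketch of a cache keyed
on "url + service" that caps the number of serviced variants it keeps per
url. The names (cache_key, VariantCache, max_variants) are made up for
illustration and are not part of any OPES draft; the cap is just one
possible compromise a vendor might pick.

```python
def cache_key(url, service=None, version=None):
    """Build a cache key from the URL plus an optional (service, version)."""
    if service is None:
        return (url,)                 # plain "url" entry
    return (url, service, version)    # "url + service" entry


class VariantCache:
    """Toy cache holding at most max_variants serviced copies per URL.

    Hypothetical policy: once the cap is reached, further serviced
    variants of that URL are simply not cached.
    """

    def __init__(self, max_variants=2):
        self.max_variants = max_variants
        self.store = {}       # key -> body
        self.variants = {}    # url -> number of serviced variants stored

    def put(self, url, body, service=None, version=None):
        key = cache_key(url, service, version)
        if service is not None and key not in self.store:
            if self.variants.get(url, 0) >= self.max_variants:
                return False  # over the cap: do not cache this variant
            self.variants[url] = self.variants.get(url, 0) + 1
        self.store[key] = body
        return True

    def get(self, url, service=None, version=None):
        return self.store.get(cache_key(url, service, version))
```

With max_variants=0 this degenerates to caching only "url"; with a very
large cap it approaches caching every "url+service" combination.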

I suppose part of that "service" actually means the exact service name (id),
version #, the processing point it is intended for, etc. Now think for a
moment about the 4 processing points we defined around a cache -- does it
make sense to cache "url + service" at points 2 and 3?
I thought being at point 2 means after the request leaves the cache.

In a few cases, keeping the original document doesn't help.
E.g., the content provider sends a compressed object and the surrogate is
supposed to decompress it. Here it makes sense to keep the processed
object in place of the original object and remove the original object
fetched from the content provider. For further requests, the decompressed
object can be served.

Suggested Solution :
--------------------
 To avoid the above problem, processing point 3 can be divided further
into two points : 3(a) and 3(b).

 3(a) : It contains rules applicable to responses from the origin server
only.

 3(b) : It contains rules applicable to responses from the origin server or
cache (local / ICP).

 Rules at point 3(b) would always be applicable, irrespective of the source
of the objects. Objects processed at point 3(a) should be kept as the
original object. In the above case, the decompressed object can be kept as
the original document in the cache and can be used for serving requests.


I am confused here -- 3(a) and 3(b) sound awfully like what we mean by
points 3 and 4.
I thought the whole difference between 3 and 4 is this -- 3 is before the
object enters the cache, 4 is after. So if it always makes sense to cache
the serviced object (like decompression in this example, or virus checking,
etc.), you want to invoke the service at point 3. If it does not make much
sense to cache the serviced object (like personalized ad insertion), you do
it at point 4.

So it seems that the difference between 3 and 4 is almost a hint to the
cache as to whether or not it should cache the serviced object.
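
A rough sketch of how I read points 3 and 4, assuming a plain dict as the
cache and services modelled as simple functions on the body. Everything
here (handle_response, the toy decompress and insert_ad services, the
"gz:" prefix) is made up for illustration only.

```python
def handle_response(url, cache, fetch_origin, point3_services, point4_services):
    """Serve url, applying point 3 before caching and point 4 after."""
    body = cache.get(url)
    if body is None:
        body = fetch_origin(url)
        for service in point3_services:   # before the object enters the cache
            body = service(body)
        cache[url] = body                 # the serviced object is what is cached
    for service in point4_services:       # after the object leaves the cache
        body = service(body)              # result is never stored
    return body


# Toy services and origin, with counters to show how often each one runs.
calls = {"origin": 0, "decompress": 0, "ad": 0}

def fetch_origin(url):
    calls["origin"] += 1
    return "gz:hello"                     # stand-in for a compressed object

def decompress(body):
    calls["decompress"] += 1
    return body[len("gz:"):]

def insert_ad(body):
    calls["ad"] += 1
    return body + " [ad #%d]" % calls["ad"]

cache = {}
first = handle_response("http://example.com/a", cache, fetch_origin,
                        [decompress], [insert_ad])
second = handle_response("http://example.com/a", cache, fetch_origin,
                         [decompress], [insert_ad])
```

In this sketch decompression runs once and its result is cached, while ad
insertion runs on every request and its result never enters the cache.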

I like the idea of using descriptive names instead of numbers for processing
points. I would also argue that for http we start with 2 mandatory names
(instead of 4)
-- REQUEST POINT
-- RESPONSE POINT
Hierarchically we can split the first further into
-- REQUEST POINT (BEFORE CACHE/AFTER CACHE)

The reason I want to start with just 2 as mandatory is that if the rule
authors do not care to specify the cache-related behavior, they can get
by with just the two points. Then it is up to the specific cache+OPES to
figure out the right point, depending upon the cache's capability and the
service's nature. If the cache can truly handle "url+service", you can
always take REQUEST POINT as point 1 and RESPONSE POINT as point 3.
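
As a sketch of what "figure out the right point" could look like inside a
cache+OPES box: the function below maps the two mandatory names onto the
four numbered points. The function name and both flags are hypothetical.
The RESPONSE fallback follows my reading of points 3 and 4 above; the
REQUEST fallback (point 2 when the result is not worth caching) is purely
my assumption and nothing we have agreed on.

```python
def resolve_point(name, cache_handles_url_plus_service, result_is_cacheable):
    """Map a mandatory point name onto one of the 4 numbered points."""
    if name not in ("REQUEST POINT", "RESPONSE POINT"):
        raise ValueError("unknown processing point: %r" % (name,))

    if cache_handles_url_plus_service:
        # The cache keys on "url+service", so the early points always work.
        return 1 if name == "REQUEST POINT" else 3

    if name == "RESPONSE POINT":
        # Point 3: serviced object is cached; point 4: it is not.
        return 3 if result_is_cacheable else 4

    # ASSUMPTION (not from the list discussion): mirror the response rule
    # on the request side.
    return 1 if result_is_cacheable else 2
```

For example, a personalized ad insertion rule written against RESPONSE
POINT would land on point 4 in a cache that keys only on "url".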

Am I making any sense at all? It seems like I am going against the crowd
here -- everyone is arguing for 4 or more points, while I think only 2 are
essential for the IRML authors and the 4 are internal to OPES.