RE: OPES issues

I am actually in favor of replacing the numbers with descriptive names.
This would make the rule file more readable and maintainable. The
rule compiler/processor can also report an error if it sees a name it
cannot recognise or handle.

sherif

From: "Menon, Rama R" <rama(_dot_)r(_dot_)menon(_at_)intel(_dot_)com>
To: "'Rajnish Pandey'" <Rajnish(_dot_)Pandey(_at_)sun(_dot_)com>,
    "'ietf-openproxy(_at_)imc(_dot_)org'" <ietf-openproxy(_at_)imc(_dot_)org>

Subject: RE: OPES issues
Date: Thu, 7 Jun 2001 20:50:15 -0700 
List-Archive: <http://www.imc.org/ietf-openproxy/mail-archive/>

On processing points:

Having read the memo below (and having heard statements at the OPES workshop
on 6/7/01 about creating multiple processing points within the OPES
controller for processing point 1, a situation similar to the cache 3(a) and
3(b) processing points mentioned below), I have a few
comments/observations:

(1) Is the current thinking on 4 processing points outdated? Looks like it.
(2) Cache is just one of the "applications"/"services" within the OPES
framework; what about those yet to be analyzed/conceived? Do we want to
imply that all new services/applications play by the current "N
processing points" model (N = 4 or 6 or 8)? Does this really need to be
articulated as N?

(3) Why not leave the number of processing points beyond the basic 4 as
service/application specific? IMHO, it is a question of the logical
partitioning/hierarchy of rules at each processing point that an
application/service might need to implement; we don't want to extend that
to a variable number of processing points, at least not in the model
definition(s).

Comments?

- Rama
******

-----Original Message-----
From: Rajnish Pandey [mailto:Rajnish(_dot_)Pandey(_at_)sun(_dot_)com]
Sent: Wednesday, June 06, 2001 2:02 PM
To: ietf-openproxy(_at_)imc(_dot_)org
Subject: OPES issues

       Caching issues related to the OPES architecture.
       ------------------------------------------------

In the present scenario, caching is done on the basis of the object fetched
from the origin server or another parent intermediary. In an OPES
environment, objects fetched from the origin server or intermediary are
processed (locally or remotely) by value-added services such as language
translation and content adaptation. To reduce latency, processed objects can
be cached along with the original object and used to serve other requests.

CACHING
--------
Caching should be done on the basis of the object together with the services
applied (plus their versions), e.g. url (uniform resource locator) +
(service-version). The version number must be taken into account while
maintaining the cache. This is necessary because if a service gets upgraded,
an object processed by the earlier version of the service becomes
(logically) stale, while the time stamp of the cached document still shows
the object as fresh. To make sure that a fresh object is served, the VERSION
number must be taken into consideration in maintaining the cache.

(Version and service can be taken from the description in the OMML)
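
The keying scheme above can be sketched in a few lines of Python. This is
only an illustration under assumptions: the AppliedService type, the
cache_key helper and the "name@version" notation are hypothetical and are
not defined by OPES or OMML.

```python
from typing import NamedTuple, Tuple


class AppliedService(NamedTuple):
    name: str     # hypothetical service identifier
    version: str  # version taken from the service description


def cache_key(url: str, services: Tuple[AppliedService, ...] = ()) -> str:
    """Join the URL with every applied service and its version."""
    parts = [url] + [f"{s.name}@{s.version}" for s in services]
    return " + ".join(parts)


# The same URL processed by two versions of one service gets two keys,
# so a lookup after an upgrade never hits the entry made by the old version.
old_key = cache_key("http://example.com/a", (AppliedService("s1", "1.0"),))
new_key = cache_key("http://example.com/a", (AppliedService("s1", "1.1"),))
```

With such a key, the time stamp of the cached document no longer has to
carry the information that the service itself has changed.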

In an OPES environment, the cache may hold different forms of an object (due
to processing by various services). If the object retrieved from the CP
(content provider) is processed by four services (s1-s4), then these
processed objects (provided they are cachable) CAN be cached along with the
original object. The cache will then have url, url +s1, url +s1 +s2,
url +s1 +s2 +s3, and url +s1 +s2 +s3 +s4.
These processed objects can be used later.

How cached objects are used to serve requests.
--------------------------------------------------
The cache is used to serve requests for both types of objects (original and
processed).
Consider a few cases.

Case 1: (present scenario)
   A request comes for an object (url) without a service.
   If the requested object is present in the cache, it is served from there;
otherwise the origin server is contacted for the object, which is then
served.

Case 2:
   A request comes for url + service1, and the cache does not have the
original document. The intermediary fetches the original object and
processes it. The processed object is served to the client. At the same
time, the processed object is kept in the cache along with the original
object. The original object must be kept in the cache provided it is
cachable. It can be used later (provided it remains fresh) if a request
comes for the original document (url) or for url +s2 or url +s2 +s3. It can
also be used if the processed object is no longer fresh: in that case, the
services can be executed over it again to serve requests.

Case 3:
   Objects processed by services can also be used for further processing by
different services. Suppose the cache has url and url + s2, and a request
comes for url + s2 + s3. Then url + s2 can be used to generate url +s2 +s3
and serve the client. At the same time, the result can be kept in the cache,
where it can serve the same request again or a request for
url + s2 +s3 +sn.
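
The three cases amount to looking for the longest cached prefix of the
requested service chain and then running only the remaining services. A
minimal sketch, assuming a hypothetical in-memory cache keyed by
(url, tuple of service names); freshness and versions are deliberately left
out to keep it short:

```python
from typing import Dict, List, Optional, Tuple

CacheKey = Tuple[str, Tuple[str, ...]]


def longest_cached_prefix(
    cache: Dict[CacheKey, bytes], url: str, services: List[str]
) -> Optional[Tuple[int, bytes]]:
    """Return (number of services already applied, cached object), or None."""
    # Try the full chain first, then ever shorter prefixes, down to the
    # original object (empty service tuple).
    for n in range(len(services), -1, -1):
        key = (url, tuple(services[:n]))
        if key in cache:
            return n, cache[key]
    return None


# Case 3: the cache has url and url + s2, and a request asks for url + s2 + s3.
cache = {("url", ()): b"original", ("url", ("s2",)): b"after-s2"}
hit = longest_cached_prefix(cache, "url", ["s2", "s3"])
```

Here hit names the url + s2 entry with one service already applied, so only
s3 still has to be executed. A result of None corresponds to fetching from
the origin server.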

In a few cases, keeping the original document doesn't help.
E.g., the content provider sends a compressed object and the surrogate is
supposed to decompress it. Here it makes sense to keep the processed object
in place of the original object and to remove the original object fetched
from the content provider. For further requests, the decompressed object can
be served.

Suggested Solution :
------------
To avoid the above problem, processing point 3 can be divided further
into two points: 3(a) and 3(b).

3(a): Contains rules applicable to responses from the origin server only.

3(b): Contains rules applicable to responses from the origin server or
the cache (local/ICP).

Rules at point 3(b) would always be applicable, irrespective of the source
of the object. Objects processed at point 3(a) should be kept as the
original object. In the above case, the decompressed object can be kept as
the original document in the cache and can be used for serving requests.
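
The split can be illustrated with a short Python sketch. The respond
function, the dictionary cache and the use of zlib to stand in for the
decompression service are all assumptions made for the example, not part of
any OPES draft.

```python
import zlib
from typing import Callable, Dict, List

Rule = Callable[[bytes], bytes]


def respond(
    url: str,
    cache: Dict[str, bytes],
    fetch_origin: Callable[[str], bytes],
    rules_3a: List[Rule],
    rules_3b: List[Rule],
) -> bytes:
    if url in cache:
        obj = cache[url]  # cache hit: 3(a) rules are skipped
    else:
        obj = fetch_origin(url)
        for rule in rules_3a:  # 3(a): origin-server responses only
            obj = rule(obj)
        cache[url] = obj  # the 3(a) result is kept as the "original"
    for rule in rules_3b:  # 3(b): applied whatever the source was
        obj = rule(obj)
    return obj


cache: Dict[str, bytes] = {}
origin = lambda url: zlib.compress(b"hello")
first = respond("url", cache, origin, [zlib.decompress], [bytes.upper])
```

After the first request the cache holds the decompressed object, so later
requests are served without contacting the origin server and without
running the decompression rule again.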

Modification of cache-related message (response) headers.
---------------------------------------------------------
Caching in an OPES environment must obey all the cache-related message
headers.

A service can also affect the caching behaviour of an object. It can make
a cachable object non-cachable and can reduce the freshness time of an
object. But it must not make a non-cachable object cachable, and it must
not increase the freshness time of an object.
Cache-related headers in request messages must not be modified.
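
As a rough illustration, the constraint on services can be written as a
check on the response's caching policy before and after the service runs.
The CachePolicy type and its fields are hypothetical; a real intermediary
would derive them from the HTTP cache headers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePolicy:
    cachable: bool
    freshness_seconds: int  # hypothetical: remaining freshness lifetime


def change_allowed(before: CachePolicy, after: CachePolicy) -> bool:
    """True if a service only tightened (or left alone) the caching policy."""
    if after.cachable and not before.cachable:
        return False  # non-cachable must not become cachable
    if after.cachable and after.freshness_seconds > before.freshness_seconds:
        return False  # freshness time must not be increased
    return True
```

For example, shortening the freshness time from 600 to 60 seconds passes
the check, while extending it from 600 to 900 seconds does not.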

SERVICES INACCESSIBLE TEMPORARILY.
----------------------------------
Sometimes services may be temporarily unavailable. Consider a case in
which a client has asked for a remote service, but it is not available.

The following actions may be taken:

--Fresh objects can be served without processing.

--Processed but stale objects can be served from the cache along with a
warning.

But these actions should be specified by the rule author in the rule file
(IRML).

If none of the services is accessible, then OPES may move forward and
start processing different rules.

E.g.: A content provider has put in a rule to access a decompression
service. If the service is not available, then the original object (not yet
decompressed) SHOULD NOT be served. In this case, either a stale object or
some error message SHOULD be given.

There may be a few cases.

Case 1:
------

A request comes for url + service. url + service is not available in the
cache, and the service is not available either. Since the service cannot be
reached, the next rule in succession can be checked for a match and action
taken accordingly.

But a few services are imperative: it is necessary to fire such
services, e.g. Content Adaptation.

Suggested Solution : 
-----------
An attribute "MANDATORY" can be added to the element "ACTION". Its possible
values are YES and NO. By default, its value will be NO: if the attribute is
not mentioned, its value will be taken as NO, and the message will be sent
for validation against the other rules. If it is YES, then this ACTION MUST
be fired, and if the service is not accessible, some error may be sent.
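
A minimal sketch of how a rule processor might honour the MANDATORY
attribute, assuming a hypothetical ServiceUnavailable exception raised by
the service stub and a fire_action helper invented for this example:

```python
from typing import Callable, Tuple


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) service stub that cannot be reached."""


def fire_action(
    obj: bytes, service: Callable[[bytes], bytes], mandatory: str = "NO"
) -> Tuple[bytes, bool]:
    """Return (object, fired). Re-raise if a MANDATORY service is down."""
    try:
        return service(obj), True
    except ServiceUnavailable:
        if mandatory == "YES":
            raise  # the action MUST be fired, so report an error
        return obj, False  # skip it and move on to the next rules


def unreachable(obj: bytes) -> bytes:
    raise ServiceUnavailable("service down")
```

With mandatory left at its default of "NO", an unreachable service leaves
the object untouched so that the remaining rules can still be checked.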

Case 2:
------
 A request comes for url + service. The url + service object is available in
the cache, but is stale, and the service is not available. Now a decision
has to be taken whether the stale object or the unprocessed object should be
served.

Suggested Solution : 
------------

 An attribute "CACHE" can be added to the element "ACTION". Its possible
values will be YES and NO. By default, its value will be YES. This implies
that, if the attribute is not mentioned in the rule file, stale data can be
served. If it is NO, then the original object should be served without
processing.
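
The decision described for the CACHE attribute could look roughly like the
following. The function name and the boolean warning flag are assumptions
made for the sketch; the warning corresponds to serving stale objects
"along with a warning" as mentioned earlier.

```python
from typing import Optional, Tuple


def serve_when_service_down(
    original: bytes, stale_processed: Optional[bytes], cache_attr: str = "YES"
) -> Tuple[bytes, bool]:
    """Return (object to serve, whether a staleness warning is attached)."""
    if cache_attr == "YES" and stale_processed is not None:
        return stale_processed, True  # stale processed copy, with warning
    return original, False  # original object, served without processing
```

If the rule author omits the attribute, the default "YES" applies and the
stale processed copy is preferred over the unprocessed original.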

There should be a provision for alternate services, which can be contacted
when the service mentioned in the ACTION element is not available.
This can be added as a new element such as <alt_action>, or can be
provided in the action element itself as an ORing of different services.

This is the existing format in IRML. 

  <rule processing-point="3">
    <!-- Is the requested Web resource an executable binary file? -->
    <property name="Content-Type" matches="application/">
      <!-- Invoke virus scanning service on mcaffee.com -->
      <action>icap://mcaffee.com/viruscheck</action>
    </property>
  </rule>

Changes suggested 
-----------------

1. Addition of MANDATORY attribute for action

  <rule processing-point="3">
    <!-- Is the requested Web resource an executable binary file? -->
    <property name="Content-Type" matches="application/">
      <!-- Invoke virus scanning service on mcaffee.com -->
      <!-- MANDATORY attribute being added -->
      <action name="MANDATORY" matches="YES">
          icap://mcaffee.com/viruscheck
      </action>
    </property>
  </rule>

2. Addition of CACHE attribute for action

  <rule processing-point="3">
    <!-- Is the requested Web resource an executable binary file? -->
    <property name="Content-Type" matches="application/">
      <!-- Invoke virus scanning service on mcaffee.com -->
      <!-- CACHE attribute added -->
      <action name="CACHE" matches="NO">
          icap://mcaffee.com/viruscheck
      </action>
    </property>
  </rule>

3.1 Definition of alternate service in the "action" element

  <rule processing-point="3">
    <!-- Is the requested Web resource an executable binary file? -->
    <property name="Content-Type" matches="application/">
      <!-- Invoke virus scanning service on mcaffee.com -->
      <!-- Alternate service is provided in the action field itself -->
      <action>icap://mcaffee.com/viruscheck |
      icap://foo.com/viruscheck
      </action>
    </property>
  </rule>

    But the IRML draft defines the action element as follows:
    "Only one service URI MAY be specified per "action" element."
    so this form would conflict with the draft as it stands.

3.2 Definition of Alternate service  in a different element  within
"Action" element

  <rule processing-point="3">
    <!-- Is the requested Web resource an executable binary file? -->
    <property name="Content-Type" matches="application/">
      <!-- Invoke virus scanning service on mcaffee.com -->
      <action>icap://mcaffee.com/viruscheck</action>
      <!-- This service is a substitute for the service mentioned in
           the "action" element. -->
      <alt_action>icap://foo.com/viruscheck</alt_action>
    </property>
  </rule>
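
To show how a processor might consume the <alt_action> form, here is a small
Python sketch using the standard xml.etree.ElementTree module. The rule text
is a well-formed variant of the 3.2 sample; service_uris, first_available
and the is_up callback are hypothetical helpers, and real availability
checks would depend on the callout protocol.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Optional

RULE = """
<rule processing-point="3">
  <property name="Content-Type" matches="application/">
    <action>icap://mcaffee.com/viruscheck</action>
    <alt_action>icap://foo.com/viruscheck</alt_action>
  </property>
</rule>
"""


def service_uris(rule_xml: str) -> List[str]:
    """Collect the primary and alternate service URIs in document order."""
    root = ET.fromstring(rule_xml)
    uris = []
    for elem in root.iter():
        if elem.tag in ("action", "alt_action") and elem.text:
            uris.append(elem.text.strip())
    return uris


def first_available(
    uris: List[str], is_up: Callable[[str], bool]
) -> Optional[str]:
    """Return the first URI whose service answers, or None if all are down."""
    for uri in uris:
        if is_up(uri):
            return uri
    return None
```

The primary service is always tried first; the alternate is contacted only
when the primary is reported as unavailable.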