Open Pluggable Edge Services                     A. Barbir (Editor)
Internet-Draft                                     Nortel Networks
Expires: September 29, 2003                       March 31, 2003


                      OPES tracing facility
                      draft-ietf-opes-tracing


1. Introduction

The Open Pluggable Edge Services (OPES) architecture enables cooperative application services (OPES services) between a data provider, a data consumer, and zero or more OPES processors.  The application services under consideration analyze and possibly transform application-level messages exchanged between the data provider and the data consumer.

The execution of such services is governed by a set of rules installed on the OPES processor.  The rules enforcement can trigger the execution of service applications local to the OPES processor.

Alternatively, the OPES processor can distribute the responsibility of service execution by communicating and collaborating with one or more remote callout servers. As described in [], an OPES processor communicates with and invokes services on a callout server by using a callout protocol.

In [], the IAB has required the OPES working group to support tracing and notification. This document addresses these IAB requirements. The document examines the effect of tracing and notification requirements on OPES architecture and callout protocol []. In particular, the work identifies traceable entities in an OPES flow and how this information is relayed to end points.

As per the architecture document [], there is a requirement to relay tracing information in-band. This document investigates this possibility and discusses possible methods that end points on an OPES flow could use to detect faulty OPES processors.


The document is organized as follows: Section 2 considers ? Section 3? etc. 


2. Basic Definitions

- REFERENCE POINT - a reference that may be used out-of-band to 
  perform a specific function. 

  Examples include a URI for the privacy policy, a center of
  authority URI, or a server address. Usually no protocol is
  provided to access the reference point.
  
- INFORMATION POINT - implies the presence of a protocol to access
  detailed information at this point. Examples include a URI to get
  a certificate for a virus checker or content filter, or to examine
  and set profile settings and active preferences.

- IDENTIFIER - provides a unique binding to detailed persistent
  information. For example, "transformation-applied : fe123" gives
  a participant the ability to enquire about (and maybe cache)
  details of the transformation fe123. Use of such (opaque)
  identifiers does not require prior knowledge and does not create
  a burden of storing additional information - this is just a tag
  for persistent information (not message-specific).
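
  The IDENTIFIER definition can be illustrated with a minimal Python
  sketch. It assumes a hypothetical resolver callable supplied by the
  caller (this document defines no protocol for resolving identifiers)
  and a hypothetical "name : identifier" annotation layout taken from
  the example above. It only shows that a participant may resolve an
  opaque tag once and cache the result, because the details are
  persistent rather than message-specific.

```python
from typing import Callable, Dict, Tuple


class IdentifierCache:
    """Caches persistent details bound to opaque identifiers.

    The resolver is a hypothetical callable; how details are actually
    fetched is outside the scope of this document.
    """

    def __init__(self, resolver: Callable[[str], str]) -> None:
        self._resolver = resolver
        self._details: Dict[str, str] = {}
        self.lookups = 0  # number of times the resolver was invoked

    def details(self, identifier: str) -> str:
        # The identifier is an opaque tag: it is used as a key only
        # and is never parsed or interpreted.
        if identifier not in self._details:
            self.lookups += 1
            self._details[identifier] = self._resolver(identifier)
        return self._details[identifier]


def parse_annotation(annotation: str) -> Tuple[str, str]:
    """Split an annotation such as 'transformation-applied : fe123'."""
    name, _, identifier = annotation.partition(":")
    return name.strip(), identifier.strip()
```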


3. Requirements for Notification in an OPES Flow


This section examines the IAB requirements (3.1) and (3.2) and how they relate to notification.

3.1 Notification Requirements

There are requirements on the architecture [] to assist content provider applications in detecting and responding to actions that OPES intermediaries perform on behalf of data consumer applications and that the content provider deems inappropriate. This is referred to as notification.

In general, notification goes in the opposite direction of tracing and cannot be attached to the application messages that it notifies about. Notification can be done out-of-band and may require the development of a new protocol. Such an opposite-direction, outside-of-message scheme is difficult to support.

NOTE: When would a content provider issue such request? 
How would such mechanism be used? 
Randomly, or on a statistical basis? 
Or manually? Is such a scheme of practical relevance?


3.1.1 Notification Concerns

A major concern with notification is scalability. For example, it is not practical to assume that a content provider is interested in receiving a notification for every HTTP response sent out. As such, a mechanism for explicitly requesting notification may be required.

Privacy is another concern. Maybe a user doesn't want to reveal to any content provider all the OPES services that have been applied on her behalf. For example, why should every content provider know what exact virus scanner a user is using?


3.2 How to Fulfill Notifications Requirements

IAB consideration (3.1) [] suggests that the overall OPES framework needs to assist content providers in detecting and responding to client-centric actions by OPES intermediaries that are deemed inappropriate by the content provider.

This requirement is hard to implement since most client-centric actions happen _after_ the application message has left the content provider(s). Notifications therefore cannot be piggy-backed onto application messages and have to travel in the opposite direction of traces.

Note: Need to explain more here.

IAB consideration (3.2) [] can be satisfied by the development of a tracing facility. In this regard, it is recommended that tracing SHOULD be always-on, just like HTTP Via headers now. This should eliminate notification as a separate requirement.

If the OPES end points cooperate then notification can be supported by tracing. It is recommended that content providers that suspect or experience difficulties do the following:

	1. Check whether requests they receive pass through
	  OPES intermediaries. Presence of OPES tracing info
	  will determine that. This check is only possible for
	  request/response protocols. For other protocols (e.g.,
	  broadcast or push), the provider would have to assume
	  that OPES intermediaries are involved until proven
	  otherwise.

	2. If OPES intermediaries are suspected,
	  request OPES traces from potentially affected user(s).
	  The trace will be a part of the application message
	  received by the user software. If users cooperate,
	  the provider(s) have all the information they need.
	  If users do not cooperate, the provider(s) cannot
	  do much about it (they might be able to deny service
	  to uncooperative users in some cases).

	3. Some traces may indicate that more information
	  is available by accessing certain resources on the
	  specified OPES intermediary or elsewhere. Content
	  providers may query for more information in that
	  case.

	4. If everything else fails, providers can enforce
	   no-adaptation policy using appropriate OPES
	   bypass mechanisms and/or end-to-end mechanisms.
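
The procedure above can be sketched as a small decision function. This is an illustration only: the header name and the "more information" marker are hypothetical, since this document defines neither a trace header nor a trace syntax, and the function merely mirrors the order of steps 1 through 4.

```python
from enum import Enum
from typing import Mapping, Optional

# Hypothetical names: this document does not define a trace header
# or a marker meaning "more information is available".
TRACE_HEADER = "opes-trace"
MORE_INFO_MARKER = "more-info="


class Action(Enum):
    NONE = 0                # no OPES intermediaries detected
    REQUEST_USER_TRACE = 1  # step 2: ask affected users for traces
    USE_TRACE = 2           # users cooperated; the trace is sufficient
    QUERY_MORE_INFO = 3     # step 3: trace points to further details
    ENFORCE_BYPASS = 4      # step 4: fall back to no-adaptation policy


def opes_suspected(headers: Mapping[str, str],
                   request_response: bool) -> bool:
    """Step 1: check received requests for OPES tracing information."""
    if not request_response:
        # Broadcast or push: assume OPES until proven otherwise.
        return True
    return any(name.lower() == TRACE_HEADER for name in headers)


def next_action(headers: Mapping[str, str],
                request_response: bool = True,
                users_asked: bool = False,
                user_trace: Optional[str] = None) -> Action:
    """Pick the provider's next step in the procedure above."""
    if not opes_suspected(headers, request_response):
        return Action.NONE
    if not users_asked:
        return Action.REQUEST_USER_TRACE
    if user_trace is None:
        # Users did not cooperate; nothing else is left to try.
        return Action.ENFORCE_BYPASS
    if MORE_INFO_MARKER in user_trace:
        return Action.QUERY_MORE_INFO
    return Action.USE_TRACE
```

For example, a request carrying the hypothetical trace header leads to REQUEST_USER_TRACE, and the same request after users have declined to cooperate leads to ENFORCE_BYPASS.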


4. Requirements for Tracing in an OPES Flow


In [], the IAB has required that the OPES architecture provide tracing and debugging facilities. From [], the OPES architecture SHOULD assist consumer applications in detecting the behavior of OPES processors and callout servers to potentially allow them to identify imperfect or compromised operations.
 
The OPES architecture document [] has addressed these concerns at a higher level. The architecture requires that tracing be feasible on the OPES flow per OPES processor using in-band annotation. This requirement provides a participant with the ability to detect OPES intermediaries in the course of normal interaction. 

4.1 What is traceable?

OPES end points must be able to trace the following:

1. OPES processors that are involved.

2. OPES services (including callout services) that were performed on a request or response.

3. TBD


4.2 Tracing and Trust Domains

Tracing is limited to a trust domain. Entities outside of that domain may or may not see any traces, depending on domain policies or configuration. Therefore, there is no need for a mandatory end-to-end tracing facility. For example, if an OPES system is on the content provider "side", end-users are not guaranteed any traces. If an OPES system is working inside the end-user domain, the origin server is not guaranteed any traces related to user requests.

There are two distinct uses of traces. First, a trace SHOULD enable an "end" (content producer or consumer) to detect OPES processor presence within the end's trust domain. Such an "end" should be able to see a trace entry, but does not need to be able to interpret it beyond identification of the trust domain(s).

Second, the domain administrator SHOULD be able to take a trace entry (possibly supplied by an "end" as an opaque string) and interpret it. The administrator must be able to identify the OPES processor(s) involved and may be able to identify applied adaptation services along with other message-specific information. That information SHOULD help to explain what OPES agent(s) were involved and what they did. It may be impractical to provide all the required information in all cases. This document views a trace record as a hint, as opposed to an exhaustive audit.

Since the administrators of various trust domains can have different ways of looking at tracing, they MAY require freedom of choice in what to put in trace records and how to format them. Trace records should be easy to extend beyond basic OPES requirements. Trace management algorithms should treat trace records as opaque data to the extent possible.
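
The two uses of traces, and the opaque treatment of records, can be illustrated with a minimal Python sketch. The record layout (a trust domain name plus an opaque string) and the example domain names are assumptions made for illustration; this document does not define a trace record format.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class TraceRecord:
    """Hypothetical record layout, not defined by this document."""
    trust_domain: str  # the only part an "end" needs to interpret
    opaque: str        # domain-defined content, never parsed here


def domains_seen(trace: List[TraceRecord]) -> Set[str]:
    """First use: an "end" detects which trust domains were involved."""
    return {record.trust_domain for record in trace}


def records_for_admin(trace: List[TraceRecord], domain: str) -> List[str]:
    """Second use: hand a domain's opaque entries to its administrator.

    The entries are passed through unchanged; only the administrator
    of that domain is expected to know how to interpret them.
    """
    return [r.opaque for r in trace if r.trust_domain == domain]
```

Because neither function inspects the opaque part, a domain can change or extend what it records without affecting trace management.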

Entities in one trust domain are not expected to be able to get all OPES-related feedback from entities in other trust domains. For example, if an end-user suspects that a served message is corrupted by a callout service, there is no guarantee that the user will be able to identify that service, contact its owner, or debug it _unless_ the service is within the user's trust domain. This is no different from the current situation where it is impossible, in general, to know the contact person for an application on an origin server that generates corrupted HTML; and even if the person is known, one should not expect that person to respond to end-user queries.


4.3 In-Band Tracing

The architecture [] states that traces must be in-band. This requirement limits the number of application protocols that OPES can adapt and the amount of detail a trace record can convey.

The set of protocols that can support tracing for an OPES flow must be clearly documented. The architecture does not prevent implementers from developing out-of-band protocols and techniques to address the above limitation.


4.3.1 Tracing information granularity and persistence levels

The information may be:

- message-related, e.g. "virus checking done - passed", "content filtering applied", "translated from quibbish to danqush". Such information should be supplied with each message and indicates that a specific action was taken. All data that describes specific actions performed for the message should be provided with that message, as there is no other way to find message-level details later. An OPES application (including the OPES processor and all application modules and callout servers involved) is not supposed to keep volatile information after request processing is done.

- session related. 
TBD

Session-level data must be preserved for the duration of the session. The OPES processor is responsible for inserting notifications if session-level information changes.

Examples of session-related information are "virus checker
abcd build 123 enabled" and "OPES server id=xyz present".

- log information id. This may be used, e.g., for accounting and non-repudiation purposes. Detailed information referenced by this id may not be available online but can be retrieved later by some off-line procedure.

- server-related persistent information, e.g. "OPES center of authority <URI>", "privacy policy <URI>". It may also be presented once per session, and it does not change between sessions.

- end-point related data: what profile is activated (profile ID), where to get profile details, where to set preferences. I'm not sure how far we should go in this direction. 


4.4 OCP Support for Tracing

It is the task of an OPES processor to add trace records to application messages. In this case, the OCP protocol is not affected by tracing requirements, for the following reasons:

a) Exclusive assignment simplifies the protocol.

b) There are use cases where callout services adapt payload regardless of the application protocol in use and leave header adjustment to OPES processor or other services. For example, think of a generic text translation or image modification service; such services require payload encoding knowledge but can be application-independent if OPES processor can supply them with just the payload.

c) The OPES processor is always _able_ to trace its own invocation and service(s) execution because the OPES processor must understand the application protocol. Assigning these tracing tasks to callout servers is just an optimization in cases where callout servers manipulate application message headers.

d) The OPES processor may not be able to trace all services that are performed at the callout server.

e) It makes OPES compliance checks easier when remote third-party callout servers are used.

f) Servers or services MAY add their own OPES trace records, of course.
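
The division of labor described in this section can be sketched in Python as follows. The message model, the trace entry strings, and the function names are hypothetical; the sketch only shows the OPES processor recording its own invocation and each service it invokes, while a callout service is handed just the payload.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Message:
    """Hypothetical application message: a payload plus trace entries."""
    payload: str
    trace: List[str] = field(default_factory=list)


# A callout service here is any function from payload to payload;
# it needs no knowledge of the application protocol or of tracing.
Service = Callable[[str], str]


def process(message: Message, processor_id: str,
            services: List[Tuple[str, Service]]) -> Message:
    """Run services on the payload; the processor adds all trace records."""
    trace = list(message.trace)
    trace.append("processor=" + processor_id)
    payload = message.payload
    for name, service in services:
        payload = service(payload)       # service sees the payload only
        trace.append("service=" + name)  # processor records the service
    return Message(payload=payload, trace=trace)
```

For instance, calling process(Message("hello"), "p1", [("upper", str.upper)]) yields a message whose payload was adapted by the service and whose trace was written entirely by the processor.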


4.5 Protocol Binding to Tracing

How tracing is added is application protocol-specific and will be documented in separate drafts. This work documents what tracing information is required and some common tracing elements.


5. Security Considerations