-----Original Message-----
From: Alex Rousskov [mailto:rousskov(_at_)measurement-factory(_dot_)com]
Sent: Wednesday, April 09, 2003 2:17 AM
To: ietf-openproxy(_at_)imc(_dot_)org
Subject: Re: Tracing Draft version-0004082003
On Tue, 8 Apr 2003, Abbie Barbir wrote:
Attached is the -00 version of the tracing draft.
Please keep in mind it is work in progress.
Feedback is required.
Abbie,
This is a great start, especially given the very short
time you had to write this first version of the draft!
Specific comments are inlined. I did not review the overall
structure of the draft because I think it is premature to do
that (need more "meat" and it is not yet clear whether some
of the current sections are going to stay).
N.B. Please format using 72 character line length if possible -- it
makes it easier (for some of us) to quote and comment.
Thank you.
OPES tracing facility
How about just "OPES Tracing"? Does "Facility" mean something
specific? Will there be other (non "facility") OPES tracing drafts?
draft-ietf-opes-tracing
1. Introduction
The Open Pluggable Edge Services (OPES) architecture enables
cooperative application services (OPES services) between a data
provider, a data consumer, and zero or more OPES processors. The
application services under consideration analyze and possibly
transform application-level messages exchanged between the data
provider and the data consumer.
The execution of such services is governed by a set of rules
installed on the OPES processor. The rules enforcement can trigger
the execution of service applications local to the OPES processor.
Alternatively, the OPES processor can distribute the responsibility
of service execution by communicating and collaborating with one or
more remote callout servers. As described in [], an OPES processor
communicates with and invokes services on a callout server by using
a callout protocol.
In [], the IAB has required the OPES working group to support
tracing and notification. This document addresses these IAB
requirements.
IAB (RFC 3238) does not require support of anything. It lists
considerations that the WG should _address_:
"The purpose of this document is not to recommend specific solutions
for OPES, or even to mandate specific functional requirements....
Instead, these are recommendations on issues
that any OPES solutions standardized in the IETF should be required
to address, similar to the "Security Considerations" currently
required in IETF documents [RFC2316]. As an example, one way to
address security issues is to show that appropriate security
mechanisms have been provided in the protocol, and another way to
address security issues is to demonstrate that no security issues
apply to this particular protocol."
There is a huge difference between "requiring support" and
enumerating "issues that OPES solutions should be required to
address". Let's not shoot ourselves in the foot. :-)
Also, RFC 3238 does not contain the word "trace" or
"tracing", just "notification".
I would suggest saying something like this:
IAB has required OPES solutions to address end user and
content provider notification concerns. This document
specifies tracing mechanisms that address those concerns.
It would be nice to explain somewhere why we are not calling
this document OPES Notification [Facility].
The document examines the effect of tracing and notification
requirements on OPES architecture and callout protocol []. In
particular, the work identifies traceable entities in an OPES flow
and how this information is relayed to end points.
what information?
As per the architecture document [], there is a requirement of
relaying tracing information in-band. The document investigates this
possibility and discusses possible methods that could be used to
detect faulty OPES processors by end points on an OPES flow.
What about faulty callout servers? Do we have a term that
describes OPES system (processor + callout servers + whatever
else is out there that is OPES-related)?
The document is organized as follows: Section 2 considers ?
Section 3?
etc.
2. Basic Definitions
- REFERENCE POINT - a reference that may be used out-of-band to
perform a specific function.
An example may be URI for the privacy policy, center of authority
URI, server address, etc. Usually no protocol is provided to
access the reference point.
- INFORMATION POINT - implies presence of the protocol to access
detailed information at this point. Example may be URI to get
a certificate for virus checker or content filter, examine
and set profile setting and active preferences.
- IDENTIFIER - provides a unique binding to detailed persistent
information. For example "transformation-applied : fe123" gives a
participant ability to enquire (and maybe cache) details of the
transformation fe123. Use of such (opaque) identifiers does not
require prior knowledge and does not create a burden of storing
additional information - this is just a tag for persistent
information (not message-specific).
The above classification seems like a result of protocol
over-engineering to me. Would it be possible to avoid
introducing any classifications/terms until the draft starts
actually _using_ them for a specific purpose? This will save
us a lot of time -- there is no reason (and it is very
difficult) to discuss something that is not used (yet).
3. Requirements for Notification in an OPES Flow
This section takes a look at the IAB requirements (3.1) and (3.2)
and how they relate to notification.
3.1 Notification Requirements
There are requirements on the architecture [] to assist content
provider applications in detecting and responding to data consumer
applications actions by OPES intermediaries that are deemed
inappropriate by the content provider. This is referred to as
notification.
In general, notification goes in opposite direction of tracing and
cannot be attached to application messages that it notifies about.
If we compare notification with tracing like that, we should
talk about/define tracing first and only then provide a comparison.
An "opposite direction" illustration (figure) would be nice here!
This can be done
"This has to be done" ?
out-band and may require the development of a new protocol. In
general, this opposite-direction, outside-of-message scheme is
difficult to support.
What does it mean "difficult to support"? Consider removing
that sentence. (it's OK in a conversation, but not in a spec)
This text is of great importance because we are, essentially,
saying that the "ideal" scheme that IAB folks envisioned is
not practical. We need to be as specific as possible here.
NOTE: When would a content provider issue such request?
What request?
How would such
mechanism be used? Randomly, or on a statistical basis?
Or manually?
Is such a scheme of practical relevance?
In the above, there is no definition of the "mechanism"
detailed enough to answer these questions.
3.1.1 Notification Concerns
A major concern with notification is scalability. For example, it is
not practical to assume that a content provider is interested in
receiving a notification for every HTTP response sent out. As such,
a mechanism for explicit request of notification may be required.
Why is it not practical?! Some content providers would love
to know exactly what their clients are doing with their
content. They would be willing to double server capacity to
handle the load.
"Not scalable" usually implies non-linear (hopefully
exponential) growth with the number of messages or
notification-generation points. You need to show such growth
(or something similar) if you want to play the scalability
card. What does not scale and when?
Privacy is another concern. Maybe a user doesn't want to reveal to
any content provider all the OPES services that have been applied on
her behalf. For example, why should every content provider know what
exact virus scanner a user is using?
Consider rephrasing to something like this:
End point privacy is a concern. An end user may consider information
about OPES services applied on her behalf as private. For example,
if translation for a braille device has been applied, it can be
concluded that the user has eyesight problems; such information may
be misused if the user is applying for a job online. Similarly, a
content provider may consider information about its OPES services
private. For example, use of a specific OPES intermediary by a high
traffic volume site may indicate business alliances that have not
been publicly announced yet.
Also consider adding something like this:
Security is a concern. An attacker may benefit from knowledge
of internal OPES services layout, execution order, software
versions and other information likely to be present in
automated notifications.
Also consider adding something like this:
The level of available details in notifications versus content
provider interest in supporting notification is a concern.
Experience shows that content providers often require very detailed
information about user actions to be interested in notifications at
all. For example, Hit Metering protocol (RFC XXX) has been designed
to supply content providers with proxy cache hit counts, in an
effort to reduce cache busting behavior which was caused by content
providers' desire to get accurate site "access counts". The Hit
Metering protocol is not widely deployed today because it turns out
that content providers are not interested enough in "just hit
counts"; only knowing things like client IP addresses, browser
versions, or cookies would make providers interested enough to
support cache hit notifications. Hit Metering experience is very
relevant because Hit Metering protocol was designed to do for HTTP
caching intermediaries what OPES notifications are meant to do for
OPES intermediaries.
(We would need to verify the above info with Hit Metering
authors, but to the best of my knowledge it is correct)
3.2 How to Fulfill Notifications Requirements
IAB consideration (3.1) [] suggests that the overall OPES framework
needs to assist content providers in detecting and responding to
client-centric actions by OPES intermediaries that are deemed
inappropriate by the content provider.
This requirement is hard to implement since most client-centric
actions happen
What do we mean by "implement"? Write a spec? Code it up?
Deploy? Other? Consider rephrasing to something like:
To address this requirement directly, one would have to ...
and then finish with a statement that we are addressing it
indirectly by providing tracing mechanisms that assist
interested providers in detecting and responding to
inappropriate OPES actions. Say how they assist (you already
do the latter now, below).
_after_ the application message left the content provider(s) and,
thus, notifications cannot be piggy-backed to application messages
and have to travel in the opposite direction of traces.
Note: Need to explain more here.
IAB consideration (3.2) [] can be satisfied by the development of a
tracing facility. In this regard, it is recommended that tracing
SHOULD be always-on, just like HTTP Via headers now. This should
eliminate notification as a separate requirement.
Why not MUST be always-on? We are talking about
interoperability here (a broken intermediary that does not
use Via-OPES headers is an interoperability problem because
it cannot be bypassed).
If the OPES end points cooperate then notification can be supported
by tracing. It is recommended that content providers that suspect or
experience difficulties do the following:
Recommended is too strong, IMO. "For example, ..." or
"providers could ...", would be more appropriate.
1. Check whether requests they receive pass through
OPES intermediaries. Presence of OPES tracing info
will determine that. This check is only possible for
request/response protocols. For other protocols (e.g.,
broadcast or push), the provider would have to assume
that OPES intermediaries are involved until proven
otherwise.
2. If OPES intermediaries are suspected,
request OPES traces from potentially affected user(s).
The trace will be a part of the application message
received by the user software. If users cooperate,
the provider(s) have all the information they need.
If users do not cooperate, the provider(s) cannot
do much about it (they might be able to deny service
to uncooperative users in some cases).
3. Some traces may indicate that more information
is available by accessing certain resources on the
specified OPES intermediary or elsewhere. Content
providers may query for more information in that
case.
4. If everything else fails, providers can enforce
no-adaptation policy using appropriate OPES
bypass mechanisms and/or end-to-end mechanisms.
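As an illustration only, step 1 above could be sketched as follows
in Python. The header name "Via-OPES" is borrowed from the informal
mention earlier in this message and is an assumption: the draft does
not define any trace header yet. The function names and the
comma-separated entry format (modeled on HTTP Via) are hypothetical
as well.

```python
# Hypothetical sketch: detect OPES trace information in request headers.
# "Via-OPES" is an assumed header name; the draft has not defined one.
TRACE_HEADER = "via-opes"


def opes_trace_entries(headers):
    """Return trace entries found in a list of (name, value) pairs."""
    entries = []
    for name, value in headers:
        # Header names are compared case-insensitively, as in HTTP.
        if name.strip().lower() == TRACE_HEADER:
            # Assumption: entries are comma-separated, like HTTP Via.
            entries.extend(v.strip() for v in value.split(",") if v.strip())
    return entries


def passed_through_opes(headers):
    """True if at least one trace entry is present in the headers."""
    return bool(opes_trace_entries(headers))
```

A provider would call passed_through_opes() on each incoming request
and move on to steps 2-4 only when it returns True. As noted in step
1, this check only makes sense for request/response protocols.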
4. Requirements for Tracing in an OPES Flow
In [], the IAB has required that the OPES architecture provide
tracing and debugging facilities. From [], the OPES architecture
SHOULD assist consumer application in detecting the behavior of OPES
processors and callout servers to potentially allow them to identify
imperfect or compromised operations.
The OPES architecture document [] has addressed these concerns at a
higher level. The architecture requires that tracing be feasible on
the OPES flow per OPES processor using in-band annotation. This
requirement provides a participant with the ability to detect OPES
intermediaries in the course of normal interaction.
4.1 What is traceable?
End OPES points must be able to trace the following:
Consider more accurate "The following entities can be
identified in a trace"
1. OPES processors that are involved.
2. OPES services (including callout services) that were performed on
a request or response.
"... performed on an application message"
3. TBD
Also, we need to add MUST/SHOULD/MAY to each traceable
entity, I guess.
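To make the first two items concrete, a trace entry might be modeled
roughly as below. This is only a sketch in Python: the class name,
field names, and the identifiers in any usage are hypothetical, and
nothing here is meant to suggest a wire format. Item 3 is still TBD
and is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class TraceEntry:
    """One hypothetical trace entry, covering items 1 and 2 above."""
    processor_id: str                             # item 1: OPES processor
    services: list = field(default_factory=list)  # item 2: services performed


def traceable_entities(trace):
    """Split a list of TraceEntry objects into processors and services."""
    processors = [entry.processor_id for entry in trace]
    services = [svc for entry in trace for svc in entry.services]
    return processors, services
```

Whether each field is MUST, SHOULD, or MAY is exactly the open
question above; the sketch treats the processor as always present and
the service list as possibly empty, which is just one option.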
4.2 Tracing and Trust Domains
Tracing is limited to a trust domain. Entities outside of that
domain may or may not see any traces, depending on domain policies
or configuration. Therefore, there is no need for mandatory
end-to-end tracing facility. For example, if an OPES system is on
the content provider "side", end-users are not guaranteed any
traces. If an OPES system is working inside end-user domain, the
origin server is not guaranteed any traces related to user requests.
I am not sure about the above. It contradicts our statement
that we are addressing IAB concerns. If there is no trace, we
are not. I think it is reasonable to say that there MUST be
at least one trace entry per "system". (A trust domain may
include several such systems/entities, see the trust domain
definition).
There are two distinct uses of traces. First, a trace SHOULD enable
the "end" (content producer or consumer) to detect OPES processor
presence within the end's trust domain. Such an "end" should be able
to see a trace entry, but does not need to be able to interpret it
beyond identification of the trust domain(s).
Second, the domain administrator SHOULD be able to take a trace
entry (possibly supplied by an "end" as an opaque string) and
interpret it. The administrator must be able to identify OPES
processor(s) involved and may be able to identify applied adaptation
services along with other message-specific information. That
information SHOULD help to explain what OPES agent(s) were involved
and what they did. It may be impractical to provide all the required
information in all cases. This document views a trace record as a
hint, as opposed to an exhaustive audit.
Since the administrators of various trust domains can have various
ways of looking into tracing, they MAY require freedom of choice in
what to put in trace records and how to format them. Trace records
should be easy to extend beyond basic OPES requirements. Trace
management algorithms should treat trace records as opaque data to
the extent possible.
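One way to read "opaque data" here, sketched in Python under the
assumption that a trace is simply an ordered list of strings (the
draft does not say so, and the function name is hypothetical): an
intermediary appends its own record and never parses or rewrites the
records added by others.

```python
# Hypothetical sketch: trace management that treats records as opaque.
def append_trace(existing, own_record):
    """Return a new trace with own_record added at the end.

    Earlier records are copied as-is: no parsing, no reformatting.
    """
    if not own_record:
        raise ValueError("a trace record must not be empty")
    return list(existing) + [own_record]
```

With this reading, each trust domain stays free to choose its own
record format, since nobody downstream depends on being able to
interpret it.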
It is not expected that entities in one trust domain will be able to
get all OPES-related feedback from entities in other trust domains.
For example, if an end-user suspects that a served message is
corrupted by a callout service, there is no guarantee that the user
will be able to identify that service, contact its owner, or debug
it _unless_ the service is within the user's trust domain. This is
no different from the current situation where it is impossible, in
general, to know the contact person for an application on an origin
server that generates corrupted HTML; and even if the person is
known, one should not expect that person to respond to end-user
queries.
The above should have "system" granularity, not "domain"
granularity because there can be different privacy policies
within one trust domain.
4.3 In-Band Tracing
The architecture [] states that traces must be in-band. This
requirement limits the number of application protocols that OPES can
adapt and the amount of details a trace record can convey. The set
of protocols that can support tracing for OPES Flow must be clearly
documented. The architecture does not prevent implementers from
developing out-of-band protocols and techniques to address the above
limitation.
We should not (cannot) document the set of supported
protocols directly, IMO. We should document _requirements_
for application protocols that want to support OPES traces.
This is similar to OCP application bindings.
4.3.1 Tracing information granularity and persistence levels
The information may be:
- message-related, e.g. "virus checking done - passed", "content
filtering applied", "translated from quibbish to danqush". Such
information should be supplied with each message and indicate that
specific action was taken. All data that describes specific actions
performed for the message should be provided with that message, as
there is no other way to find message level details later. OPES
application (including OPES processor and all application modules
and callout servers involved) is not supposed to keep volatile
information after request processing is done.
- session related.
TBD
Session level data must be preserved for the duration of the
session. OPES processor is responsible for inserting notifications
if session-level information changes.
Examples of session-related information are "virus checker abcd
build 123 enabled", "OPES server id=xyz present".
- log information id. This may be used e.g. for accounting and
non-repudiation purposes. Detailed information referenced by this id
may be not available online but can be retrieved later by some
off-line procedure.
- server related persistent information, e.g. "OPES center of
authority <URI>", "privacy policy <URI>". It may be also presented
once per session and it does not change between sessions.
- end-point related data: what profile is activated (profile ID),
where to get profile details, where to set preferences. I'm not sure
how far we should go in this direction.
The above classification seems like a result of protocol
over-engineering to me. Would it be possible to avoid
introducing any classifications/terms until the draft starts
actually _using_ them for a specific purpose? This will save
us a lot of time -- there is no reason (and it is very
difficult) to discuss something that is not used (yet).
4.4 OCP Support for Tracing
It is the task of an OPES processor to add trace records to
application messages. In this case, OCP protocol is not affected by
tracing requirements for the following reasons:
Either say "If it is the task..." or remove "In this case, " :-)
a) Exclusive assignment simplifies the protocol.
b) There are use cases where callout services adapt payload
regardless of the application protocol in use and leave header
adjustment to OPES processor or other services. For example, think
of a generic text translation or image modification service; such
services require payload encoding knowledge but can be
application-independent if OPES processor can supply them with just
the payload.
c) OPES processor is always _able_ to trace its own invocation and
service(s) execution because OPES processor must understand the
application protocol. Assigning these tracing tasks to callout
servers is just an optimization in cases where callout servers
manipulate application message headers.
d) May not be able to trace all services that are done at the
callout server.
e) It makes OPES compliance checks easier when remote third party
callout servers are used.
f) Servers or services MAY add their own OPES trace records, of
course.
I wonder if it is appropriate for the draft to explain the
motivation behind a decision, at such lengths? Should we just
state requirements instead?
4.5 Protocol Binding to Tracing
How tracing is added is application protocol-specific and will be
documented in separate drafts. This work documents what tracing
information is required and some common tracing elements.
5. Security Considerations
I would suggest adding rules from the message below (or
something similar). They are very specific things we can
discuss/fix/polish, and they actually shape the
conventions/intent of many tracing draft sections. The draft
should be mostly about specific requirements, not our
motivation or reasoning about what we might do and what
design alternatives we have available.
http://www.imc.org/ietf-openproxy/mail-archive/msg01875.html
Please note that I am not saying the above rules are perfect!
I am just saying we need more specific "bones" to grow the
draft "meat" around, or we will never exit the jelly state.
Thank you,
Alex.