Hi,
We just submitted the attached draft, "Requirements for OPES Callout
Protocols", in an attempt to get a discussion on callout protocol
requirements going. Please post any comments and feedback to this list.
-Markus
Internet Draft                                    A. Dracinschi Sailer
Expires: May, 2002                                 Lucent Technologies
                                                               V. Hilt
Document:                                            Univ. of Mannheim
draft-dracinschi-opes-callout-requirements.txt              M. Hofmann
                                                   Lucent Technologies
                                                           R. R. Menon
                                                                 Intel
Category: Informational                              November 14, 2001
Requirements for OPES Callout Protocols
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
In the context of Content Networks, Open Pluggable Edge Services
(OPES) represent an infrastructure that enables quick and easy
creation of value-added networking services. This document attempts
to present requirements for callout protocols that provide
communication between an in-path OPES intermediary (e.g. a cache)
and remote callout servers.
Table of Contents
1 Terminology
2 Introduction
3 Design Considerations
3.1 Basic Requirements
Dracinschi Expires MAY 2002 [Page 1]
Internet Draft Callout Protocol Requirements November 2001
3.1.1 Service identification
3.1.2 Message exchange style
3.1.3 Message context
3.1.4 Payload transparency
3.1.5 Pipelining requests
3.1.6 Message segmentation
3.2 Increasing Efficiency
3.2.1 Caching responses
3.2.2 Channels
3.2.3 Buffering messages
3.2.4 Preview
3.2.5 Partial content
3.2.6 Multiple services on the same message
4 Security Considerations
5 Acknowledgments
6 References
7 Author's Addresses
Full Copyright Statement
1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1].
OPES related terms are to be interpreted as defined and used in [2].
2 Introduction
Content Networks, also known as Content Distribution Networks or
Content Delivery Networks (CDNs), are of increasing importance to
the overall architecture of the web. CDNs improve the delivery of
content from an origin server to content consumers. Content networks
can be seen as an overlay network on top of the traditional packet
network infrastructure. As in the CDN space, there is a need to
deliver a variety of services to corporate/enterprise Intranets.
[2] introduces Open Pluggable Edge
Services (OPES), an infrastructure for adding valuable content
services to a CDN or an Intranet. Examples of such services include
dynamic content assembly at the network edge, URL filtering,
language translation, location-based services, content adaptation
for different devices based on device characteristics, privacy
services, etc.
This document presents requirements for callout protocols in the
context of the OPES architecture. A callout protocol supports
message exchanges between an in-path OPES intermediary and a remote
callout server. Intermediaries are application gateway devices
located in the path between a client and an origin server. Caching
proxies are probably the most commonly known and used intermediaries
today. A remote callout server is a cooperating server that runs
OPES service modules on behalf of an OPES intermediary. Remote
callout servers are usually employed in an OPES framework to either
offload the OPES intermediary for better scalability or to provide
value-added services not available on either the origin server or
the OPES intermediary.
Section 3 attempts to summarize the requirements for
such a callout protocol.
3 Design Considerations
3.1 Basic Requirements
A callout protocol's primary purpose is to efficiently forward, from
the intermediary to the remote callout server, request/response
messages exchanged on the content path (e.g. HTTP, RTSP, or RTP
messages) and information about the service to be executed on those
messages at the remote server. In order to fulfill this task, a
callout protocol SHOULD consider the following design issues:
service identification, message exchange style, message context,
payload transparency, pipelining and message segmentation.
3.1.1 Service identification
A callout protocol MUST be able to uniquely identify a remote
callout service that is required to be executed on a message. An
adequate way to provide such identification MAY be a URI. Such a URI
MUST contain the complete hostname and the path identifying the
service requested. The method of determining the name of an
appropriate service is outside of the scope of a callout protocol.
An example of such a URI is ucp://my.callout-server.com/service1
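The hostname-plus-path requirement can be illustrated with a minimal sketch. It assumes Python's standard urllib.parse is acceptable for splitting a "ucp" URI; the function name parse_service_uri is hypothetical, and the check only confirms that a hostname and a path are present, not that the hostname is fully qualified.

```python
from urllib.parse import urlsplit


def parse_service_uri(uri: str) -> tuple[str, str]:
    """Return (hostname, path) of a service URI, or raise ValueError.

    Hypothetical helper: the draft only requires that both parts exist.
    """
    parts = urlsplit(uri)
    if not parts.hostname:
        raise ValueError("service URI lacks a hostname")
    if parts.path in ("", "/"):
        raise ValueError("service URI lacks a service path")
    return parts.hostname, parts.path


# The example URI from the text above.
host, path = parse_service_uri("ucp://my.callout-server.com/service1")
```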
3.1.2 Message exchange style
A callout protocol MUST implement a request/reply communication
style. Initiating a callout always requires a request containing the
encapsulated message (or parts of it) to be transferred to a callout
server. In turn, this server MUST always send back a response
containing either the unmodified message, a modified version of the
message, a status code (that triggers a certain reaction from the
intermediary), or an error code.
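As an illustration only, the four possible reply contents could be modeled as below. The names ReplyKind, Reply, and message_to_forward are hypothetical, and what an intermediary does on a status or error code is left to the caller because this document does not define it.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ReplyKind(Enum):
    UNMODIFIED = auto()  # reply carries the original message
    MODIFIED = auto()    # reply carries a changed message
    STATUS = auto()      # reply carries only a status code
    ERROR = auto()       # reply carries only an error code


@dataclass
class Reply:
    kind: ReplyKind
    body: bytes = b""
    code: int = 0


def message_to_forward(original: bytes, reply: Reply) -> Optional[bytes]:
    """Return the bytes to forward, or None if the reply is only a code."""
    if reply.kind is ReplyKind.UNMODIFIED:
        return original
    if reply.kind is ReplyKind.MODIFIED:
        return reply.body
    # STATUS or ERROR: the caller decides how to react to reply.code.
    return None
```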
3.1.3 Message context
Some remote callout services require additional information to
perform their service. One example of such information is the HTTP
request, for a service that operates on an HTTP response. Another
example is a command line parameter (e.g. the destination language
for a translation service). In general, a message context could be
any information available in the local execution environment that is
needed by a remote callout service.
Basically, there are two methods of transferring the message context
to the remote server. First, it can be part of the URL (e.g. as user
id, additional path elements, or a query parameter) with which a
service is invoked. An example of such a URL is
ucp://volker@my.callout-server.com:8080/translation-service/fast_translation?lang=german.
The second possibility is to transfer the message context within a
separate field of the request header. As with the payload, no
assumptions SHOULD be made about the type or structure of the message
context field. Instead, the message context SHOULD be taken as binary
data that is encapsulated in the request. An example of information
in such a header field is the HTTP request that is shipped along with
an HTTP response.
Both methods of transferring the message context have their
advantages and disadvantages. Transferring the message context
within the URL is simple and produces very low overhead. However,
the size and complexity of information contained in a URL is limited
(e.g. encoding an HTTP request within a URL might not be a good
alternative). Using a separate header field introduces some overhead
but is much more flexible than using a URL.
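The two methods can be contrasted in a short sketch. It assumes Python's urllib.parse for taking the example URL apart; the dictionary layout and the sample HTTP request are illustrative assumptions, not part of any callout protocol.

```python
from urllib.parse import parse_qs, urlsplit

# Method 1: context carried in the URL (user id, path elements, query).
uri = ("ucp://volker@my.callout-server.com:8080"
       "/translation-service/fast_translation?lang=german")
parts = urlsplit(uri)
url_context = {
    "user": parts.username,
    "path_elements": parts.path.strip("/").split("/"),
    "query": parse_qs(parts.query),
}

# Method 2: context carried as opaque bytes in a separate header field.
# Here the context is a (made-up) HTTP request belonging to the response
# being processed; the callout protocol would not interpret these bytes.
header_context = b"GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
```

The first form is compact but only fits small, simple values; the second accepts arbitrary binary data at the cost of an extra field.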
Although it would be possible to let the callout server modify parts
of the message context and return it along with the response, this
SHOULD NOT be allowed. It would substantially increase the
complexity of an intermediary since the intermediary would need to
ensure the consistency of the message context, especially if multiple
requests are issued in parallel.
3.1.4 Payload transparency
A callout protocol SHOULD make no assumptions about the protocol
used on the content path (in particular, it SHOULD NOT assume that
this protocol is HTTP). Instead, a callout protocol SHOULD take the
content path protocol messages as binary data and encapsulate these
messages during the transfer to and from a remote callout server.
This requirement does not prevent a design in which a basic callout
protocol captures common aspects of the callout process and an
additional payload specification tailors this basic protocol to the
needs of a certain content path protocol (similar to the model used
by RTP). Nevertheless, the basic callout protocol SHOULD be
independent of the protocol used on the content path.
If possible, a callout protocol SHOULD NOT assume a certain
communication pattern (e.g. request/reply) on the content path
either. The rationale behind payload transparency is that a
callout protocol SHOULD be capable of handling different content
path protocols to avoid the re-implementation of similar
functionality for each of these protocols. Examples of common
content path protocols are HTTP, RTSP, SMTP, NNTP, and RTP.
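One way to picture payload transparency is a frame that carries the content-path message as opaque bytes next to a payload-type tag. The layout below (one-byte tag length, four-byte body length, tag, body) and the names encapsulate and decapsulate are assumptions made for this sketch; it applies to a message or segment whose length is already known.

```python
import struct

_HEADER = "!BI"  # tag length (1 byte), body length (4 bytes), network order


def encapsulate(payload_type: str, payload: bytes) -> bytes:
    """Wrap an opaque content-path message; the body is never parsed."""
    tag = payload_type.encode("ascii")
    return struct.pack(_HEADER, len(tag), len(payload)) + tag + payload


def decapsulate(frame: bytes) -> tuple[str, bytes]:
    """Recover (payload_type, payload) from a frame built by encapsulate."""
    tag_len, body_len = struct.unpack_from(_HEADER, frame)
    start = struct.calcsize(_HEADER)
    tag = frame[start:start + tag_len].decode("ascii")
    body = frame[start + tag_len:start + tag_len + body_len]
    return tag, body
```

Because the body is handled as bytes only, the same two functions would serve HTTP, RTSP, or RTP payloads; a payload specification could then define what each tag means.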
3.1.5 Pipelining requests
It is very likely that a remote callout service is called many times
in sequence with very little time between consecutive requests.
For example, an ad insertion service might be called for every HTTP
message passing through an intermediary. For this reason, a callout
protocol MUST be capable of issuing a request without having
received the response for a previous request. In other words, the
protocol MUST be capable of pipelining multiple requests.
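A minimal sketch of pipelining follows. It assumes that each request carries an identifier so that replies can be matched to outstanding requests; the document does not prescribe such identifiers, and the class name Pipeline is hypothetical. No network I/O is performed.

```python
from collections import OrderedDict
from itertools import count


class Pipeline:
    """Tracks requests that were issued but not yet answered."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.pending: "OrderedDict[int, bytes]" = OrderedDict()

    def send(self, message: bytes) -> int:
        """Issue a request without waiting for earlier replies."""
        request_id = next(self._ids)
        self.pending[request_id] = message  # a real sender would write here
        return request_id

    def on_reply(self, request_id: int, reply: bytes) -> tuple[bytes, bytes]:
        """Match a reply to its request and stop tracking it."""
        original = self.pending.pop(request_id)
        return original, reply
```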
3.1.6 Message segmentation
The messages exchanged on the content path can be very large.
Examples are huge web pages, PostScript or PDF documents,
audio and video clips and streamed audio and video. Usually, these
messages are segmented and transferred in a stream of small packets.
For example, HTTP supports this type of transmission with its
chunked transfer encoding. A callout protocol SHOULD be able to
redirect the segments of a message to the callout server as soon as
the intermediary receives them. The intermediary SHOULD NOT try to
receive the entire message before it is sent to the callout server.
Doing so would substantially increase the processing time of a message
and would not be possible at all for media streams. An
implication for the protocol design is that the size of messages is
not known at the time the first packets are sent to the callout
server.
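The forwarding behavior described above can be sketched with a generator that passes each segment on as soon as it arrives. Since the total size is unknown up front, the sketch marks the end with an empty segment; that terminator and the name forward_segments are assumptions, loosely modeled on the zero-length last chunk of HTTP chunked transfer encoding.

```python
from typing import Iterable, Iterator

END_OF_MESSAGE = b""  # hypothetical terminator: an empty segment


def forward_segments(incoming: Iterable[bytes]) -> Iterator[bytes]:
    """Yield segments one by one instead of collecting the whole message."""
    for segment in incoming:
        if segment:  # empty input is skipped; empty is reserved as the marker
            yield segment
    yield END_OF_MESSAGE


forwarded = list(forward_segments([b"<html>", b"<body>", b"</html>"]))
```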
3.2 Increasing Efficiency
Typically, an intermediary has to handle large amounts of network
traffic. Depending on the rule configuration and the services
provided, a significant part of this traffic may be sent to a remote
callout server. For this reason, efficiency SHOULD be one of the
major design goals for a callout protocol. Performance measurements
on the ICAP protocol indicate that the vast majority of processing
time is spent copying messages from the content path to the callout
server and back. Thus, the efficiency of a callout protocol can be
increased if the amount of data that has to be transmitted is
minimized. The following concepts may help to achieve this goal.
3.2.1 Caching responses
A callout protocol SHOULD support the caching of responses. To do
so, a remote callout server MUST be able to indicate whether and for
how long a response MAY be cached by an intermediary. If a response is
cacheable and still valid, an intermediary MAY satisfy identical
requests by using the cached response. Determining which requests
are identical is outside of the scope of a callout protocol. If a
server has allowed the caching of a response for a certain period of
time, there is no means for it to revise this decision.
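A small sketch of such a response cache is given below. The cache key is supplied by the caller, because deciding which requests are identical is outside the scope of the protocol; the class name ResponseCache, the max_age parameter in seconds, and the injectable clock are assumptions of this example.

```python
import time
from typing import Callable, Dict, Hashable, Optional, Tuple


class ResponseCache:
    """Keeps responses for as long as the callout server allowed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, bytes]] = {}

    def store(self, key: Hashable, response: bytes, max_age: float) -> None:
        """Cache a response; max_age <= 0 means 'not cacheable'."""
        if max_age > 0:
            self._entries[key] = (self._clock() + max_age, response)

    def lookup(self, key: Hashable) -> Optional[bytes]:
        """Return a still-valid cached response, or None."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return response
```

Note that the cache offers no invalidation call, matching the statement that a server cannot revise its decision once a lifetime has been granted.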
3.2.2 Channels
Since it can be assumed that an intermediary sends a large number of
requests to a remote callout server, it is reasonable to open a
persistent channel to a remote callout server over which all
messages are transferred. This will substantially reduce the network
overhead for the transmission of one message. An intermediary may
decide when it opens or closes a channel. A reasonable
policy might be to establish a channel at the time the first request
for a service is received and to close the channel after a timeout.
The policy of opening and closing a channel SHOULD NOT be part of
the protocol.
During the creation of a channel, an intermediary has the chance to
negotiate service parameters, associated with that channel, with the
remote callout service. These parameters apply to all messages
exchanged over that channel. Examples of such parameters are the
service URI, the payload type, or the service context. Exchanging
this information once at the channel setup reduces some of the
protocol overhead. Although these savings are modest, they come at
almost no cost. Furthermore, parameters can be negotiated once
during channel creation, whereas negotiating them for each message
might become time-critical.
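The open-on-first-request, close-after-timeout policy mentioned above might look as follows on the intermediary side. This is a local policy sketch, not protocol behavior; Channel, ChannelPool, and the idle_timeout value are hypothetical, and the negotiated parameters are reduced to a service URI and a payload type.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Channel:
    service_uri: str   # fixed once at channel setup
    payload_type: str  # fixed once at channel setup
    last_used: float


class ChannelPool:
    def __init__(self, idle_timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_timeout = idle_timeout
        self._clock = clock
        self.channels: Dict[str, Channel] = {}

    def get(self, service_uri: str, payload_type: str) -> Channel:
        """Open a channel on first use, then reuse it for later requests."""
        channel = self.channels.get(service_uri)
        if channel is None:
            channel = Channel(service_uri, payload_type, self._clock())
            self.channels[service_uri] = channel
        channel.last_used = self._clock()
        return channel

    def close_idle(self) -> List[str]:
        """Close channels unused for idle_timeout; return their URIs."""
        now = self._clock()
        idle = [uri for uri, ch in self.channels.items()
                if now - ch.last_used >= self.idle_timeout]
        for uri in idle:
            del self.channels[uri]
        return idle
```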
3.2.3 Buffering messages
An intermediary MAY keep a local copy of the message it has sent to
a remote callout server. This allows the callout server to avoid
always returning the entire message. The server could, for example,
return a status code indicating that it does not want to alter the
original message. Keeping a copy of the message at the intermediary
can significantly decrease the amount of data that has to be
transferred between intermediary and callout server. However, it
requires the intermediary to store and manage all messages it has
sent to the callout server. Thus, it introduces complexity in the
intermediary and increases its memory requirements.
To alleviate this problem, the intermediary could specify the amount
of data it is willing to buffer for one request. If this limit is
reached, the intermediary will stop the transmission of the request
and will wait for a response. Up to that point, the server is
allowed to respond at any time and assume that the intermediary has
kept the entire message. If the server is not able to determine a
response from the initial part of the request, then it MUST
explicitly request the transmission of the remaining part of the
request. The next response MUST assume that the intermediary does
not have a copy of the message.
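The buffer-limit rule can be sketched as two small functions. The names split_at_buffer_limit and resolve are hypothetical, and a reply body of None stands in for a "no change" status code; the sketch assumes the intermediary keeps the buffered part until the server either answers or asks for the rest.

```python
from typing import Optional


def split_at_buffer_limit(message: bytes, limit: int) -> tuple[bytes, bytes]:
    """Return (part sent and kept as a copy, part held back)."""
    return message[:limit], message[limit:]


def resolve(kept: bytes, held_back: bytes, rest_requested: bool,
            reply_body: Optional[bytes]) -> bytes:
    """Decide what the intermediary forwards on the content path."""
    if rest_requested:
        # The copy has been released, so the server must send everything.
        if reply_body is None:
            raise ValueError("server must return the entire message")
        return reply_body
    if reply_body is None:
        # "No change": rebuild the message from the local copy.
        return kept + held_back
    return reply_body


kept, held_back = split_at_buffer_limit(b"0123456789", limit=4)
```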
3.2.4 Preview
In some cases, the remote callout service can complete its operation
before it has received the entire message. For example, a virus
checking service can certify a large fraction of all files as
"clean" just by looking at the file type and the first 2K bytes.
Another example is a content filtering system that marks a web page
as containing "illegal content" as soon as certain words appear in
that page. In these cases, the remote callout server does not need
to receive the remaining part of the message and can instantly
respond with a certain status code. A callout protocol SHOULD
provide the possibility for a server to opt out of a transmission
early.
Basically, there are two design alternatives for the preview
functionality. In the first approach, the intermediary sends a
predefined portion of the request to the callout server, then stops and
waits for a response from the callout server. If the server returns
a positive response, the intermediary sends the remaining part of
the message. Otherwise it interrupts the transmission. This approach
is used by the ICAP protocol. In the second approach, the callout
server is allowed to respond to a request at any time. It MUST
indicate in this response whether the current transmission should be
completed or interrupted.
A prerequisite for the first approach is that the intermediary knows
the amount of data required by the server to decide on continuing or
interrupting a request. In these cases the intermediary can send
exactly this portion of a request and thus minimize the amount of
data that is exchanged. A drawback of this approach is that the
handshake between intermediary and callout server introduces an
additional delay into the processing of one request. The major
advantage of the second approach is that it lets the server decide
at which point the transmission is interrupted. This can be
exploited, for example, by services that make their decision on
continuing or interrupting dynamically during the processing of one
request. In these cases, the second approach is more efficient,
since it allows the server to opt out of the transmission as soon as
possible. Summing up, in the ideal case the first approach is used
if the size of the preview is known in advance and the second
approach is used otherwise.
If only one approach is to be supported by a callout protocol, the
penalty for not using the optimal approach must be considered. If
the second approach is always used, the intermediary continues
sending data after the decision point until it receives a response
from the server. If the response is to continue the transmission, no
bandwidth has been wasted and, in addition, no delay for the
handshake has been introduced. If the response is negative, the
intermediary has sent redundant data for the duration of one message
round trip. If the first approach is always used, the intermediary
must guess the size of the preview. If the chosen size is too large
and the server decides to bail out of a transmission, the penalty is
the data that is transmitted until the full preview size is reached.
If the guessed preview size is too small, the intermediary must
continue and send the entire message. Thus, the penalty is the part
of the message after the actual decision point. In conclusion, the
penalty of always using the first approach is typically higher than
the penalty of always using the second approach.
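The penalty comparison can be made concrete with a rough calculation of redundant bytes for a request the server wants to abandon at a given decision point. The function names and all figures (a 1,000,000-byte message, a decision point at 2,048 bytes, 10,000 bytes in flight per round trip) are illustrative assumptions, and the handshake delay of the first approach is not modeled.

```python
def penalty_fixed_preview(message_size: int, decision_point: int,
                          preview_size: int) -> int:
    """Redundant bytes under the first approach (predefined preview)."""
    if preview_size >= decision_point:
        # Preview too large: everything between decision point and
        # the end of the preview was sent unnecessarily.
        return min(preview_size, message_size) - decision_point
    # Preview too small: the entire message has to be sent.
    return message_size - decision_point


def penalty_respond_anytime(message_size: int, decision_point: int,
                            bytes_per_round_trip: int) -> int:
    """Redundant bytes under the second approach (server may reply anytime)."""
    return min(bytes_per_round_trip, message_size - decision_point)


too_large = penalty_fixed_preview(1_000_000, 2_048, preview_size=4_096)
too_small = penalty_fixed_preview(1_000_000, 2_048, preview_size=1_024)
anytime = penalty_respond_anytime(1_000_000, 2_048, 10_000)
```

With these assumed numbers, an oversized preview wastes 2,048 bytes, an undersized one wastes 997,952 bytes, and the respond-anytime approach wastes 10,000 bytes, which is why a wrong preview guess dominates the comparison.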
3.2.5 Partial content
Some remote callout services only modify small parts of the original
message. For example, a translation service typically inserts a
small icon into the original page, from which the translated page
can be reached. Another example is a service that forces all
cacheable data to expire at a certain time by modifying the HTTP
header fields. In these cases, returning the entire message from the
callout server back to the intermediary would not be very efficient.
Instead, a remote callout server could return only the modified
parts of a message and indicate the position at which each part is
to be inserted into the original message.
This is much like a partial content response of HTTP. It is
important to keep the burden on the intermediary as low as possible.
For this reason, the response SHOULD always indicate the offset of
the partial response in absolute byte numbers. Basically, this
approach trades an increase in complexity in the callout protocol
and the intermediary for a decrease in the amount of data that
has to be transmitted. Although the additional complexity seems to
be relatively low, the benefits depend heavily on whether the remote
callout services are able to utilize this feature.
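Applying a partial response at an absolute byte offset can be sketched as a single splice. The replaced_length parameter is an assumption added for this example (zero means pure insertion); the text above only calls for an absolute offset, and apply_partial is a hypothetical name.

```python
def apply_partial(original: bytes, offset: int, replaced_length: int,
                  new_part: bytes) -> bytes:
    """Splice new_part into original at an absolute byte offset."""
    if replaced_length < 0 or offset < 0:
        raise ValueError("offset and length must be non-negative")
    if offset + replaced_length > len(original):
        raise ValueError("range lies outside the original message")
    return original[:offset] + new_part + original[offset + replaced_length:]


# Insert a small element right after the opening tag (length 0 = insert).
page = apply_partial(b"<body>Hello</body>", 6, 0, b"<img>")
# Replace one byte of a header value.
header = apply_partial(b"Expires: A\r\n", 9, 1, b"B")
```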
3.2.6 Multiple services on the same message
A remote callout service provider might offer several callout
services. In this case, it might not be reasonable to make a
separate call for each remote service to be executed on the same
content-path message. Instead, it would be more efficient to
transfer the content-path message to the remote callout server once,
execute all services and return the entire response. The callout
server is responsible for dispatching the message in the correct
order to the different services and for aggregating the responses
into a single response message.
To invoke multiple services, an intermediary MUST be able to specify
more than one URL. The design alternatives are to set up one channel
for each combination of remote services or to use one channel to a
callout server and specify the desired URLs in each message.
The most challenging task is to dispatch the requests to multiple
services and to aggregate the responses of individual services. This
SHOULD be done by a dispatcher on the remote server. In doing so, the
following rules can be considered:
Caching: the response MUST contain the earliest expiration date.
Keeping copy: the remote callout server SHOULD propose the maximum
of the prefix sizes of individual services as the prefix size of the
compound service.
If a service requests the transmission of the entire message, the
server MUST return this request to the intermediary and forward the
remaining message to the service. This request frees the
intermediary from the burden of keeping a copy of the message. If
the server itself is not willing to buffer the message, it MUST call
all subsequent services with preview size zero. In any case, the
server MUST return an entire message to the intermediary.
Preview: if the response of a service indicates that no changes are
required, the service dispatcher SHOULD NOT opt out of the current
transmission of the request. Instead, it SHOULD forward the current
message to the next service. Only if all services indicate that no
changes are required and the message still has not been transmitted
completely MAY the service dispatcher interrupt this transmission
and return a "no changes required" response.
Partial content: the message dispatcher of the callout server MUST
insert the partial response it receives from each service into the
full message before sending it to the next service. If all services
have returned partial responses, it MAY decide to aggregate all
parts and return them as a partial response to the intermediary.
Otherwise it returns the response it got from the last service
called as an entire message.
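Three of the dispatcher rules above lend themselves to a short sketch: take the earliest expiration, propose the largest prefix size, and report "no changes required" only if every service does. Services are modeled as plain callables that return None for "no changes required"; that convention and all names are assumptions of this example, and partial responses are not modeled.

```python
from typing import Callable, Iterable, Optional, Sequence

Service = Callable[[bytes], Optional[bytes]]  # None = "no changes required"


def earliest_expiry(expiry_times: Iterable[float]) -> float:
    """Caching rule: the compound response expires with its first part."""
    return min(expiry_times)


def compound_prefix_size(prefix_sizes: Iterable[int]) -> int:
    """Keeping-copy rule: propose the largest individual prefix size."""
    return max(prefix_sizes)


def run_services(message: bytes, services: Sequence[Service]) -> Optional[bytes]:
    """Preview rule: pass the current message on, service after service."""
    current = message
    changed = False
    for service in services:
        result = service(current)
        if result is not None:
            current = result
            changed = True
    return current if changed else None
```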
4 Security Considerations
This document does not explicitly require a callout protocol to
encrypt the encapsulated content-path messages for transit by
default. In the absence of some other form of encryption at the link
or network layers, eavesdroppers may be able to record the
unencrypted transactions between the intermediary and the callout
server.
5 Acknowledgments
The authors would like to thank all active participants in the OPES
mailing list for their thought-provoking discussion. In particular,
we want to acknowledge major contributions from Andre Beck, who was
heavily involved in shaping this document.
6 References
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", RFC 2119, March 1997.
[2] Tomlinson, G., et al. "A Model for Open Pluggable Edge
Services", Work in Progress, Internet Draft draft-tomlinson-
opes-model-00.txt, July 2001.
7 Author's Addresses
Anca Dracinschi Sailer
Room 4F-531
Lucent Technologies
101 Crawfords Corner Rd.
Holmdel, NJ 07733
Phone: (732) 494-2259
Email: anca@bell-labs.com
Volker Hilt
Praktische Informatik IV
University of Mannheim
Phone: +49 621 181 2606
Email: hilt@informatik.uni-mannheim.de
Markus Hofmann
Room 4F-513
Lucent Technologies
101 Crawfords Corner Rd.
Holmdel, NJ 07733
Phone: (732) 332-5983
Email: hofmann@bell-labs.com
Rama R. Menon
Intel Corporation
M/S JF3-206
2111 NE 25th Ave.
Hillsboro, OR 97124
Phone: +1-503-712-1438
Email: rama.r.menon@intel.com
Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.