ietf-openproxy

Re: An opes usage question.

2004-03-03 11:08:16
Alex, thanks for the discussion. I'll take your suggestion to provide a more detailed example that we can discuss. I am considering the case of a large number of users accessing content and also needing adaptations for the proper delivery of that content. The adaptations may also be done in stages. The environment that I envision (summarized from our previous e-mails; please let me know if I have misunderstood) is:

1) The opes framework allows services to be distributed (or pipelined), with incremental services being added to each traffic flow at each stage. This is an opes proxy-to-proxy communication model.
2) A pool of multiple opes proxies can be provisioned at each stage to support a large number of flows.
3) Installing load balancers between stages to distribute the flows is ok in an opes framework (and this is a typical business scenario). If all the flow processing can be achieved in-line, then there is no need to identify any specific proxy in any pool. In this case we probably don't care which previous-stage opes proxy did the prior adaptation step.
4) The crux of the problem is how to share information between two stages of the flow. Sending metadata from one stage to a previous stage requires knowledge of specific server addresses (a more general case might be to send the metadata in either direction).

Let's assume, for a specific example, a network that allows access to premium content that is billable by the number of bytes delivered. The content flows to the users from a premium content server through a pool of billing gateway proxies and then proceeds to the next stage, where the content may be modified by other proxies for rendering on a user's device (color to black/white, compressed for efficiency, adding advertisements or banners, etc.). If any particular flow adaptation reduces the number of bytes, then billing needs to be adjusted. Now two specific proxies, one in each pool, need to exchange metadata about a specific flow to get the bill correct for that flow. The location of a load balancer between the two proxy pools hides the association for individual flows. I "think" this may be a different problem than the general peer discovery problem you brought up. In this case peer discovery refers to "flow participant discovery" in the previous stage and does not refer to locating useful servers. The need for discovering peer participants for a particular flow arises if any specific metadata exchange has to be made between participants, as in the billing record consolidation example.
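
To make the billing example concrete, here is a minimal sketch of the two stages with a load balancer in between. Everything in it is hypothetical: the class names, the modulo-based load balancer, and in particular the idea of tagging each forwarded message with the gateway's name are assumptions made for illustration, not part of any OPES draft. The sketch only shows why the second-stage proxy needs to know which first-stage proxy handled the flow before it can report an adjusted byte count.

```python
from dataclasses import dataclass, field


@dataclass
class BillingGateway:
    """First-stage proxy: bills each flow by the bytes it forwards."""
    name: str
    billed: dict = field(default_factory=dict)  # flow_id -> billable bytes

    def forward(self, flow_id: int, payload: bytes) -> dict:
        self.billed[flow_id] = len(payload)
        # Assumption: the gateway tags the flow with its own identity so a
        # later stage can find it again despite the load balancer.
        return {"flow_id": flow_id, "gateway": self.name, "payload": payload}

    def adjust(self, flow_id: int, delivered_bytes: int) -> None:
        self.billed[flow_id] = delivered_bytes


@dataclass
class AdaptationProxy:
    """Second-stage proxy: adapts content, possibly shrinking it."""
    name: str

    def adapt(self, message: dict) -> bytes:
        payload = message["payload"]
        return payload[: len(payload) // 2]  # stand-in for compression


def load_balance(flow_id: int, pool: list):
    """Toy load balancer: picks a pool member from the flow id."""
    return pool[flow_id % len(pool)]


def deliver(flow_id: int, payload: bytes, gateways: list, adapters: list) -> bytes:
    gateway = load_balance(flow_id, gateways)
    message = gateway.forward(flow_id, payload)
    adapter = load_balance(flow_id, adapters)
    adapted = adapter.adapt(message)
    # The adapter can only reach the right gateway because the message
    # names it; the load balancer alone gives no such association.
    by_name = {g.name: g for g in gateways}
    by_name[message["gateway"]].adjust(flow_id, len(adapted))
    return adapted
```

Without the "gateway" field in the message, the adaptation proxy would have no way to pick the right member of the billing pool, which is the association that the load balancer hides.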

I think the "flow participant discovery" problem I described is an opes-specific problem. So I am back to my initial hypothesis that the opes framework needs a "flow participant discovery" protocol between proxies that either isn't disturbed/affected by load balancing or uses load balancing to some advantage (assumes it is there and leverages it somehow). I have some thoughts on that, but first I would like to make sure the usage situation is clear. Your thoughts are greatly appreciated. Thanks again.
Regards  John


Alex Rousskov wrote:

On Fri, 27 Feb 2004, John G. Waclawsky wrote:

Alex, you had mentioned in a previous e-mail that the "framework is
not complete. It does not have ready-to-use tools for all possible
intermediary adaptations". Would this include a load balancing
environment?

Possibly, but it is too early to say. I do not know if any
opes-specific tools will be needed in a load balancing environment.
For example, as far as I can see, to load balance callout server
adaptations, no new tools are needed and OCP can be used as is.

Specifically, load balancing is not the problem that I currently
see. I believe I can make the load balancers work with what I
anticipate doing within an opes framework.

Looks like we are in agreement here.

But, I am looking for a general opes solution that allows peer
discovery so I can perform adaptations (pipelined over load
balancers, if you will) on a number of flows, web pages, file
downloads,...etc.

By "peer discovery", do you mean discovering callout servers that
provide OPES services that the application proxy needs? If yes, then
this discovery is out of OCP and possibly even OPES scope. It can be
implemented using existing or new protocols on top of OPES protocols.

The closest thing to peer discovery supported in OCP is querying an
OCP peer for a list of supported services or negotiating service
support. Both can only be done if we have established a connection to
the peer, which is what peer discovery should help us with.

I consider load balancing as an essential part of a "typical" high
capacity services situation. My thoughts are that peer discovery is
necessary as an option for a load balanced opes environment.

I see no direct relationship between load balancing and peer
discovery. I define load balancing as spreading the load among a known
set of servers. I define peer discovery as locating potentially useful
servers. The two can be combined (discover what servers we can use for
load balancing), but are independent.
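
The distinction between the two definitions above can be sketched as two separate functions. This is only an illustration: the registry dictionary, the addresses, and the function names are all made up, and a static lookup stands in for whatever discovery mechanism would actually be used.

```python
import itertools


def discover_peers(registry: dict, service: str) -> list:
    """Peer discovery: locate servers that might be useful for a service."""
    return sorted(addr for addr, services in registry.items() if service in services)


def make_balancer(servers: list):
    """Load balancing: spread requests over an already known set of servers."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)
```

The balancer works on any known list of servers, whether it came from discovery or from static configuration, which is the sense in which the two can be combined but remain independent.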

Are you saying in your reply that OCP can also be used for peer
discovery?

No.

If not, shouldn't the opes framework be extended to satisfy a peer
discovery use case over a load balancer?

Only if there is something really OPES-specific in this peer discovery
problem. As you know, peer discovery in general is an old problem with
a few semi-working solutions and no good ones. What OPES-specific peer
discovery features would you like the framework extension to support?
And, again, I am not sure how you connect peer discovery and load
balancing.

I was thinking another protocol between peers might be needed (or we
can use an existing protocol).

Hopefully, that other protocol already exists and can be used as-is.
Can you give a very specific example: who needs to discover what and
how is that related to load balancing?

Thanks,

Alex.



Alex Rousskov wrote:

John,

        OPES protocols do not have features specific to a
load-balancing environment, and I do not think they have any features
that do not work in such an environment. If your load balancer can
balance opaque TCP sessions, then OPES callout protocol should work
just fine with multiple callout servers. If you are proxying native
HTTP without outsourcing services via OCP, then load balancers can
balance HTTP "sessions" as well.

        OCP is easier to load balance than HTTP because it does not
have valuable state outside of the OCP connection. While HTTP claims
to be stateless, things built on top of HTTP often require
client/server affinity. Hopefully, OCP will not be abused in such a
way because it has mechanisms to maintain necessary state within the
connection.

HTH,

Alex.

On Fri, 27 Feb 2004, John G. Waclawsky wrote:



I have another question. Does an opes environment handle the
requirement of a peer interacting with another peer with a load
balancer between them? The situation is that I require two proxies
to provide a service.  An "A" proxy and a "B" proxy that are peers
in providing the service to a data flow. Because of the scale, the
likely deployment will involve multiple boxes of types A and B. This
means I will have two groups of boxes with a load balancer between
them. I wish to dynamically associate a specific box of type A with
a specific box of type B (across the load balancer) and a particular
flow. Is it possible to perform peer discovery and the dynamic
association between the two peers and a flow within the opes
framework? Thanks for any help and guidance.

Regards  John

John G. Waclawsky wrote:



Alex, I very much appreciate your help and the additional
information. I need a little time to digest this and think some more
about my problem and opes "needs"....
Thanks.   Regards  John

Alex Rousskov wrote:



On Thu, 8 Jan 2004, John G. Waclawsky wrote:





Alex, thank you for your reply and the pointer. I agree with you
that the use case I am interested in is not explicitly documented in
the opes material. This gives the wrong impression, since all the
examples show the data flows returning to the first hop data
dispatcher. To restrict data flow in this way would be inefficient
for high capacity applications and would tend to make the first hop
a bottleneck.




John,

        You are misreading OPES architecture diagrams because you are
thinking about a much "simpler" case than most diagrams attempt to
document and, hence, you assign wrong roles to "boxes" that may not
even exist in your environment. This is the architecture draft's fault,
not yours. Let me try to explain.

Here is a "classic" OPES proxying case (horizontal lines are
application data/protocols being proxied, proxies may be adapting data
internally):

 Figure A.

     end -- proxy -- proxy -- ... -- end

Here is a "classic" OPES callout case (vertical lines are callout
data/protocols, proxy is not adapting data, callout servers may be
adapting data):

 Figure B.

     end --- proxy --- end
      _________|__________
      |        |         |
      |        |         |
   callout   callout   callout
   server    server    server

A combination of the above is possible and common, of course. In a
combined case, some proxies are adapting data internally and some use
callout servers to adapt.

In your particular case (the adapted data flows to the next hop), you
do not have callout servers. You have proxies that adapt data
internally. However, your case may not be a classic proxying case
because you may want proxies to use a protocol that differs from the
original application protocol (so that you can ship metadata and
perhaps pipeline more efficiently). I will use curly (~) lines to show
that new protocol below. You may also want to do some load balancing:

 Figure C.

                 ~~~~~ proxy1 ---
                /
   end -- proxy ~~~~~~ proxy2 ---  ??  end
                \
                 ~~~~~ proxyN ---


I do not know how the proxies will get application data to the right
"end", but it is not important for this discussion.

Note that the curly path may use a completely new protocol (e.g., OCP)
or can use a combination of the original application protocol to
deliver data and some other protocol for metadata (billing, etc.)
records.

The above is OPES, but not a callout case. There are no callout
servers, just proxies. Whether you want to use OCP for the curly path
depends on what kind of data/metadata you want to exchange and what
other protocols you have available.





To be a little more specific on my use case and to make sure I
understand your response let me add more detail. Fundamentally, I wish
to have an "open" IETF network environment. The second requirement is
speed in adaptation processing, so I am thinking that callout will be
faster than using a proxy (maybe this is an incorrect assumption).




It's an incorrect question :-). Adaptations at the callout server can
be as "fast" or as "slow" as adaptations performed internally at the
proxy. The primary reason to use a callout server is to "outsource"
(from business, development, support, logistics, legal, etc. points of
view) adaptations. The primary reasons not to use a callout server are
security and overheads of the data having to return back to the proxy.





I may be a little confused about the distinction between classic
proxy and classic call out in this situation (the word "proxy" isn't
even mentioned in the opes architecture document as a
consideration).




I hope the above diagrams clarify the distinction. OPES processor is
the term closest to the "proxy", I think.





Also, I have a requirement to adapt a large volume of content for
numerous devices.




Both callout and builtin adaptations can handle large volumes of
content. There are many factors at play here, from legal concerns to
the kind of adaptation being performed.





I expect that the volume of data will require a number of adaptation
(call out) processors. I estimate 10 of them and therefore I would
need load balancing as part of the solution for directing data to
the call out servers (doing the adaptation).




Load balancing can be done with both callout and proxying schemes.
OPES mechanisms are usually per-message so any load balancing method
that does not split individual application messages should work just
fine.
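
As a rough sketch of what "does not split individual application messages" could mean in practice, the following balancer assigns whole messages to servers in round-robin order and keeps every chunk of a message on the server chosen for its first chunk. The class, the message identifiers, and the chunking model are assumptions for illustration only; they are not taken from OCP or any OPES document.

```python
class PerMessageBalancer:
    """Round-robin balancer that keeps every chunk of a message on one server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.next_index = 0
        self.assigned = {}  # message_id -> server

    def route(self, message_id, chunk):
        # The first chunk of a message picks the next server in rotation;
        # later chunks of the same message reuse that choice.
        if message_id not in self.assigned:
            self.assigned[message_id] = self.servers[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.servers)
        return self.assigned[message_id]

    def finish(self, message_id):
        # Forget the assignment once the message is complete.
        self.assigned.pop(message_id, None)
```

Different messages can land on different servers, but no single message is ever divided between them.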





I am looking at Figure 3 in the opes architecture draft and
considering a load balanced collection of ten call out servers. As I
had mentioned previously, I'd like to have a parallel pipeline
approach where the adapted data from each of the opes call out
servers is simply forwarded on, directly to the producer or data
consumer (whichever the case may be).




The latter makes the adapter a proxy, not a callout server (by
definition). See Figure C above.





In addition each of the opes call out servers would then send
billing and trace data back to the load balancer (or another network
location). Is it realistic to expect that this could be
accomplished within the opes framework?




It is. The only question is what protocol(s) you will use between your
proxies (the curly lines in Figure C).





In fact, a more fundamental question is whether the opes framework is
the best way to solve this problem and maintain an open system. I think
this is what opes is all about.




IMO, OPES framework accommodates, in principle, any intermediary
adaptation. Thus, your problem is within the framework. However, the
framework is not complete. It does not have ready-to-use tools for all
possible intermediary adaptations. I hope that you will find OCP Core
and OPES communications "tools" useful for your problems, but I lack
information to tell you whether they are the best.

The system is "open" if it uses "open standards". If current OPES
tools are not right for you, and there are no better open tools, then
we can work together, within the OPES Framework, on defining open
standards (protocols and interfaces) that will solve your specific
problem.

Alex.






