ietf-openproxy

Re: publish WCIP version 01

2001-03-02 13:55:01
Hi, Fred, this is excellent. Attached is the updated doc, also linked at http://content-signaling.org/ so people can always see the up-to-date one. (yet another invalidation problem ;-)

Comments (revision log) inline.

At 11:14 AM 3/2/01 -0500, Fred Douglis wrote:
I have a problem with the term "dynamic data".  To me, dynamic data is
something that changes all the time, such as a stock quote, and is
inherently uncachable.  "Frequently-changing" data is more
appropriate, as long as the rate of access dominates the rate of
change -- the observation from the DOCP work among others.

So, I would search and destroy references to "caching dynamic data"
and make it clear that you mean "frequently-changing" or maybe
"semi-dynamic" data.  Alternatively, one might redefine "dynamic data"
to be absolutely clear about this distinction, something I've tried to
do below, both when the term is introduced and in the Defs section.

I know. "dynamic content/data" is a fuzzy term. Outside of the scope of WCIP, I view this term as referring to:

1. frequently changing content (and yet popular)
2. personalized content
3. dynamically computed content (computed based on frequently changing data and/or personal info).

WCIP addresses #1, with the longer-term direction that if the proxy computes dynamic content, WCIP can be used to keep the underlying data up-to-date.

Anyway, back to the scope of the draft: we should probably keep things simple and non-controversial. So, I have replaced this term with "frequently changing content".

You changed reliable multicast to IP multicast and claimed that
reliable delivery wasn't necessary because of the volume IDs, but in
3.3 it still says message delivery MUST be reliable.  Should that be
changed?

changed the bullet to "Delivery: delivery SHOULD be real-time in that the average latency should be comparable to the network round-trip time from the sender to the receiver. It's RECOMMENDED that the delivery be reliable, full duplex, and in sequence (wrt. the sender) to achieve good performance, although it's not required."

Related work seems incomplete. I know not everything need be included, but for
example, it's incestuous to include [4] and [5] but not earlier work on using
volumes for invalidation -- my own incestuous suggestion there is:

done.

@InProceedings{cohen98,
  author =       "Edith Cohen and Balachander Krishnamurthy and Jennifer
                 Rexford",
  title =        "Improving End-to-End Performance of the {W}eb Using
                 Server Volumes and Proxy Filters",
  booktitle =    "Proceedings of the ACM SIGCOMM conference",
  year =         "1998",
  month =        sep,
  pages =        "241--253",
  note =         "\url{http://www.research.att.com/~bala/
                 papers/sigcomm98.ps.gz}",
}

I also think the draft uses the first person too much -- lots of
"we's" in there, or "let's lay out", or ...

fixed.

    Abstract

done.

    Table of Content

    1. Introduction

got rid of dynamic content. Only use "frequently changing content" and "dynamically computed content".

       Dynamic content is quickly becoming a significant percentage of the

strike "the" at end

sorry, what do you mean? delete "the"?

TTL

fixed

       strong cache consistency, and yet "poll every time" is costly. So

cite Gwertzman & Seltzer here?

done.

       the content provider usually sets a very short expiration time or

a content provider

done.

 s/the/a/g

tried to nail down as many of these as I can.

The two modes are merely the two extremes of a continuum,
characterized by how soon the server proactively sends
updates/heartbeats and how soon the proxy revalidates the volume.  The
sooner the revalidation, the quicker the objects are invalidated; this
results in better consistency but also more load on the server and
proxy.  Regardless of the mode, the same messages are exchanged
between the invalidation server and the caching proxies, whose format
is defined by an "ObjectVolume" XML DTD [forward ref]. Each round of
message exchange, whether initiated by the server or the client, is a
process of "volume synchronization" and results in an up-to-date view
of the object volume. Based on the up-to-date view, the proxy can
provide freshness guarantees to all the objects in the volume.

done. thanks!

WCIP-related

fixed.

Dynamic Content

Web resources that change "frequently," where the definition of
"frequent" depends on the access rate and desired consistency
guarantees.  [Or something to this effect...]

deleted references to dynamic content. Only use "frequently changing content" and "dynamically computed content".

Strike "besides"

deleted "besides".

capitalize "invalidation"

fixed.

       Revalidation Interval

            A property of the client-driven mode. The invalidation client
       initiates volume synchronization with the invalidation server, when
       the "last synchronization time" was "revalidation interval" ago. The
       interval SHOULD be smaller than the freshness guarantees of all the
       objects in the object volume, to avoid unnecessary cache misses.

smaller, or no greater than?

If network propagation delay is 0, it's "no greater than". Otherwise, the proxy can use some spare room.
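
A rough sketch of the rule discussed above, in Python: the revalidation interval is bounded by the smallest freshness guarantee in the volume, less an allowance for delay. With zero delay the bound is "no greater than"; with any delay it becomes strictly smaller. The function names, the `delay` parameter, and the use of seconds are assumptions for illustration and do not come from the draft.

```python
def max_revalidation_interval(freshness_guarantees, delay=0.0):
    """Upper bound (seconds) on the revalidation interval for a volume.

    freshness_guarantees: per-object freshness guarantees, in seconds.
    delay: assumed allowance for network and server delay, in seconds.
    """
    if not freshness_guarantees:
        raise ValueError("volume has no objects")
    bound = min(freshness_guarantees) - delay
    if bound <= 0:
        raise ValueError("delay consumes the whole freshness guarantee")
    return bound


def needs_sync(now, last_sync_time, revalidation_interval):
    """True once the last synchronization was at least one interval ago."""
    return now - last_sync_time >= revalidation_interval
```

For example, a volume with guarantees of 60, 30, and 120 seconds and an assumed 5-second delay allowance gives a bound of 25 seconds.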

could -> can
in a timely fashion
been able

fixed them.

the object right away as HTTP revalidation could result in an
indication that the object is "Not Modified".

done.

       The invalidation server picks the heartbeat interval while the
       invalidation client picks the revalidation interval. Both of them
       SHOULD be smaller than any of the freshness guarantees of the
no larger?

better smaller to leave room for server and network delay.

"one can"
for the DTD
An invalidation
start to send

fixed.

       (4)  Reliability: message delivery MUST be reliable, full duplex,
            and in sequence (wrt. the sender). Moreover, delivery SHOULD be
            real-time in that the average latency should be comparable to
            the network round-trip time from the sender to the
            receiver.

Still true given the multicast change?

Nope, changed to "Delivery: delivery SHOULD be real-time in that the average latency should be comparable to the network round-trip time from the sender to the receiver. It's RECOMMENDED that the delivery be reliable, full duplex, and in sequence (wrt. the sender) to achieve good performance, although it's not required".

later in

fixed.

       The channel relay point may have multiple clients subscribed to the
       same invalidation channel. It in turn only subscribes once to the
       original invalidation server. By multiplicatively relaying channel

multiplicatively?

Why not "hierarchically"?

It refers to the multicast-ish nature of the relay, which takes in one stream and sends out multiple. I'll change it. It's the second time someone complained about it. ;-(
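
To make the "one stream in, multiple out" behavior concrete, here is a minimal Python sketch of a relay point. It subscribes upstream once per channel and copies each incoming message to every local client. The class name, the `subscribe_upstream` hook, and the callback-based interface are all hypothetical; the draft does not define such an API.

```python
class ChannelRelay:
    """Relay point: one upstream subscription, many downstream clients."""

    def __init__(self, subscribe_upstream):
        # subscribe_upstream(channel, callback) is a hypothetical hook that
        # registers the relay with the original invalidation server.
        self._subscribe_upstream = subscribe_upstream
        self._clients = {}  # channel -> list of client callbacks

    def subscribe(self, channel, client_callback):
        if channel not in self._clients:
            self._clients[channel] = []
            # First local subscriber: subscribe upstream exactly once.
            self._subscribe_upstream(
                channel, lambda msg: self._fan_out(channel, msg))
        self._clients[channel].append(client_callback)

    def _fan_out(self, channel, message):
        # One incoming message is copied to every subscribed client.
        for callback in list(self._clients[channel]):
            callback(message)
```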

helps to scale
vice-versa
constructs
cite delta-encoding
an addition
of the event.
polling frequency
There is software [cite]

fixed.

       some cases, an event described above may invalidate multiple URLs.
       If the participating caching proxies are able to interpret such
       events, the invalidation message may carry the description of the
       event, instead of the list of invalidated URLs. This may be future
       work.

This paragraph made no sense to me.  I think what you mean is that
WCIP can be used to tell systems like AIDE (my own), URL-minder,
etc. about changes, and I fully agree -- and probably suggested this
in the first place.  But the second sentence about invalidation is a
non-sequitur, and I don't understand the next sentence.  How about:

If a database event triggers the invalidation of hundreds of objects, instead of listing all those objects and sending over to proxies, the server may just describe the event itself to the proxies, provided that the proxies know how to interpret the event and figure out the hundred objects on their own.

Does this clear things up? Maybe the above text should be put in as an example.
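
As an illustration only, the proxy side of this idea might look like the Python sketch below: the server sends an event name, and the proxy works out the affected objects from rules it already holds. The rule format (event name mapped to URL glob patterns), the event name, and the URLs are invented for the example; nothing of the kind is specified in the draft.

```python
import fnmatch


def urls_invalidated_by(event, cached_urls, rules):
    """Return the cached URLs that a named event invalidates.

    rules: dict mapping an event name to a list of URL glob patterns
           (a hypothetical format the proxy is assumed to understand).
    """
    patterns = rules.get(event, [])
    return [url for url in cached_urls
            if any(fnmatch.fnmatchcase(url, p) for p in patterns)]
```

An event the proxy has no rule for invalidates nothing here; a real design would need a fallback, such as the server sending the explicit URL list.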

There is software providing user-level notification of changes to web
content [cite].  WCIP could potentially be used to permit agents to
subscribe to change notification, not for the purpose of cache
invalidation, but to notify users.  Integrating such functionality may
be future work.

added. also, "E.g., a web crawler could subscribe to WCIP channels instead of crawling web sites periodically for object updates."

    4.3 Discover Channels

    ...
       Example:

            Invalidated-By: wcip://www.cdn.com:777/allpolitics?proto=http

This used to be "cnn.com" and got changed to "cdn.com" yet later
references say cnn, and "allpolitics" seems specific to CNN.  Are you
sure about this change?

Nope, I don't know how it got changed. Changed it back.
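
For reference, a small Python sketch of how a proxy might pick apart the header value from the example (with the host restored to cnn.com). Reading the path as the channel name and `proto` as an ordinary query parameter is an assumption based on the example alone; the function name and the returned field names are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def parse_invalidated_by(value):
    """Split an Invalidated-By value into host, port, channel, and proto."""
    parts = urlsplit(value.strip())
    if parts.scheme != "wcip":
        raise ValueError("not a wcip URL: %r" % value)
    return {
        "host": parts.hostname,
        "port": parts.port,
        # Assumed: the path names the invalidation channel.
        "channel": parts.path.lstrip("/"),
        # Assumed: proto is a plain query parameter; None if absent.
        "proto": parse_qs(parts.query).get("proto", [None])[0],
    }
```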

meantime
the latest
and compares

fixed.

       However, if the volume indeed has changed, the invalidation server
       MUST send back an ObjectVolume description with a base equal to or
       smaller than 7. Here is an example:

I'm not that into IETF lingo, but I thought that a specific example
such as this wouldn't justify MUST rather than simply "must".
Thoughts?

does this help: "However, if the volume indeed has changed, the invalidation server sends back the journal of changes since version 7. The reply MUST have a base version equal to or smaller than the version in the synchronization request."
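
A possible reading of that wording, sketched in Python: the server answers a synchronization request with the journal entries newer than some base version, and the base never exceeds the version the client sent. The journal layout (a list of `(version, url)` pairs) and all names are assumptions for illustration; the real message format is the ObjectVolume DTD. This sketch always picks a base equal to the client's version, although a smaller one would also satisfy the MUST.

```python
def synchronize(journal, current_version, client_version):
    """Answer a synchronization request.

    journal: list of (version, url) pairs, oldest first, where `version`
             is the volume version produced by that change.
    Returns (base, changes): `changes` holds every journal entry newer than
    `base`, and `base` never exceeds the client's version.
    """
    if client_version >= current_version:
        return current_version, []  # nothing changed since the request
    # Oldest base version the retained journal can still serve.
    oldest_kept = journal[0][0] - 1 if journal else current_version
    if oldest_kept > client_version:
        raise LookupError("journal no longer covers the client's version")
    base = client_version
    changes = [(v, url) for v, url in journal if v > base]
    return base, changes
```

With a client at version 7 and a journal holding the changes for versions 8 and 9, the reply has base 7 and both entries.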

too colloquial -- there are
skew
a URI
a filename
Why not just say SSL; isn't HTTPS redundant?

fixed.

       9  Mogul, J.C.; Douglis, F.; Feldmann, A.; Krishnamurthy, B.,
          "Potential benefits of delta encoding and data compression for
          HTTP", ACM SIGCOMM 97 Conference.
       10  Mogul, J.C.; Douglis, F.; Feldmann, A.; Krishnamurthy, B.,
          "Potential benefits of delta encoding and data compression for
          HTTP", ACM SIGCOMM 97 Conference.

Notice anything odd here?

nice catch. this is the result of last-minute changes.

    Full Copyright Statement

Truncated?

Whoops, fixed.

Thanks!
Dan
