ietf-openproxy

Re: comments on draft-beck-opes-irml-02

2002-03-07 14:16:03

Hi Mark,

see my comments inline.

Mark Nottingham wrote:
* Abstract / Section 2 - IRML is introduced as being targeted at
Web Services. Although it's fairly new, WS is widely held to be the
set of technologies being standardised in the W3C (e.g., SOAP,
WSDL, etc.). Are IRML's Web Services the same ones?

SOAP is architected so that all directives to intermediaries are
contained in the message (using Header blocks). While it's possible
to control the behaviour of intermediaries using out-of-band
techniques (like rulesets), this is generally unexplored territory
for SOAP people.

I'd suggest rewriting the abstract & introduction to de-emphasise
WS; really, they're just an application that can be used with OPES,
rather than the driving factor (yes, WS is about distributed
applications, etc., but it's a term that's used in a very vague
sense, and mixing it with OPES at this level only muddies the
waters). If WS is important to OPES, it would be good to
(somewhere) reference the WS work, and to explain the relationship
between OPES rulesets and SOAP Headers (I can help come up with
some prose here if it would help).

To my knowledge, the relationship between Web Services and OPES has not
really been discussed yet on this list, but I think it would be worthwhile
to explore this further. The point we wanted to make in the IRML
introduction was rather that IRML rule sets may also be useful in
non-OPES environments, but I agree that this section probably needs a
rewrite.

* Section 3 - Is there any reason a public identifier was chosen
vs. a system identifier? Also, if namespaces are used, they should
always be used, whether the document is a whole ruleset or a
fragment.

Agreed. As for the document identifier: Are there any guidelines as to
when to use a public identifier vs. only a system identifier?

* Section 3.1 - 'The conditions within a rule refer to message
properties in the request or response message of a given content
transaction.' -> 'The conditions of a rule can refer to the
properties of messages passing through an OPES intermediary.' (so
it's not tied to request/response).

Good idea. The entire document is based on the assumption that a message
transaction consists of a request and a response message, but we
probably shouldn't make that assumption because other protocols may have
different message exchange patterns, e.g. in the case of SIP you can
have a transaction with one request and multiple responses.

Is the authorized-by element intended to capture the authorisation
state of the ruleset (e.g., the rule processor appends it to the
ruleset upon discovering and verifying the identity and rights of
the submitting process), or is it a placeholder for a mechanism
like XML Digital Signatures?

It's really just a mechanism to allow for the delegation of 
authority, e.g. to allow end users to authorize their service
providers to set up IRML rule sets on their behalf which would then be
reflected in the 'authorized-by' element. IRML does not specify how the
authorization state of a rule set is enforced/verified, but this would
probably be a work item for OPES.

* Section 3.4.3 - perhaps it would be better to use a URI instead
of an acronym? Also, is it always the case that a group of rules
will apply to exactly one protocol? I'd argue that it would be
better to associate a protocol with an individual rule.

Yep, we had that discussion already a while ago. Another proposal would
be to stick rule elements inside a 'protocol' element and allow multiple
'protocol' elements per rule set. That way we could group rules by
protocol. 
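
A minimal sketch of that grouping, assuming a hypothetical 'protocol'
element with an 'id' attribute (neither is defined in the current draft).
The second protocol URI and the rule bodies are placeholders:

```xml
<!-- Hypothetical layout: one 'protocol' element per protocol, -->
<!-- each wrapping the rules that apply to that protocol.      -->
<ruleset>
  <protocol id="http://ietf-opes.org/protocol/rfc/2616">
    <rule processing-point="1">
      <!-- conditions and actions for HTTP messages go here -->
    </rule>
  </protocol>
  <protocol id="http://example.org/protocol/some-other-protocol">
    <rule processing-point="1">
      <!-- conditions and actions for the second protocol go here -->
    </rule>
  </protocol>
</ruleset>
```

This keeps the protocol out of each individual rule while still allowing
a single rule set to cover more than one protocol.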

* Section 3.5.1 - processing points - this seems unnecessarily
restrictive; 1|2|3|4 is based on a request/response pattern, and
even then it's very possible that more processing points will be
desirable.

Instead, why not, after identifying the protocol by a URI, identify
the protocol-scoped message as a URI as well? Then, you can
identify message-specific processing points with (drum roll) a URI.

For example:

<rule protocol="http://ietf-opes.org/protocol/rfc/2616"
      message="http://ietf-opes.org/protocol/rfc/2616/Request"
      processing-point="http://ietf-opes.org/protocol/rfc/2616/Request/Client">


This doesn't really help to make the rules more compact and readable
:-), but I can see some benefit in it. Other opinions?

Would there be any case where specifying a port is necessary?

* Section 3.6.1 - just curious, why 'property' instead of
'condition' (which is what it's referred to as throughout the rest
of the text)?

Conceptually it's a rule condition, but the element and its attributes
really refer to a (message|system|service) property, for example the
'name' attribute specifies the name of a property, not the name of the
condition.

re: context; same arguments as above re: URI. Rather than making
things like request-URI, response status, protocol version, etc.
protocol-specific system variables, why not make them
message-specific context? For that matter, I wonder if it would be
good to make everything matchable (including protocol, etc.); this
seems better than building in privileged conditions, considering
that OPES is supposed to be a generic framework.

re: matches; regex is awfully expensive (and encourages people to
cut corners in parsing). Could some room for extensibility be left
here, so that other means of matching can be specified? I suppose
that regex could be the default, and other means of matching could
be specified by an attribute (there needs to be room for
expressions here anyway; it would be good to be able to do
"response.headers.age < 30", for example).

This has been proposed before. The issue here really is how much
complexity we can afford to add to an OPES rule language. It's basically
a tradeoff between the performance impact on the in-path OPES
intermediary and the goal of minimizing the number of unnecessary
callouts/invocations of OPES services.
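
For reference, the extensible matching proposed above might look roughly
like this. The 'match-type' and 'value' attributes are hypothetical and
only illustrate the idea; the property names and the pattern are
placeholders:

```xml
<rule processing-point="1">
  <!-- Default behaviour: regular-expression match on a property -->
  <property name="User-Agent" matches=".*Mozilla.*"/>
  <!-- Hypothetical alternative: a cheaper numeric comparison,   -->
  <!-- selected by an attribute instead of a regular expression  -->
  <property name="Age" match-type="less-than" value="30"/>
</rule>
```

Each additional match type is one more thing the in-path intermediary
has to evaluate per message, which is where the tradeoff comes in.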
 
Match seems to have the beginnings of boolean logic embedded; I see
AND (multiple matches specified) and NOT (not-matches) and even OR
(just flattened out to multiple rules). IMHO It would be good if
this were explicit; e.g., <and> ... </and> <or> ... </or> <not> ...
</not> wrappers around matches.

We tried that when we first came up with IRML, but it gets really messy
if you have a lot of conditions.
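
To illustrate, a sketch of a single rule with explicit wrappers, assuming
hypothetical 'and', 'or' and 'not' elements around 'property' conditions.
The property names and patterns are placeholders:

```xml
<rule processing-point="1">
  <and>
    <property name="Host" matches="example\.com"/>
    <or>
      <property name="User-Agent" matches=".*Mozilla.*"/>
      <not>
        <property name="Accept-Language" matches="en"/>
      </not>
    </or>
  </and>
</rule>
```

With only three conditions the innermost one already sits inside three
levels of wrapper elements.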

* Section 3.7.4 - This looks good, but the syntax is a bit
cluttered; there's the potential for a lot of rules, so we want
them to be as compact and readable as possible.

I'd suggest
   - identifying 'any' as a unique URI (e.g.,
     http://ietf-opes.org/irml/service/any) so that it can be
     collapsed into 'uri'
   - making 'uri' an attribute of 'service' and using a more descriptive
     name (perhaps 'type', moving the current semantics of type to
     something more descriptive like 'fail_target')
   - if parameters are desired, they can be unqualified child
     elements of service
   - if dynamic values are desired in parameters, call them out with
     a mechanism like XSLT's value-of

This gives you:
  <irml:service name="Foo" type="http://..." fail="..."/>
or, if you want parameters,
  <irml:service name="Bar" type="http://..." fail="...">
    <param1>a</param1>
    <param2><irml:value-of var="whatever"/></param2>
  </irml:service>
which is much easier to read if there are multiple services or
parameters.

Good suggestion. 

An issue regarding specification of failure is idempotency; there
may be situations where failing over to another service may cause a
request to be repeated. In HTTP, idempotency is theoretically easy
to determine (except for cases where people use GET with side
effects, but that IMHO isn't our problem). Other protocols may not
make it so easy.

Agreed. But I don't see this as a problem that needs to be addressed
by IRML, but rather by the general OPES architecture.
 
I don't see anything about differentiating between local and remote
services. How does that happen (or does it in the rule language)?

IMO this is out of scope for IRML. IRML rules identify a service, but
it's up to the intermediary to discover, choose and execute an instance
of the identified service.

-Andre