Re: Performance Question: Expensive Functions in Predicates

G. Ken Holman wrote:

I have an external entity with each identified applicability using ID:

<appls>
  <appl id="this" appl="yes"/>
  <appl id="that" appl="no"/>
  <appl id="other" appl="yes"/>
</appls>

that I pull into my instance.

I'm presuming that you have different entities for different sets of conditions so that, for example, when you're processing for one target some things are "yes" and when you're processing for a different target those same things may be "no". If so, that implies that you have to change the actual file the external entity resolves to based on the presentation target. How do you do that?

For me this approach would not generalize well because the applicability of a given element may be determined by the processing-time value of a potentially unbounded set of conditions, including things like target language, target country, operating platform, page layout (for print-specific renditions), rendition type (HTML, PDF, Web PDF, etc.), marketing region, and product family (for content re-used across products or product families within a large corporation).

Another issue is that in my use cases the author knows what individual condition values a given element applies to (this is for print and Mac OS X help) but not whether or not a given element will be applicable *when processed*, because in fact the set of possible conditions cannot be known at authoring time and because the business rules for determining applicability might change even though the document content has not changed.

Therefore applicability must be determined dynamically at processing time in my case.

However, having said all that, there is still a place for this type of indirect applicability specification, namely, defining sets of applicability values that can then be used by reference. This eliminates the problem of having an ever-increasing and constantly-changing set of applicability attributes on elements.


For example, given your markup above, I might refine it to something like:

 <appls>
   <appl id="option-set-01">
     <lang>en-CA</lang>
     <presentationTarget>print</presentationTarget>
     <platform>Mac</platform>
   </appl>
   <appl id="option-set-02">
     <lang>en-UK</lang>
     <presentationTarget>print</presentationTarget>
     <platform>Windows</platform>
   </appl>
 </appls>

An authoring system can present either a list of option sets or the author can simply select the options and values and then the system either finds an existing set that matches or synthesizes one if it doesn't exist.
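The find-or-synthesize step might be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class name OptionSetRegistry, the method findOrCreate, and the "option-set-NN" numbering scheme are all hypothetical, and an option set is modeled simply as a map from option name to value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory registry of option sets keyed by their
// option/value pairs. Names and the id format are assumptions for illustration.
final class OptionSetRegistry {

    // Maps each distinct combination of option values to the id assigned to it.
    private final Map<Map<String, String>, String> idsByOptions = new LinkedHashMap<>();

    // Returns the id of an existing set with exactly these options,
    // or registers a new set and returns its freshly synthesized id.
    String findOrCreate(Map<String, String> options) {
        // Private sorted copy, so later changes to the caller's map cannot
        // disturb the registry; Map equality ignores insertion order anyway.
        Map<String, String> key = new TreeMap<>(options);
        String existing = idsByOptions.get(key);
        if (existing != null) {
            return existing;
        }
        String id = String.format("option-set-%02d", idsByOptions.size() + 1);
        idsByOptions.put(key, id);
        return id;
    }
}
```

With this sketch, asking twice for the same combination of lang, presentationTarget, and platform yields the same id, while a new combination gets the next id in sequence. A real authoring system would presumably persist the sets in the shared applicability document rather than in memory.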

The instance could still use a link to associate elements to their applicability values:


<thing1 appl_doc="my_applicability_sets.xml"
        appl_xpointer="xpointer(//appl[@id = 'option-set-01'])">...

<thing2 appl_doc="my_applicability_sets.xml"
        appl_xpointer="xpointer(//appl[@id = 'option-set-02'])">...

But the actual applicability would still be determined dynamically at run time. For example, my is_applicable() utility function might look something like this:


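<func:function name="util:is_applicable">
  <!-- Node whose applicability is being tested; defaults to the context node -->
  <xsl:param name="current_node" select="."/>
  <!-- Resolve the link attributes on that node to the referenced <appl> set -->
  <xsl:variable name="condition_set"
       select="util:resolve_xpointer($current_node/@appl_doc,
                                     $current_node/@appl_xpointer)"/>
  <!-- Hand the first matching set to the Java extension for evaluation -->
  <func:result
       select="myjavaext:calc_applicability($condition_set[1])"/>
</func:function>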
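For concreteness, the Java side of such a call could look something like the sketch below. Everything in it is an assumption made for illustration: the class name ConditionSetMatcher, the idea that the extension receives the selected appl element as a DOM Element, the map of run-time values, and the rule that a set applies only when every listed condition equals the run-time value of the same name. How an extension function is actually bound depends on the XSLT processor and is not shown.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of a condition-set check; the matching rule is assumed.
final class ConditionSetMatcher {

    private ConditionSetMatcher() {}

    // True when every child element of the condition set (lang, platform, ...)
    // has text equal to the run-time value registered under the same name.
    static boolean matches(Element conditionSet, Map<String, String> runtimeValues) {
        for (Node child = conditionSet.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes and comments
            }
            String required = child.getTextContent().trim();
            if (!required.equals(runtimeValues.get(child.getNodeName()))) {
                return false; // value missing at run time, or different
            }
        }
        return true;
    }

    // Convenience for trying the sketch out: parse an XML string to its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse condition set", e);
        }
    }
}
```

Under these assumptions, option-set-01 from the example above would match a run with lang en-CA, presentationTarget print, and platform Mac, and would not match the same run on Windows.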

Three points about this approach:

1. I've used the XInclude pattern of having separate attributes for the URI of the target document and the XPointer of the target element(s). I think this is the best design pattern for doing addressing in an XML/URI context for the reasons stated in the XInclude spec.

2. I've put the applicability set into an external document since it's not part of the core document content (many documents could share the same option sets since the set of possible combinations is finite and, in practice, will be relatively small, probably only a fraction of the total possible combinations) and is presumably managed by the authoring support system (which in Ken's case is Ken but in my case would be sophisticated software components developed at great expense :-). This is also a reflection of my "external entities are bad, pretend they don't exist" policy :-)

3. It uses a procedural language (e.g., Java) to do the actual applicability calculation. This is because I've found it much easier to evaluate complex conditions in Java than XSLT, especially for people who didn't grow up writing Scheme programs. For example, I found writing an XSLT function to correctly evaluate a statement like "print webpdf not_help" almost impossible for my feeble brain but nearly trivial in Java. But maybe it's just me. Certainly the Java implementation is much, much simpler at the code level (it doesn't require recursion, for one thing). In addition, I may need the same applicability calculation business logic in other processing contexts, such as within an authoring tool or for content management import, export, or reporting, so it makes sense to implement it generically.
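To give a feel for why a statement like "print webpdf not_help" is easy to handle procedurally, here is a minimal Java sketch. The semantics are an assumption, since the post does not define them: tokens are whitespace-separated, a plain token names a condition the element applies to, a token prefixed with "not_" names a condition that rules the element out, and the element is applicable when no excluded condition is active and, if any plain tokens are listed, at least one of them is active. The class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical sketch: the statement semantics below are assumed,
// not taken from any actual implementation.
final class ApplicabilityEvaluator {

    private static final String NOT_PREFIX = "not_";

    private ApplicabilityEvaluator() {}

    // Evaluates a statement such as "print webpdf not_help" against the
    // set of condition values that are active for the current run.
    static boolean isApplicable(String statement, Set<String> activeConditions) {
        boolean sawInclude = false;
        boolean includeMatched = false;
        for (String token : statement.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty statement produces a single empty token
            }
            if (token.startsWith(NOT_PREFIX)) {
                // An active excluded condition vetoes applicability outright.
                if (activeConditions.contains(token.substring(NOT_PREFIX.length()))) {
                    return false;
                }
            } else {
                sawInclude = true;
                if (activeConditions.contains(token)) {
                    includeMatched = true;
                }
            }
        }
        // With no plain tokens, only the exclusions constrain the element.
        return !sawInclude || includeMatched;
    }
}
```

A single loop with no recursion is enough here, and the same method could be called from an XSLT extension, an authoring tool, or a content-management report. Real business rules would likely be richer than this (condition names paired with values, precedence between rules, and so on).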


Cheers,

Eliot
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9030 Research Blvd, #410
Austin, TX 78758
(512) 372-8122

eliot(_at_)innodata-isogen(_dot_)com
www.innodata-isogen.com