xsl-list

Re: [xsl] Performance of predicate-based patterns

2015-02-06 10:50:05
Hi,

I'm afraid there may be times when processing HTML, for example, when
one might want to have match="*[@class/tokenize(.,'\s+')='x']" and the
like ... one might hope to be able to specify the element type, but then
there will be fallback cases such as

match="*[not(@class/tokenize(.,'\s+')=$knownclasses)]"

where $knownclasses represents a controlled set of classes the XSLT is
expected to handle ...
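
For concreteness, the two patterns above might sit together in a stylesheet roughly like this. It is only a sketch: it assumes XSLT 2.0 (needed for tokenize() and for a variable reference inside a match pattern), and the class names and output markup are made up for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical controlled set of classes; the real list is not given in the thread. -->
  <xsl:variable name="knownclasses" select="('x', 'note', 'warning')"/>

  <!-- Any element whose class attribute contains the token 'x'. -->
  <xsl:template match="*[@class/tokenize(., '\s+') = 'x']">
    <span class="handled-x">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <!-- Fallback: elements none of whose class tokens is known.
       This also matches elements with no class attribute at all,
       so it is given a low priority to let other rules win. -->
  <xsl:template match="*[not(@class/tokenize(., '\s+') = $knownclasses)]"
                priority="-0.25">
    <xsl:message>
      <xsl:text>No known class on </xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="@class"/>
    </xsl:message>
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

Because "=" is a general comparison, the fallback fires only when no token at all is in $knownclasses; an element with class="foo note" is left to other rules. Both patterns start with *, so the processor has no element name to narrow the candidates by.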

Cheers, Wendell


On Wed, Feb 4, 2015 at 9:27 AM, Eliot Kimber ekimber(_at_)contrext(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
The DITA for Publishers Word-to-DITA framework
(https://github.com/dita4publishers/org.dita4publishers.word2dita) has a
generic WordML-to-SimpleML transform
(https://github.com/dita4publishers/org.dita4publishers.word2dita/blob/master/xsl/wordml2simple.xsl).
This generates a simplified and generic form of "word processing markup"
from the WordML.

But because it's handling the whole WordML in a generic way it doesn't
have many templates that use predicates, so not sure it helps here.

The intermediate format puts the style name and ID on the element to which
it applies, so the templates that process that file are either also
simple or are handled within gnarly for-each-group loops that infer
hierarchy from the flat paragraph structures.

This process is also driven by a separately-defined style-to-tag mapping
document, so there's little or no need for templates that match on
variable properties of the input elements.
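
To illustrate the general idea of a mapping-driven approach (this is not the actual DITA for Publishers code, only a minimal sketch assuming XSLT 2.0), a single rule for p can look up the output tag in a mapping instead of carrying one predicate per style. The inline mapping, the key name, and the unprefixed element names are all hypothetical; a real stylesheet would load the mapping from a separate document and use the WordprocessingML namespace.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical style-to-tag mapping, inlined to keep the sketch self-contained. -->
  <xsl:variable name="style-map">
    <map style="Heading 1" tag="h1"/>
    <map style="Heading 2" tag="h2"/>
  </xsl:variable>

  <!-- Index the mapping entries by style name. -->
  <xsl:key name="map-by-style" match="map" use="@style"/>

  <!-- One rule for every paragraph; no predicate in the match pattern. -->
  <xsl:template match="p">
    <xsl:variable name="style" select="string(pPr/pStyle/@val)"/>
    <xsl:variable name="tag"
                  select="key('map-by-style', $style, $style-map)/@tag"/>
    <!-- Fall back to a plain p when the style is not in the mapping. -->
    <xsl:element name="{($tag, 'p')[1]}">
      <xsl:apply-templates select="node() except pPr"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

With this shape, adding a style means adding a mapping entry rather than another template, so the number of rules competing for p stays constant.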

Cheers,

Eliot
—————
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 2/3/15, 5:34 PM, "Michael Kay mike(_at_)saxonica(_dot_)com"
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:


You will end up with similar match patterns if you try to map Word
styles (saved in WordprocessingML) into some XML structure. The style name
is stored in a subelement which is two levels down from the actual
paragraph element. And a lot of publishing companies are processing
Word input documents. You will have templates like:

<xsl:template match="p[pPr/pStyle/@val = 'Heading 1']">
  <h1>
    <xsl:apply-templates/>
  </h1>
</xsl:template>


We've got a precondition there that it will only match a <p> element, so
that's a good start. It then depends how many other rules there are that
also match <p> elements.

But yes, it would be good to look at some Word-ML stylesheets if anyone
knows of any. (I've come across a few over the years, but all specific to
a particular client.)

Michael Kay
Saxonica






-- 
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
