Re: [xsl] Implementation Advice: Grouping Strings by Character Range in XSLT 2

2016-04-29 10:43:14
Cool. My initial implementation attempt uses analyze-string just as you
show and seems to work as I wanted.

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 4/29/16, 10:19 AM, "G. Ken Holman g(_dot_)ken(_dot_)holman(_at_)gmail(_dot_)com"
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

I think any time I am going from a string "up" to rich markup
(remember the Omnimark triangle? Perhaps they used the triangle from
someone else), I would use analyze-string.

And I think it would be the easiest to synthesize as well, something
along the lines of:

    regex="([cde]+)|([g]+)"

... then using regex-group(n) for each range.
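
A minimal sketch of that idea for the "abcdefg" example, assuming one
capture group per range in the generated regex. The mode name
"ranges-example" and the class values are hypothetical; only
analyze-string and regex-group() come from the suggestion itself. Note
that the regex attribute is an attribute value template, so any curly
braces in a generated regex would need to be doubled.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name; one such template would be generated
       per language. -->
  <xsl:template match="text()" mode="ranges-example">
    <xsl:analyze-string select="." regex="([cde]+)|([g]+)">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- regex-group(n) is a zero-length string when group n
               did not take part in the match. -->
          <xsl:when test="regex-group(1) ne ''">
            <span class="R1"><xsl:value-of select="."/></span>
          </xsl:when>
          <xsl:when test="regex-group(2) ne ''">
            <span class="R2"><xsl:value-of select="."/></span>
          </xsl:when>
        </xsl:choose>
      </xsl:matching-substring>
      <!-- Characters outside every range pass through unchanged. -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Because the ranges do not overlap, at most one group is non-empty for
any given match, so the order of the xsl:when branches should not
matter.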

One would have to use tail recursion in XSLT 1, but in XSLT 2 I don't
think it buys anything over analyze-string, plus your synthesis would
be a lot more complicated (yes, I know it is done only once).  Remember
that the XSLT processor optimizes analyze-string itself, whereas it
would have no particular optimization for a stylesheet expression of
the tail recursion.
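
For comparison, here is one possible shape of the XSLT 1 recursion for
the same two example ranges. This is only a sketch and not something
from the thread: the names "mark-ranges", R1 and R2 are hypothetical,
and it recurses once per run of characters rather than once per
character, using translate() to find where the current run stops.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical range definitions for the "abcdefg" example. -->
  <xsl:variable name="R1" select="'cde'"/>
  <xsl:variable name="R2" select="'g'"/>

  <xsl:template name="mark-ranges">
    <xsl:param name="text"/>
    <xsl:if test="string-length($text) &gt; 0">
      <xsl:variable name="first" select="substring($text, 1, 1)"/>
      <!-- Range of the first character; empty if it is in no range. -->
      <xsl:variable name="class">
        <xsl:choose>
          <xsl:when test="contains($R1, $first)">R1</xsl:when>
          <xsl:when test="contains($R2, $first)">R2</xsl:when>
        </xsl:choose>
      </xsl:variable>
      <!-- Characters of $text, in order, that would end the current run. -->
      <xsl:variable name="stop">
        <xsl:choose>
          <xsl:when test="$class = 'R1'">
            <xsl:value-of select="translate($text, $R1, '')"/>
          </xsl:when>
          <xsl:when test="$class = 'R2'">
            <xsl:value-of select="translate($text, $R2, '')"/>
          </xsl:when>
          <xsl:otherwise>
            <!-- Keep only the characters that belong to some range. -->
            <xsl:value-of select="translate($text,
                translate($text, concat($R1, $R2), ''), '')"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:variable>
      <!-- The run is everything before the first stop character. -->
      <xsl:variable name="run">
        <xsl:choose>
          <xsl:when test="string-length($stop) = 0">
            <xsl:value-of select="$text"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of
                select="substring-before($text, substring($stop, 1, 1))"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:variable>
      <xsl:choose>
        <xsl:when test="$class = ''">
          <xsl:value-of select="$run"/>
        </xsl:when>
        <xsl:otherwise>
          <span class="{$class}"><xsl:value-of select="$run"/></span>
        </xsl:otherwise>
      </xsl:choose>
      <!-- Tail call on the remainder of the string. -->
      <xsl:call-template name="mark-ranges">
        <xsl:with-param name="text"
            select="substring($text, string-length($run) + 1)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:call-template name="mark-ranges">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

Every additional range needs another branch in two places, which
illustrates why the generated code would be more complicated than the
single regex.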

. . . . . . Ken

At 2016-04-29 15:04 +0000, Eliot Kimber ekimber(_at_)contrext(_dot_)com wrote:
Using XSLT 2, I have a requirement to take text and wrap contiguous
sequences of characters in markup according to the character range the
characters are in. This is to support the application of range-specific
fonts to text in HTML.

I have a static definition of the character ranges for a given national
language and there shouldn't be any overlap between ranges. Given this
static definition, I'm generating XSLT code to operate on text nodes in
order to apply the range markup.

For example, given the text string "abcdefg" where range "R1" is "cde"
and "R2" is "g", the marked-up result should be:

    ab<span class="R1">cde</span>f<span class="R2">g</span>

My initial approach is to generate a template that takes the current
language and the text node and then applies templates in a
language-specific mode.

For each language I'm then generating a template to do the range
matching.
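
The dispatch described above could look roughly like this. Mode names
in XSLT 2 are fixed at compile time, so the language has to be mapped
to a mode with an explicit choice. Everything named here is an
assumption for illustration: the template name "apply-ranges", the
language codes, the "ranges-*" modes, and reading the language from the
nearest xml:lang attribute.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Takes the current language and the text node, then applies
       templates in the matching language-specific mode. -->
  <xsl:template name="apply-ranges">
    <xsl:param name="lang"/>
    <xsl:param name="text-node"/>
    <xsl:choose>
      <xsl:when test="$lang = 'ja'">
        <xsl:apply-templates select="$text-node" mode="ranges-ja"/>
      </xsl:when>
      <xsl:when test="$lang = 'ko'">
        <xsl:apply-templates select="$text-node" mode="ranges-ko"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- No ranges defined for this language: copy the text as is. -->
        <xsl:value-of select="$text-node"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Assumption: the language comes from the nearest xml:lang. -->
  <xsl:template match="text()">
    <xsl:call-template name="apply-ranges">
      <xsl:with-param name="lang"
          select="string(ancestor::*[@xml:lang][1]/@xml:lang)"/>
      <xsl:with-param name="text-node" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Placeholders for the generated per-language templates; the real
       bodies would do the range matching. -->
  <xsl:template match="text()" mode="ranges-ja">
    <xsl:value-of select="."/>
  </xsl:template>
  <xsl:template match="text()" mode="ranges-ko">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```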

My question: once I'm in a language-specific template for a text node,
what is the most efficient and/or easiest-to-code way to map the string
to ranges? Since I'm generating the code, it doesn't have to be concise.

I'm thinking along the lines of using analyze-string to match on any of
the groups and then within the matching-substring clause have a choice
group to determine which range actually matched. But it feels like I'm
missing a more elegant way to determine the actual range.

Or maybe there's a clearer/simpler/more efficient way using tail
recursion?

Thanks,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com





