xsl-list
Re: [xsl] Katakana substitution regex

2010-08-06 20:40:24
Try it in two steps. First create a variable:

<xsl:variable name="text-flow">
  <wrapper>
    <xsl:analyze-string select="." regex="{$table-title-uom-regex}" flags="s">
      <xsl:matching-substring>
        <match>
          <xsl:value-of select="."/>
        </match>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <non-match>
          <xsl:value-of select="."/>
        </non-match>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </wrapper>
</xsl:variable>

Then run a for-each loop over all of the children of wrapper, writing out each
item. If you have a match, always output it followed by your x30fc. If you have
a match whose content is 30fc, drop it, as you have already put one out. Don't
forget to sort the alternatives in the regex string by length, longest first,
so the longest string is tested first.
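
The two steps described above could be sketched roughly as follows. This is one reading of the advice, not a tested solution: the search strings are hypothetical, they are assumed to contain no regex metacharacters, the list is assumed to be non-empty, and the rule "drop a leading ー from the item right after a match" is an interpretation of the "drop it" step.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical search strings; the real list comes from your own data. -->
  <xsl:variable name="search-strings" as="xs:string*"
      select="('ブラウザ', 'ユーザ', 'コンピュータ')"/>

  <!-- Join the alternatives longest first, so the longest string is tested first. -->
  <xsl:variable name="table-title-uom-regex" as="xs:string">
    <xsl:variable name="sorted" as="xs:string*">
      <xsl:perform-sort select="$search-strings">
        <xsl:sort select="string-length(.)" order="descending"/>
      </xsl:perform-sort>
    </xsl:variable>
    <xsl:sequence select="string-join($sorted, '|')"/>
  </xsl:variable>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- priority="1" so this template wins over the identity template for text nodes. -->
  <xsl:template match="text()" priority="1">
    <!-- Step 1: split the text into <match> and <non-match> items. -->
    <xsl:variable name="text-flow">
      <wrapper>
        <xsl:analyze-string select="." regex="{$table-title-uom-regex}" flags="s">
          <xsl:matching-substring>
            <match><xsl:value-of select="."/></match>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
            <non-match><xsl:value-of select="."/></non-match>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </wrapper>
    </xsl:variable>

    <!-- Step 2: write the items back out in order. -->
    <xsl:for-each select="$text-flow/wrapper/*">
      <xsl:choose>
        <!-- A search string: always output it followed by U+30FC. -->
        <xsl:when test="self::match">
          <xsl:value-of select="."/>
          <xsl:text>ー</xsl:text>
        </xsl:when>
        <!-- The item right after a match already starts with U+30FC:
             drop that one character, since one was just written. -->
        <xsl:when test="preceding-sibling::*[1][self::match] and starts-with(., 'ー')">
          <xsl:value-of select="substring(., 2)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With these assumed search strings, a text node containing ブラウザーとユーザ should come out as ブラウザーとユーザー: the existing ー after ブラウザ is not doubled, and one is added after ユーザ.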




________________________________
From: Hoskins & Gretton <hoskgret(_at_)rochester(_dot_)rr(_dot_)com>
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Sent: Fri, August 6, 2010 4:14:00 PM
Subject: [xsl] Katakana substitution regex

Hi, I have to convert some Katakana strings from "original" to "new" by adding
ー (#x30fc;), a pronunciation character (see
http://www.fileformat.info/info/unicode/char/30fc/index.htm).
In Japanese, there aren't any word boundaries, so essentially all of my search 
strings are substrings of the text of the current element.
When substring "a" is followed by the character ー I do not want to make the 
replacement.

example:        ブラウザ is a search string but it is followed by ー already -- do 
nothing

When substring "a" is not followed by the character ー I want to make the 
replacement to create "a" followed by ー.

example:        ブラウザ is a search string but it is not followed by #x30fc;
                already -- add ー to the end to make it
                ブラウザー

If I just wanted to add the ー, I was able to do that with analyze-string and a
regex containing the strings that I wanted to find, where $regexSearch contains
all of my search Katakana strings:

                <xsl:analyze-string select="." regex="({$regexSearch})">
                    <xsl:matching-substring>
                        <xsl:value-of select="regex-group(1)"/>
                        <xsl:text>ー</xsl:text>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring>
                        <xsl:value-of select="."/>
                    </xsl:non-matching-substring>
                </xsl:analyze-string>
However, I can't figure out how I should fit this into an overall XSLT, where I
need to check ahead in the element text before I decide to make the
substitution. Currently, if there is a string:                ブラウザー
it becomes:    ブラウザーー (doubling the last character).
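
One possible way to get the "check ahead" effect without a second pass is to let the regex itself consume an optional trailing ー, and then always write exactly one back. This is a sketch of a different technique from the two-step approach suggested in the reply, and the value given to $regexSearch here is hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical value; the real $regexSearch holds all of the search strings. -->
  <xsl:variable name="regexSearch" select="'ブラウザ|ユーザ'"/>

  <xsl:template match="text()">
    <!-- "ー?" swallows an existing U+30FC if one follows the search string. -->
    <xsl:analyze-string select="." regex="({$regexSearch})ー?">
      <xsl:matching-substring>
        <!-- Group 1 is the search string alone; write it with exactly one ー. -->
        <xsl:value-of select="regex-group(1)"/>
        <xsl:text>ー</xsl:text>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Under this assumption both ブラウザ and ブラウザー should end up as ブラウザー, because the optional ー is part of the match and is replaced rather than appended to.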

If someone has some experience with this type of search and replace problem, I 
would appreciate some guidance.
Regards, Dorothy 

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


      

