On 8/6/2010 3:14 PM, Hoskins & Gretton wrote:
Hi, I have to convert some Katakana strings from "original" to "new"
by adding ー (&#x30fc;), a pronunciation character (see
http://www.fileformat.info/info/unicode/char/30fc/index.htm).
In Japanese, there aren't any word boundaries, so essentially all of
my search strings are substrings of the text of the current element.
When substring "a" is followed by the character ー I do not want
to make the replacement.
example:        ブラウザ is a search string
but it is followed by ー already -- do nothing
When substring "a" is not followed by the character ー I want to
make the replacement to create "a" followed by ー.
example:        ブラウザ is a search string
but it is not followed by ー (&#x30fc;) already
                add ー to the end to make it
                ブラウザー
If I just wanted to add the ー unconditionally, I could do that with
xsl:analyze-string and a regex built from the strings I want to find,
where $regexSearch contains all of my search Katakana strings:
                <xsl:analyze-string select="." regex="({$regexSearch})">
                    <xsl:matching-substring>
                        <xsl:value-of select="regex-group(1)"/>
                        <xsl:text>ー</xsl:text>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring>
                        <xsl:value-of select="."/>
                    </xsl:non-matching-substring>
                </xsl:analyze-string>
However, I can't figure out how to fit this into an overall
XSLT, where I need to look ahead in the element text before I
decide to make the substitution. Currently, if there is a
string:         ブラウザー
it becomes:     ブラウザーー
(doubling the last character).
If someone has some experience with this type of search and replace
problem, I would appreciate some guidance.
Regards, Dorothy
--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
How about
   select="replace(., 'ザ([^ー])', 'ザー$1')"
?
And if that fails to catch ザ when it occurs at the end of a text
node, wrap the result in
    replace(., 'ザ$', 'ザー')
HTH,
Lars
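
A possible variation on the suggestion above, offered as a sketch rather
than a tested solution. The pattern 'ザ([^ー])' consumes the character
after ザ, so it needs the second replace() for end-of-string and can miss
a search string that immediately follows another match. Since XPath 2.0
regular expressions have no lookahead, one way around this is to match
the search string together with an optional trailing ー and always write
back exactly one ー. The value given to $regexSearch here and the
identity template are assumptions for illustration; the parameter name
comes from the original question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical search list: Katakana strings separated by "|".
       Every alternative must be non-empty, because replace() raises
       an error if the pattern can match a zero-length string. -->
  <xsl:param name="regexSearch" select="'ブラウザ|プリンタ'"/>

  <!-- Identity template: copy everything else unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Match each search string plus an optional following ー, then
       emit the search string followed by exactly one ー.
       The explicit priority avoids a conflict with node() above. -->
  <xsl:template match="text()" priority="1">
    <xsl:value-of
        select="replace(., concat('(', $regexSearch, ')ー?'), '$1ー')"/>
  </xsl:template>

</xsl:stylesheet>
```

With this pattern ブラウザ becomes ブラウザー, while an existing ブラウザー is
matched whole and rewritten to itself, so nothing is doubled. The same
idea should carry over to the original xsl:analyze-string version by
using regex="({$regexSearch})ー?" and leaving the matching-substring
body as it is. If one search string is a prefix of another, the order of
the alternatives may matter.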