xsl-list

RE: [xsl] Multiple search and replace

2008-04-03 04:31:42
You want $1 rather than \1 to refer to group 1 in the replace string.
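
As a minimal sketch of that change (only the replacement string differs from the call quoted below; the rest of the stylesheet is assumed to stay as it is):

```xml
<!-- XPath 2.0 replace(): $1 refers to the first captured group.
     A backslash in the replacement string may only precede \ or $,
     so '\1' is rejected as an invalid replacement string (err:FORX0004). -->
<xsl:sequence select="replace(., '\[#x(\d+)\]', '&#xE000;#x$1;')" />
```

With the character map from the quoted message in place, an input of [#x02010] should then serialize as &#x02010;.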

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: Pankaj Chaturvedi [mailto:pankaj(_dot_)chaturvedi(_at_)idsil(_dot_)com] 
Sent: 03 April 2008 19:23
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: RE: [xsl] Multiple search and replace

Hi,

    <xsl:sequence select="replace(., '\[#x(\d+)\]', '&#xE000;#x\1;')" />

gives the error "invalid replace string". I believe the problem is 
with the group reference \1, though I do not see anything wrong with 
the regex syntax. Maybe it is a problem with the "Built-in XSLT 
engine" of XMLSpy.

It seems I need to install Java on my system to use the Saxon 9 
processor. Any ideas?





-----Original Message-----
From: Abel Braaksma [mailto:abel(_dot_)online(_at_)xs4all(_dot_)nl]
Sent: Wednesday, April 02, 2008 12:03 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Multiple search and replace


Hi Pankaj,

see my comments below:


Pankaj Chaturvedi wrote:
Hi all,

I am trying to define multiple search-and-replace operations in a style sheet.


First thought: consider using XSLT 2.0, which has search and 
replace built in: the replace() function handles 
regular-expression search and replace in one call.

Basically, I am trying to convert [#x02010] (and other Unicode values) 
to the corresponding character references, e.g. &#x02010;.


Second thought: consider using XSLT 2.0. Getting the numeric 
value of a character can be done using string-to-codepoints(), 
which is not available in XSLT 1.0. Second thought (b): 
sorry, I see that you mean the literal string '[#...]'.

Below is what I am trying to do:

<snip />

I have two questions in this regard:

1. I am forced to write & as &amp; because XMLSpy gives the error 
"character is grammatically unexpected". Is there another way of 
overcoming this issue and getting & in the output?


Not doing something because your tool limits you is very dangerous...
However, in this case, XMLSpy is correct. XSLT *is* XML, and XML 
does not allow a literal & in content; it has to be written as 
&amp;. However, if you output as text, the serializer will output 
'&' wherever you put &amp;.

Third thought: use XSLT 2.0. It has the ability to add character maps.
In a character map you can say that some character, say '$' 
(but using something from the Unicode Private Use Area is 
recommended), is mapped to some string. Using character 
maps you can get a literal '&' in the output.

2. I also need to replace "]" with ";", for which I was trying to 
call another template within <xsl:template match="text()"> as below, 
but it doesn't seem to be working.

<snip />
Can we do multiple search and replaces in one named template, or do 
I need to define them all separately (I need to call all of them in 
one template, <xsl:template match="text()">)?


XSLT is a functional language. You will have to call the 
replace function recursively. I believe there's an example on 
the exslt.org site which shows how you can do this for a 
multiple search and replace in a generic way.
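
A rough sketch of what such a recursive replace could look like in XSLT 1.0 follows. This is not the exslt.org example; the template name "replace-all" and its parameters are hypothetical, and text output is assumed so that '&' is written unescaped. Note that it replaces every ']' in the text, not only those that close a '[#x...' sequence.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: text output, so '&' is not escaped by the serializer. -->
  <xsl:output method="text"/>

  <!-- Hypothetical helper: replaces every occurrence of $search in $text. -->
  <xsl:template name="replace-all">
    <xsl:param name="text"/>
    <xsl:param name="search"/>
    <xsl:param name="replacement"/>
    <xsl:choose>
      <xsl:when test="$search != '' and contains($text, $search)">
        <!-- Emit the part before the match, then the replacement ... -->
        <xsl:value-of select="substring-before($text, $search)"/>
        <xsl:value-of select="$replacement"/>
        <!-- ... and recurse on whatever follows the match. -->
        <xsl:call-template name="replace-all">
          <xsl:with-param name="text" select="substring-after($text, $search)"/>
          <xsl:with-param name="search" select="$search"/>
          <xsl:with-param name="replacement" select="$replacement"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Chain two replacements: the result of the first feeds the second. -->
  <xsl:template match="text()">
    <xsl:variable name="step1">
      <xsl:call-template name="replace-all">
        <xsl:with-param name="text" select="."/>
        <xsl:with-param name="search" select="'[#x'"/>
        <xsl:with-param name="replacement" select="'&amp;#x'"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:call-template name="replace-all">
      <xsl:with-param name="text" select="$step1"/>
      <xsl:with-param name="search" select="']'"/>
      <xsl:with-param name="replacement" select="';'"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

Each further search/replace pair would need another variable and another call, which is why this gets verbose quickly in 1.0.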

Fourth thought: use XSLT 2.0. All you'll end up with then is a nested
replace(replace(....)) call.
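
For illustration only, a nested call for the two replacements asked about might look like the sketch below. It assumes U+E000 is used as a stand-in for '&' and mapped back on output by a character map; like any plain two-step replace, it rewrites every ']' in the text.

```xml
<!-- Inner call rewrites the opening '[#x'; outer call rewrites the closing ']'.
     U+E000 is a placeholder for '&', restored by a character map on output. -->
<xsl:sequence select="replace(replace(., '\[#x', '&#xE000;#x'), '\]', ';')" />
```

A single replace() with a captured group avoids touching unrelated brackets, which is the approach taken in the full solution.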

Fifth thought: use XSLT 2.0 for the whole shebang. Your whole 
solution will look like this:

<xsl:output use-character-maps="searchreplace" />

<xsl:character-map name="searchreplace">
    <xsl:output-character character="&#xE000;" string="&amp;" />
</xsl:character-map>

<xsl:template match="text()">
    <!-- XPath 2.0 replace() refers to captured groups as $1, $2, ... -->
    <xsl:sequence select="replace(., '\[#x(\d+)\]', '&#xE000;#x$1;')" />
</xsl:template>


Sixth thought: use XSLT 2.0. You seem to be using XMLSpy, 
which can handle XSLT 2.0. However, its engine is a bit 
flaky. If you run into problems, consider using either 
Gestalt XSLT 2.0 or Saxon XSLT 2.0 processors.

Note: you may think that putting &amp; inside the string 
attribute of xsl:output-character creates &amp; in the 
output, but this is not true.
Since XSLT is XML, you must write &amp; there to get a literal '&'. 
Getting the map to serialize to a literal &amp; instead requires 
double escaping: "&amp;amp;"
(but that is not what you are after here). Understanding the 
implications of using character references in XML is vital for 
headache-free working with XML and XSLT (plus all other 
XML-related technologies, in fact), but it can be hard at times to 
get it right in your head.
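
To make the difference concrete, here is a small sketch with both variants side by side. The second placeholder character, U+E001, is hypothetical and only there for comparison:

```xml
<xsl:character-map name="searchreplace">
    <!-- string="&amp;" is parsed as the single character '&',
         so the output receives a bare & -->
    <xsl:output-character character="&#xE000;" string="&amp;" />
    <!-- string="&amp;amp;" is parsed as the five characters '&amp;',
         which are written to the output as they are -->
    <xsl:output-character character="&#xE001;" string="&amp;amp;" />
</xsl:character-map>
```

The XML parser resolves one level of escaping before the processor ever sees the attribute value, and the character map then writes the resulting string without escaping it again.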

Hope this helps,

Cheers,
-- Abel Braaksma


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--

