Re: [xsl] Integrating a Search and Replace template with the CSV to XML converter

2008-06-03 00:58:32
Thank you so much, Michael, for your detailed response!

I have thrown myself into XSLT and XML without any prior knowledge, and seem
to have missed quite a few of the basics!

I will look further into encoding types for my own benefit, but what you
have suggested below works absolutely perfectly.

I would like to post my XSLT stylesheet on a template exchange at
www.dnndev.com to be used in conjunction with the DotNetNuke add-on module
XMod. There is a real need for this type of transform in this community.

Andrew, do you have a problem with this? I will make sure you have full
credit!

Kindest Regards,
Marney




On 3/6/08 5:43 PM, "Michael Kay" <mike(_at_)saxonica(_dot_)com> wrote:

The characters that are affecting things are part of the
Unicode block 'General Punctuation'. This is translating
through the stylesheet fine and is being displayed in the
resulting XML by &#146; (right hand quote) and &#150; (en
dash). Problem is, my dynamic website does not know how to
display these characters, and I am getting the little boxes instead.

It's not surprising that it doesn't know how to display them, since neither
of these codepoints is assigned to any printable Unicode character. The
Unicode codepoint for en dash is x2013; the code for "right single quotation
mark" is x2019. 

What has happened is that your input uses the Microsoft-proprietary cp1252
character encoding. There's no harm in that, provided that the software
reading the file knows it's in this encoding, so that it can translate such
characters to their proper Unicode values for use in the output XML.

I am thinking of integrating a Global Search and Replace
template that runs on the final XML to find all instances of
&#146; and replace with ' .

No, you should fix the problem at source rather than patching it up later.
If you're reading the CSV file using unparsed-text(), and if the CSV file is
in cp1252 encoding, then you can specify this in the encoding parameter to
unparsed-text().
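
As a rough illustration of that suggestion, here is a minimal XSLT 2.0
sketch. It is not the converter stylesheet from this thread: the file name
input.csv, the template name "main", and the <rows>/<row> output elements are
hypothetical, and the encoding label 'windows-1252' is an assumption (which
labels are accepted, for example 'cp1252', depends on the processor and
platform).

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: location of the CSV file,
       resolved relative to this stylesheet. -->
  <xsl:param name="csv-uri" select="'input.csv'"/>

  <!-- Assumed encoding label; adjust to whatever name your
       processor recognises for the cp1252 code page. -->
  <xsl:param name="csv-encoding" select="'windows-1252'"/>

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical named entry template. -->
  <xsl:template name="main">
    <!-- The second argument tells unparsed-text() how to decode the
         bytes, so byte 0x92 is read as U+2019 and byte 0x96 as U+2013. -->
    <xsl:variable name="text"
        select="unparsed-text($csv-uri, $csv-encoding)"/>
    <rows>
      <!-- Split into lines and skip blank ones; real CSV field
           parsing is left out of this sketch. -->
      <xsl:for-each select="tokenize($text, '\r?\n')[normalize-space()]">
        <row><xsl:value-of select="."/></row>
      </xsl:for-each>
    </rows>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this would typically be run by naming the initial template (for
example with the -it:main option). Because the characters are decoded
correctly on input, the serializer can then write them out as proper Unicode
and no search-and-replace pass is needed.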

Michael Kay
http://www.saxonica.com/


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



Marney Cotterill
graphic designer
                   
cracker//brandware

6 Bourke Street
Queens Park 
NSW 2022
Telephone 02 9387 2001
Facsimile 02 9387 2006
marney(_at_)crackerbrandware(_dot_)com
www.crackerbrandware.com


