xsl-list

Re: [xsl] How to convert XML doc from UTF-8 to ISO-8859-1 char encoding?

2010-01-11 08:52:28
At 2010-01-11 15:30 +0100, Ben Stover wrote:
> Assume I have an XML doc which is UTF-8 encoded.

> Can I convert it somehow to ISO-8859-1 encoding?

You can create a new document using XSLT: copy the nodes and serialize the result in the new encoding, for example by setting the encoding attribute of xsl:output (if your XSLT processor accepts your request for encoding for the result).

> And how do I encode it in the opposite direction?

Same way.
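
The same parse-then-reserialize idea can be sketched outside XSLT. The following is a minimal illustration in Python using the standard library's ElementTree rather than an XSLT processor; the function name reencode_xml and the sample document are made up for this example. Note that this parser keeps only the element tree, so comments, processing instructions, and any DOCTYPE are dropped.

```python
import xml.etree.ElementTree as ET


def reencode_xml(data: bytes, target_encoding: str) -> bytes:
    """Parse XML bytes and serialize the same tree in another encoding."""
    # The parser picks up the source encoding from the XML Declaration.
    root = ET.fromstring(data)
    # xml_declaration=True (Python 3.8+) writes the new encoding into the
    # output. Characters the target encoding cannot represent are written
    # as numeric character references such as &#8364;.
    return ET.tostring(root, encoding=target_encoding, xml_declaration=True)


# Hypothetical sample document, UTF-8 encoded.
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?><note>café © 2010 €</note>'
).encode("utf-8")

latin1_doc = reencode_xml(utf8_doc, "iso-8859-1")    # UTF-8 -> ISO-8859-1
back_to_utf8 = reencode_xml(latin1_doc, "utf-8")     # and the opposite way
```

The euro sign has no ISO-8859-1 code point, so it survives the first conversion only as a character reference; the information in the document is unchanged, only the syntax differs.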

> Normally the encoding is defined in an attribute in the topmost <xml> tag.

False.

The XML Declaration at the beginning of XML documents informs an XML processor about the syntax of the XML document. It is not a tag. It does not have attributes.

The XML Declaration has the same syntax as a processing instruction, and the parameters of the declaration borrow the same syntax as attributes, but the XML Declaration does not show up in the XPath data model for XML because it isn't data ... it is syntax.
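
One way to see this for yourself, sketched here with Python's minidom (a DOM rather than the XPath data model, so take it as an analogy only): a real processing instruction shows up as a node, while the XML Declaration does not. The sample document and the "note" target are invented for the illustration.

```python
from xml.dom import minidom

# An XML Declaration, then a genuine processing instruction, then the root.
doc = minidom.parseString(
    b'<?xml version="1.0" encoding="UTF-8"?><?note for-the-app?><root/>'
)

# List the document's top-level nodes as (node type, node name) pairs.
# Only the processing instruction and the element appear; the XML
# Declaration was consumed by the parser as syntax and is not a node.
top_level = [(node.nodeType, node.nodeName) for node in doc.childNodes]
```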

> Is there a way to detect whether this declaration is true and corresponds with the real encoding in the full XML doc?
> Or whether it is faked or misplaced by mistake?

If the encoding used in the document does not match the encoding stated in the XML Declaration, then at best your XML processor will abend and report an error. At worst your XML processor will not encounter an encoding error at all, and you will end up with the wrong characters in your data model for your XML without knowing they are wrong.

For example, if you have ever seen a capital A with circumflex "Â" followed by a copyright symbol, that is because the two bytes of the UTF-8 encoded copyright symbol (0xC2 0xA9) have been successfully interpreted as two ISO-8859-1 characters. The result is wrong, but every byte is valid in the ISO-8859-1 encoding, so you get wrong data with no error message.
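
Both outcomes are easy to reproduce. Here is a small sketch in Python with the standard library's ElementTree parser; the one-element documents are invented for the demonstration, and other XML processors may word their errors differently.

```python
import xml.etree.ElementTree as ET

copyright_utf8 = "©".encode("utf-8")          # two bytes: 0xC2 0xA9
copyright_latin1 = "©".encode("iso-8859-1")   # one byte:  0xA9

# Worst case: UTF-8 bytes labelled as ISO-8859-1 parse without complaint,
# and each byte silently becomes its own (wrong) character.
mislabelled = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?><c>' + copyright_utf8 + b"</c>"
)
wrong_text = ET.fromstring(mislabelled).text

# Best case: ISO-8859-1 bytes labelled as UTF-8 contain a byte sequence
# that is not valid UTF-8, so the parser stops with an error.
broken = (
    b'<?xml version="1.0" encoding="UTF-8"?><c>' + copyright_latin1 + b"</c>"
)
try:
    ET.fromstring(broken)
    rejected = False
except ET.ParseError:
    rejected = True
```

After this runs, wrong_text holds the two characters "Â©" and rejected is True. The asymmetry is the point: every byte value is a legal ISO-8859-1 character, so a false ISO-8859-1 label can never be caught by decoding alone.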

Character encoding integrity is not the XML processor's responsibility but the document creator's. The XML processor must take the XML Declaration at face value.

I hope this helps.

. . . . . . . . . . . . . Ken

--
G. Ken Holman                 mailto:gkholman(_at_)CraneSoftwrights(_dot_)com

