xsl-list

RE: [xsl] possible workarounds to process files with invalid character encoding ...

2008-12-12 16:27:15

If you're capable of writing a Java Reader that will process this file into
a stream of characters, then you can get Saxon to use this Reader by
nominating a custom UnparsedTextURIResolver.
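
As a rough illustration of the Reader half of this suggestion, here is a minimal sketch that uses only the JDK. It assumes the file uses a single-byte encoding in which bytes below 0x80 are plain ASCII; the two high-byte mappings and the class names (CustomEncodingReader, CustomEncodingDemo) are made up for the example and would have to be replaced by the real table. How the Reader is handed to Saxon through the resolver depends on the Saxon version and is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.UncheckedIOException;

/** Decodes a hypothetical single-byte encoding into Java chars via a lookup table. */
class CustomEncodingReader extends Reader {
    private final InputStream in;
    private final char[] table = new char[256];

    CustomEncodingReader(InputStream in) {
        this.in = in;
        // Assumption: bytes 0x00-0x7F are ASCII; unknown bytes become U+FFFD.
        for (int i = 0; i < 256; i++) {
            table[i] = i < 0x80 ? (char) i : '\uFFFD';
        }
        // Hypothetical mappings; replace with the real table of the custom encoding.
        table[0x80] = '\u20AC'; // euro sign
        table[0x81] = '\u00E4'; // a with diaeresis
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        byte[] buf = new byte[len];
        int n = in.read(buf, 0, len);
        if (n < 0) {
            return -1; // end of stream
        }
        // One byte maps to exactly one char in this sketch.
        for (int i = 0; i < n; i++) {
            cbuf[off + i] = table[buf[i] & 0xFF];
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

class CustomEncodingDemo {
    /** Reads all characters from the custom Reader into a String. */
    static String decode(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new CustomEncodingReader(new ByteArrayInputStream(bytes))) {
            char[] buf = new char[64];
            for (int n; (n = r.read(buf, 0, buf.length)) != -1; ) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {'a', (byte) 0x81, (byte) 0x80};
        System.out.println(decode(sample));
    }
}
```

A multi-byte or stateful encoding would need a real decoding loop in place of the table lookup, but the Reader contract stays the same.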

Alternatively, I suspect you can do it at the Java level by registering an
encoding name for the encoding and associating it with a decoder for that
encoding - but I'm not familiar with the details.
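
For what it's worth, the JDK mechanism this probably refers to is java.nio.charset.spi.CharsetProvider. Below is a minimal, decode-only sketch. The charset name "x-custom-legacy", the class names, and the byte mappings are all hypothetical. In a real deployment the provider class would need to be public and listed in a META-INF/services/java.nio.charset.spi.CharsetProvider file on the classpath so that Charset.forName can find it; whether Saxon then accepts that encoding name in unparsed-text() is an assumption that would need testing.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.spi.CharsetProvider;
import java.util.Collections;
import java.util.Iterator;

/** A decode-only charset for a hypothetical single-byte legacy encoding. */
class LegacyCharset extends Charset {
    static final String NAME = "x-custom-legacy"; // hypothetical encoding name

    LegacyCharset() {
        super(NAME, new String[0]);
    }

    /** Hypothetical mapping: ASCII passes through, two sample high bytes are mapped. */
    static char toChar(int b) {
        if (b < 0x80) {
            return (char) b;
        }
        switch (b) {
            case 0x80: return '\u20AC'; // euro sign
            case 0x81: return '\u00E4'; // a with diaeresis
            default:   return '\uFFFD'; // replacement character for unknown bytes
        }
    }

    @Override
    public boolean contains(Charset cs) {
        return cs instanceof LegacyCharset;
    }

    @Override
    public boolean canEncode() {
        return false; // this sketch only decodes
    }

    @Override
    public CharsetEncoder newEncoder() {
        throw new UnsupportedOperationException("decode only");
    }

    @Override
    public CharsetDecoder newDecoder() {
        // One byte always yields exactly one char.
        return new CharsetDecoder(this, 1.0f, 1.0f) {
            @Override
            protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                while (in.hasRemaining()) {
                    if (!out.hasRemaining()) {
                        return CoderResult.OVERFLOW;
                    }
                    out.put(toChar(in.get() & 0xFF));
                }
                return CoderResult.UNDERFLOW;
            }
        };
    }
}

/** Makes the charset discoverable by name (must be public when used via META-INF/services). */
class LegacyCharsetProvider extends CharsetProvider {
    private final Charset legacy = new LegacyCharset();

    @Override
    public Charset charsetForName(String name) {
        return LegacyCharset.NAME.equalsIgnoreCase(name) ? legacy : null;
    }

    @Override
    public Iterator<Charset> charsets() {
        return Collections.singletonList(legacy).iterator();
    }
}

class LegacyCharsetDemo {
    public static void main(String[] args) {
        // Uses the provider directly; no service registration is needed for this demo.
        Charset cs = new LegacyCharsetProvider().charsetForName("x-custom-legacy");
        System.out.println(new String(new byte[] {'a', (byte) 0x81, (byte) 0x80}, cs));
    }
}
```

Once registered, the same charset also works with new InputStreamReader(stream, "x-custom-legacy"), which is what makes this route attractive if the encoding shows up in more than one place.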

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: Matthias Einbrodt 
[mailto:matthias(_dot_)einbrodt(_at_)meinbrodt(_dot_)net] 
Sent: 12 December 2008 21:14
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] possible workarounds to process files with 
invalid character encoding ...

Hello,

I'm trying to transform a text file with XSLT using the 
unparsed-text and tokenize functions. Unfortunately, the text 
file consists of characters which are encoded with a 
non-Unicode-compliant encoding scheme. So, as expected, my Saxon 
processor (version 9.1.0.3 Basic) shows me a 
*MalformedInputException* when I want to parse the file.

Now my question is if there are any "workarounds" to make 
Saxon process the file anyway. Maybe by:

(1) Writing a sort of plugin that lets Saxon also support 
non-Unicode-compliant encodings;

(2) Adding metadata in some way to the input file which 
Saxon or another XSLT parser can handle and that specifies a 
mapping of the used character encodings to the appropriate 
code points of a Unicode-compliant encoding.

And if such a workaround exists, is it even worth trying 
to implement it, or would one be better off preprocessing 
the file with a custom Java program, or even trying to 
modify the program that creates such text files in such a way 
that it uses a Unicode-compliant encoding scheme rather than 
its own custom one?

What are your opinions?

Best Regards

Matthias Einbrodt




--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



