xsl-list

Re: [xsl] Representing EBCDIC code 37 in xslt

2013-12-31 15:39:10
The FTP transfer may be treating the file as some kind of ASCII. UTF-8
is a superset of 7-bit ASCII, so most of the time the conversion
appears to work (when it really doesn't).

It may be better to generate the file as EBCDIC on the unix box
(which, yes, will look like gibberish) and then transfer it as binary.
 That way you can confirm that the bit patterns that you want are
actually there before doing the transfer.  Fixing the FTP program's
text conversion may not be easy.
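
To illustrate the "generate it as EBCDIC and check the bit patterns"
step, here is a minimal Python sketch. It assumes the target is code
page 37 (Python's "cp037" codec) and that the source file is UTF-8; the
function names are made up for the example, and line-ending or record
handling on the mainframe side is not addressed.

```python
# Sketch: write an EBCDIC (code page 37) copy of a UTF-8 file and
# inspect the resulting bytes before any transfer takes place.
from pathlib import Path


def utf8_file_to_cp037(src: Path, dst: Path) -> bytes:
    """Re-encode a UTF-8 text file as cp037 and return the bytes written."""
    text = src.read_text(encoding="utf-8")
    # encode() raises UnicodeEncodeError if a character has no cp037
    # equivalent, rather than silently substituting something else.
    data = text.encode("cp037")
    dst.write_bytes(data)
    return data


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated hex pairs for eyeballing."""
    return " ".join(f"{b:02X}" for b in data)
```

Calling hex_dump() on the returned bytes lets you compare them against
a code page 37 table; decoding them again with data.decode("cp037")
should give back the original text.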

On Wed, Jan 1, 2014 at 12:50 AM, a kusa <akusa8(_at_)gmail(_dot_)com> wrote:
Thank you Greg.

The source file has utf-8 characters. But the problem seems to be
happening when FTPing the converted text file to mainframe. Mainframe
is not retaining it. So, I will have to check on the mainframe side to
see if there is any setting that can be manipulated.

Thank you all for your time and input.



On Mon, Dec 30, 2013 at 3:59 PM, Greg Hunt <greg(_at_)firmansyah(_dot_)com> 
wrote:
The characters do not exist independently of the encoding of the
characters that are around them.  What you are trying to do, it
appears, is to construct a file containing a mix of ASCII/UTF-8
characters and EBCDIC characters, and then pass that file through a
character set conversion that has no idea that the "EBCDIC" characters
are in there.  What it will do is either corrupt the characters in
some interesting way or replace them with some kind of substitution
character (control-Z, a question mark, a full stop, or Unicode code
point U+FFFD, depending on the source and target encodings).  In
reality, in a file, there are only bit patterns, not characters; there
is nothing to mark one sequence of bits as one character set encoding
or another.

The file has to be all the same character set if it is to pass through
an ASCII/EBCDIC conversion undamaged.  If you make it EBCDIC on your
Unix platform it needs to look like gibberish, because the bit patterns
for EBCDIC are not the same as the bit patterns for UTF-8, 8859
or 1252, and the Unix box will not understand them.  If the characters
can be represented as UTF-8, 8859-1 or 1252 (the R symbol is present
in all of them so it ought to be OK) and you already have transcoding
to EBCDIC happening, then you either have to use the same transcoding
to convert the characters (provided that your transcoding is actually
working on 8-bit 8859 or 1252 and not some ancient 7-bit idea of
ASCII) or you need to make a file with the right EBCDIC bit patterns
in it and pass it around as binary.
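
A small Python sketch of the failure modes described above, using the
plus-minus sign as the test character. It assumes code page 37
("cp037") as the EBCDIC variant; the variable names are only for
illustration and do not describe what any particular FTP client does
internally.

```python
# Sketch: what a conversion does when it assumes the wrong source encoding.
text = "\u00b1"  # PLUS-MINUS SIGN

utf8_bytes = text.encode("utf-8")      # two bytes: C2 B1
latin1_bytes = text.encode("latin-1")  # one byte: B1

# A converter limited to 7-bit ASCII has no slot for the character and
# falls back to a substitution character.
as_ascii = text.encode("ascii", errors="replace")  # b"?"

# A converter that assumes ISO 8859-1 input but is handed UTF-8 sees two
# characters where one was meant, then transcodes both to EBCDIC.
misread = utf8_bytes.decode("latin-1")  # "\u00c2\u00b1"
wrong_ebcdic = misread.encode("cp037")  # two EBCDIC bytes
right_ebcdic = text.encode("cp037")     # one EBCDIC byte
```

The point is that the same two input bytes yield either one correct
EBCDIC byte or two wrong ones, depending purely on what the converter
believes the source encoding to be.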

On Tue, Dec 31, 2013 at 7:59 AM, a kusa <akusa8(_at_)gmail(_dot_)com> wrote:

Thanks Ivan. That is where this question started, what output encoding
can I use to preserve these EBCDIC characters?

On Mon, Dec 30, 2013 at 2:12 PM, Ivan Shmakov 
<oneingray(_at_)gmail(_dot_)com> wrote:
a kusa <akusa8(_at_)gmail(_dot_)com> writes:

[…]

 > Well, I have <xsl:output encoding> set to utf-8 right now.  If I set
 > it to EBCDIC, then the rest of the content in the XML converts to
 > gibberish.

        Which is expected, if you view an EBCDIC-encoded XML file with
        an application that assumes ASCII-based encoding.  Try to upload
        the resulting file using FTP /binary/ mode to the mainframe and
        check if the file is still unreadable /there./

        (Alternatively, or perhaps complementarily, use an
        EBCDIC-capable application to view the resulting file locally.)

 > That's what I meant.

 > I only need the special characters, especially Latin-1 characters like
 > the plus-minus sign, to convert to the right EBCDIC code.

 > I have a java program that FTPs the file; I believe the default is
 > ASCII.

        There /may/ be a problem if /either/ this program or the FTP
        server assume that the input is ASCII, because the characters
        such as PLUS-MINUS SIGN are /not/ representable in ASCII.

        One solution is to configure either the FTP client or FTP server
        to /correctly/ convert UTF-8 to EBCDIC.  The other is to
        configure the XSLT implementation (with <xsl:output />) to
        output EBCDIC, and send the result to the target host /without/
        any encoding conversion (i. e., using FTP binary mode.)
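
        The second option (send the already-EBCDIC file without any
        conversion) might look roughly like this with Python's
        standard ftplib.  This is only a sketch: the original poster
        uses a Java program, the function name is made up, and the
        host, login and file names in the usage comment are
        placeholders.  Whether the server needs further site-specific
        settings is not covered here.

```python
# Sketch: upload a file in FTP binary (image) mode so that neither the
# client nor the server applies a text conversion to it.
from ftplib import FTP
from pathlib import Path


def upload_as_binary(ftp: FTP, local_path: Path, remote_name: str) -> None:
    """Send the file byte-for-byte over an already logged-in FTP session."""
    with local_path.open("rb") as fh:
        # storbinary() issues TYPE I before the STOR command, so the
        # transfer is made in binary mode.
        ftp.storbinary(f"STOR {remote_name}", fh)


# Hypothetical usage (host, credentials and names are placeholders):
# with FTP("mainframe.example.com") as ftp:
#     ftp.login("user", "password")
#     upload_as_binary(ftp, Path("out.ebcdic.xml"), "OUT.XML")
```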

--
FSF associate member #7257

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--

