xsl-list

RE: Converting &, >, <, ", and other odd-ball characters...

2003-07-23 11:44:36
Hi,

   I found this thread through a web search. The person who started
it was having trouble converting unusual characters into character
entities in Java.

     http://www.biglist.com/lists/xsl-list/archives/200102/msg00941.html

I'm doing something similar: I'm reading a text file generated by
MS Word on a Macintosh, and I'd like to replace the non-ASCII
characters automatically using Java.

   My approach is an XML configuration file that lists which
characters to replace, with entries such as:

       <pair from="&#xd2;" to="&amp;lsquo;"/>

That is, when the program finds the byte value 0xD2 in the input
stream, it should replace it with &lsquo;. It did exactly that until I
upgraded to j2sdk1.4.1. Now, after parsing the configuration file, the
DOM parser reports the "from" value not as 0xD2 but as ? (0x3F).
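
A minimal sketch of how the parsed value could be inspected, assuming
the standard JAXP DOM API. This is not my actual program: the class
name PairConfigCheck and the helper attribute() are hypothetical, and
the XML is fed from a string rather than from the real configuration
file. It prints the code point in hex, so that a console or log that
cannot display the character does not silently turn it into "?".

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
class PairConfigCheck {

    // Parses a document whose root is a single <pair .../> element and
    // returns the named attribute as the DOM exposes it, that is, with
    // character and entity references already resolved by the parser.
    static String attribute(String xml, String name) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element pair = builder
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return pair.getAttribute(name);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<pair from=\"&#xd2;\" to=\"&amp;lsquo;\"/>";
        String from = attribute(xml, "from");
        String to = attribute(xml, "to");
        // Hex output is plain ASCII, so the output encoding cannot alter it.
        System.out.println("from = 0x" + Integer.toHexString(from.charAt(0)));
        System.out.println("to   = " + to);
    }
}
```

If the first line shows 0xd2 here but the real program still sees 0x3F,
the value is presumably being altered somewhere other than in the
attribute parsing itself, for example when the file is read or when the
result is printed.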

   A message by Mike Brown in the same thread, at
http://www.biglist.com/lists/xsl-list/archives/200102/msg00825.html,
mentions escaping the attribute values:

              You must always escape the attribute values. You can get
              around the need to escape character data content of an
              element by using CDATA sections, but I think you'll find
              that it's actually just as easy to escape
              everything. Entities aren't going to help you.

But is that not what I'm doing above? Or should I make it:

       <pair from="&amp;#xd2;" to="&amp;lsquo;"/>

or is there some way to tell Java not to interpret 0xD2 and simply
accept it as-is?
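
To illustrate what I mean by "just accept it": one possibility, and it
is only an assumption about the cause, is that the input bytes are
being decoded with the platform default encoding. The sketch below
decodes the stream explicitly as ISO-8859-1, in which every byte maps
to the code point of the same value, so the byte 0xD2 arrives as the
char 0x00D2. The class RawByteReader and the method readLatin1() are
hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class RawByteReader {

    // Reads the whole stream, decoding each byte as the character with
    // the same numeric value (ISO-8859-1), independent of the platform
    // default encoding.
    static String readLatin1(InputStream in) {
        StringBuilder text = new StringBuilder();
        try (Reader reader =
                 new InputStreamReader(in, StandardCharsets.ISO_8859_1)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the Word-generated file: the byte 0xD2 followed by "A".
        byte[] sample = {(byte) 0xD2, 0x41};
        String text = readLatin1(new ByteArrayInputStream(sample));
        System.out.println("first char = 0x"
            + Integer.toHexString(text.charAt(0)));
    }
}
```

With the input decoded this way, a "from" value of 0xD2 taken from the
configuration file can be compared directly against the characters
read from the stream.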

Thank you,
Elizabeth

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


