xsl-list

RE: [xsl] Tranformation failed with Saxon for "Illegal HTML character"

2006-07-28 14:41:00
The Euro symbol is not decimal 128 in Unicode. It is decimal 128 in some 
Microsoft character set whose name I have forgotten. The Unicode character 128 
is not a legal HTML character.
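
To make the distinction concrete, here is a minimal sketch in Java. It assumes the codepage in question is windows-1252 (the reply above does not name it) and that the JVM provides that charset, which typical JDKs do. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EuroByteDemo {
    // Decode a single byte under the given charset and return the resulting code point.
    static int decode(int b, Charset cs) {
        return new String(new byte[] {(byte) b}, cs).codePointAt(0);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 is the Microsoft codepage meant above.
        Charset cp1252 = Charset.forName("windows-1252");
        System.out.println(decode(0x80, cp1252));                      // 8364 (U+20AC, the Euro sign)
        System.out.println(decode(0x80, StandardCharsets.ISO_8859_1)); // 128 (U+0080, a control character)
    }
}
```

The same byte, 0x80, therefore yields either the Euro sign or a control character, depending only on which encoding the reader assumes.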

You need to make sure that the character encoding of the XML file is correctly 
declared: if you are using a particular Microsoft codepage, then you need to 
say so in the XML declaration.
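
As a rough illustration of why the declaration matters, the sketch below feeds the same bytes to the JDK's built-in XML parser twice, changing only the encoding named in the XML declaration. The encoding name windows-1252, the element name price, and the class and method names are assumptions chosen for the example; they do not come from the original files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclaredEncodingDemo {
    // Build a tiny document whose only content byte is 0x80,
    // with the given encoding named in the XML declaration.
    static byte[] xmlBytes(String encoding) {
        byte[] head = ("<?xml version=\"1.0\" encoding=\"" + encoding + "\"?><price>")
                .getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "</price>".getBytes(StandardCharsets.US_ASCII);
        byte[] all = new byte[head.length + 1 + tail.length];
        System.arraycopy(head, 0, all, 0, head.length);
        all[head.length] = (byte) 0x80;
        System.arraycopy(tail, 0, all, head.length + 1, tail.length);
        return all;
    }

    // Parse the bytes and return the code point the parser produced for byte 0x80.
    static int firstCodePoint(String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes(encoding)));
            return doc.getDocumentElement().getTextContent().codePointAt(0);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Same bytes, different declaration, different character in the parsed tree.
        System.out.println("windows-1252: " + firstCodePoint("windows-1252"));
        System.out.println("ISO-8859-1:   " + firstCodePoint("ISO-8859-1"));
    }
}
```

With the codepage declared, the parser should hand the stylesheet the Euro sign (U+20AC). With ISO-8859-1 declared, it should hand over character 128, which is the character the HTML serializer rejects.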

There was a significant controversy in W3C about the rule that invalid HTML 
characters must be treated as a fatal error by XSLT processors. I argued for 
leniency, but the view that prevailed was that the sooner you catch misencoded 
files (or files whose encoding is misdeclared), the better it is for the user 
in the long run. 
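
If you want to catch such characters before the serializer does, a pre-flight check along the following lines is one option. This is only a sketch: it assumes the characters rejected under SERE0014 are the control characters U+007F to U+009F, which is my reading of the serialization rules rather than something stated above, and the class and method names are hypothetical.

```java
public class HtmlCharCheck {
    // Return the index of the first character in the range U+007F..U+009F,
    // or -1 if the string contains none.
    // Assumption: this is the range the HTML output method rejects.
    static int firstIllegalHtmlChar(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x7F && c <= 0x9F) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalHtmlChar("Price: \u0080 10")); // 7
        System.out.println(firstIllegalHtmlChar("Price: \u20AC 10")); // -1
    }
}
```

A hit from a check like this usually points to a misdeclared input encoding, not to a character that should be passed through.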

Michael Kay
http://www.saxonica.com/ 


-----Original Message-----
From: Gian Luca Paloni [mailto:gianluca(_dot_)paloni(_at_)objectway(_dot_)it] 
Sent: 28 July 2006 17:12
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Cc: desantis(_at_)objectway(_dot_)it
Subject: [xsl] Tranformation failed with Saxon for "Illegal HTML character"


Hi all,

I use Saxon 8.7.3 as the engine for XSLT transformations.

Here’s some sample code:

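import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.PrintWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public static void exampleFromStream(String sourceID, String xslID)
          throws TransformerException, TransformerConfigurationException,
                 FileNotFoundException {

      // Create a transform factory instance.
      TransformerFactory tfactory  = TransformerFactory.newInstance();
      // (The original post has a stray fragment here, "new Boolean(false));",
      // apparently the tail of a call that was lost; it is left out.)

      InputStream        xslIS     =
          new BufferedInputStream(new FileInputStream(xslID));
      StreamSource       xslSource = new StreamSource(xslIS);

      // The following line would be necessary if the stylesheet contained
      // an xsl:include or xsl:import with a relative URL
      // xslSource.setSystemId(xslID);

      // Create a transformer for the stylesheet.
      Transformer  transformer = tfactory.newTransformer(xslSource);
      InputStream  xmlIS       =
          new BufferedInputStream(new FileInputStream(sourceID));
      StreamSource xmlSource   = new StreamSource(xmlIS);

      // The following line would be necessary if the source document contained
      // a call on the document() function using a relative URL
      // xmlSource.setSystemId(sourceID);

      // Transform the source XML and write the result to c://test.html.
      transformer.transform(xmlSource, new StreamResult(
          new PrintWriter(new FileOutputStream("c://test.html"))));
  }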

If I apply the transformation to an XML file which includes
the “€” (euro symbol, decimal 128), I get this error message:
ERROR AT ELEMENT CONSTRUCTOR <SPAN> ON LINE 69 OF :
  SERE0014: ILLEGAL HTML CHARACTER: DECIMAL 128 ; SYSTEMID: ; LINE#: 69; COLUMN#: -1
NET.SF.SAXON.TRANS.DYNAMICERROR: ILLEGAL HTML CHARACTER: DECIMAL 128
      AT NET.SF.SAXON.EVENT.HTMLEMITTER.WRITEESCAPE(HTMLEMITTER.JAVA:321) ….

Can anyone help me?
Is there a way to tell the transformer to leave those special
characters unchanged and not interpret them?
Thanks in advance to all,

Bye

Gian





--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


