xsl-list

RE: Converting delimited text WITH <br> to string

2004-06-21 22:06:05
Hi,

Friends,
I guess I missed the answer to this one. I have read a lot of FAQs,
but I have not found my particular answer.

All I want to do is to compare an XML file with a text file.

My desire is to convert the text file into a string then compare the
data in it to the XML nodes. However, the text file always gets
parsing errors.

Because you're trying to parse something that's not XML with an XML parser.
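
As a rough illustration (not from the original thread, and using a made-up sample string rather than the real export), this Python sketch shows how a conforming XML parser reacts to an unclosed <br> compared with the XHTML-style <br/>. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a root element."""
    try:
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


# Hypothetical sample fields, not taken from the real data file.
sloppy = "First line<br>Second line"   # HTML-style, never closed
xhtml = "First line<br/>Second line"   # XHTML-style empty element

print(is_well_formed(sloppy))
print(is_well_formed(xhtml))
```

The unclosed tag fails well-formedness checking, which happens regardless of any DTD declarations.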

The text file is exported from an OLD database, but the fields do
have <br> and other sloppy HTML in them.

Then you have to clean it first by removing the HTML tags, or by converting the 
"document" into XML (XMLized HTML or XHTML).
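
A minimal sketch of both options in Python, assuming the cleaning is done as a pre-processing step before the stylesheet runs. The function names strip_html and as_xhtml_br are hypothetical, and the regular expressions only cover simple tags; really sloppy HTML would probably need a proper HTML parser or a tool such as HTML Tidy.

```python
import html
import re

BR_RE = re.compile(r"<br\s*/?>", re.IGNORECASE)  # <br>, <BR>, <br/>, <BR />
TAG_RE = re.compile(r"<[^>]+>")                   # any other simple tag


def strip_html(text: str) -> str:
    """Option 1: drop the markup and return a plain string for comparison."""
    text = BR_RE.sub(" ", text)        # a line break becomes a space
    text = TAG_RE.sub("", text)        # remove remaining tags
    text = html.unescape(text)         # &amp; -> &, &nbsp; -> U+00A0, etc.
    return re.sub(r"\s+", " ", text).strip()  # normalize whitespace


def as_xhtml_br(text: str) -> str:
    """Option 2 (partial): rewrite every <br> variant as the empty element <br/>."""
    return BR_RE.sub("<br/>", text)


# Hypothetical sample field, not taken from the real export.
field = "Title<BR>Second &amp; <i>third</i>  line"
print(strip_html(field))
print(as_xhtml_br("a<br>b<BR />c"))
```

Option 2 alone does not make a file well-formed; other unclosed tags, bare ampersands and the missing root element would still have to be handled.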
 
I would edit them, but there are over 300 of them all in different
folders (lucky for me they are on the same server).

Here is the URL.

http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data

Exactly what do you want to compare, and what does the XML you want to compare 
with look like?

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xsl:stylesheet [ 
<!ENTITY lll SYSTEM
"http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data">
<!ENTITY nbsp "&#x20;">
<!ELEMENT br (EMPTY)>
<!ELEMENT BR (EMPTY)>

Declaring the elements will not help you with the parsing errors, because the 
file is not XML.

]>

<xsl:stylesheet 
      version="1.0" 
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
      xmlns:xs="http://www.w3.org/2001/XMLSchema" 
      xmlns:html="http://www.w3.org/1999/xhtml" 
      exclude-result-prefixes="html xs" 
  xmlns:saxon="http://icl.com/saxon"
  extension-element-prefixes="saxon">

<xsl:output 
      version="1.0" 
      method="html" 
      indent="yes" 
      encoding="utf-8" 
      omit-xml-declaration="no" 
      standalone="no" 
      media-type="text" 
      cdata-section-elements="br"
/>

<xsl:template match="/">
<X>
<xsl:copy-of
select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//*"/>
<xsl:apply-templates
select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()
| *"/>
<xsl:copy-of
select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()"/>

What should the above do? Or rather, what do you want the above to do?

<xsl:text>&lll;</xsl:text>
</X>
</xsl:template>

</xsl:stylesheet>

Cheers,

Jarno - Cubanate: Transit


  • RE: Converting delimited text WITH <br> to string, Jarno.Elovirta <=