Re: [xsl] document( URI ) with accented chars fails

2020-11-17 15:26:28
The document() function expects a URI, not a filename, and a valid URI cannot 
contain unescaped accented (or any other non-ASCII) characters.

XSLT 2.0+ has functions such as encode-for-uri() and iri-to-uri() that escape 
special characters using %HH escapes, so you can turn arbitrary filenames into 
valid URIs.
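
As a minimal sketch of that escaping, assuming an XSLT 2.0 processor (for 
example Saxon) rather than xsltproc, and assuming the paths in files-list.xml 
are relative and use "/" as separator: each path is split on "/", every 
segment is passed through encode-for-uri(), and the current filepath element 
is given as the second argument of document() so that relative URIs still 
resolve against files-list.xml. The stylesheet name is made up for 
illustration.

~~~{multifiles-2.xsl}
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
 <xsl:output encoding="UTF-8" indent="yes"/>
 <xsl:template match="/">
   <root>
     <xsl:for-each select="/fileslist/filepath">
       <!-- Escape each path segment separately so that "/" is kept as is;
            é becomes %C3%A9, a space becomes %20, and so on. -->
       <xsl:variable name="uri"
         select="string-join(for $seg in tokenize(., '/')
                             return encode-for-uri($seg), '/')"/>
       <!-- The second argument supplies the base URI (that of files-list.xml). -->
       <xsl:copy-of select="document($uri, .)/root/el"/>
     </xsl:for-each>
   </root>
 </xsl:template>
</xsl:stylesheet>
~~~

If the file names are known to contain nothing worse than accented letters, 
iri-to-uri(.) is a shorter alternative, since it leaves "/" and other 
characters that are already legal in a URI untouched.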

For xsltproc you'll need some processor-specific solution and I can't help you 
with that.

Michael Kay
Saxonica

On 17 Nov 2020, at 20:28, Alexandre Hoïde 
alexandre(_dot_)hoide(_at_)bluewin(_dot_)ch 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:


 Hello !

 I have a little problem: the URI passed to “document()”¹ is
ignored when the filename contains accented (UTF-8)
character(s).

 When I apply the XSLT to the source, using the two sample
files below, with the following command

~~~{Command line}
$ xsltproc multifiles.xsl files-list.xml
~~~

I expect the following result :

~~~{expected result}
<?xml version="1.0" encoding="UTF-8"?>
<root>
 <el>element 1</el>
 <el>element 2</el>
 <el>element 3</el>
</root>
~~~

but I only get the `el`s from the file with the ASCII-only filename.

~~~{output}
<?xml version="1.0" encoding="UTF-8"?>
<root>
 <el>element 1</el>
</root>
~~~


~~~{multifiles.xsl}
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 <xsl:output encoding="UTF-8" indent="yes"/>
 <xsl:template match="/">
   <root>
     <xsl:for-each select="document(/fileslist/filepath)/root/el">
       <xsl:copy-of select="." />
     </xsl:for-each>
   </root>
 </xsl:template>
</xsl:stylesheet>
~~~

~~~{files-list.xml}
<?xml version="1.0" encoding="UTF-8"?>
<fileslist>
 <filepath>filename-without-accented-char.xml</filepath>
 <filepath>filename-with-utf-8-accented-char-é.xml</filepath>
</fileslist>
~~~
(When I add files to `files-list.xml`, the ones
containing accented chars are consistently ignored.)

~~~{filename-without-accented-char.xml}
<?xml version="1.0" encoding="UTF-8"?>
<root>
 <el>element 1</el>
</root>
~~~

~~~{filename-with-utf-8-accented-char-é.xml}
<?xml version="1.0" encoding="UTF-8"?>
<root>
 <el>element 2</el>
 <el>element 3</el>
</root>
~~~

 Am I missing something, or is this a libxslt bug?

 Thanks for your time !

Alexandre Hoïde

XSLT Processor Version (under Guix GNU/Linux)
 XSL version: 1.0
 Vendor: libxslt
 version: 1.1.34
 (libxml2@2.9.10)
 Vendor URL: http://xmlsoft.org/XSLT/

1. https://www.w3.org/TR/xslt-10/#document
