xsl-list

Re: [xsl] document( URI ) with accented chars fails

2020-11-18 06:16:06
We use this in bash scripts:

urlencode()
{
    # Percent-encode $1 (Python 3; under Python 2 the call is urllib.quote)
    python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1] if len(sys.argv) > 1 else sys.stdin.read()[0:-1]))' "$1"
}
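
For readers who prefer a standalone script, the same percent-encoding can be sketched in Python 3 as follows. This is only an illustration under assumptions: the function name `urlencode_path` and the sample file name are made up for the example and do not come from the thread. It relies on `urllib.parse.quote`, which encodes non-ASCII characters as UTF-8 `%HH` escapes and leaves `/` untouched by default.

```python
from urllib.parse import quote


def urlencode_path(path: str) -> str:
    """Percent-encode a path; '/' stays as-is (quote's default safe set)."""
    return quote(path)


# Hypothetical sample name containing accented characters.
print(urlencode_path("données/café.xml"))
```

The resulting string contains only ASCII characters, so it should be acceptable wherever a relative URI reference is expected, for instance as the argument of `document()`.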

On Wed, Nov 18, 2020 at 12:32 PM Alexandre Hoïde alexandre(_dot_)hoide(_at_)bluewin(_dot_)ch <xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

On Tue, Nov 17, 2020 at 10:51:25PM -0000, Alexandre Hoïde alexandre(_dot_)hoide(_at_)bluewin(_dot_)ch wrote:
On Tue, Nov 17, 2020 at 09:26:23PM -0000, Michael Kay mike(_at_)saxonica(_dot_)com wrote:
The document() function expects a URI, not a filename, and URIs never 
contain accented characters.

XSLT 2.0+ has functions to escape special characters using %HH escapes so 
you can turn arbitrary filenames into valid URIs.

For xsltproc you'll need some processor-specific solution and I can't 
help you with that.

This is not directly XSLT-related, but just in case:

EXSLT defines a `str:encode-uri`¹ function but, unfortunately,
`xsltproc` from `libxslt` does not implement it.

So I have extended the bash script that builds fileslist.xml
with a small Perl script that uses the Perl module “URI”² and
is applied to each file path.

~~~{filename-to-uri.pl}
#!/usr/bin/env perl
use strict;
use warnings;
use URI::file;

# Turn the file path given as first argument into a percent-encoded URI.
my $uri = URI::file->new( $ARGV[0] );
print $uri . "\n";
~~~

It is applied to each file name with:
~~~{bash command line}
$ perl filename-to-uri.pl <the-filename-to-convert-to-uri>
~~~
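
Where Perl or the URI module is not available, a rough Python 3 counterpart could look like the sketch below. It is not the script used above: the function name `filename_to_uri`, the `base_dir` parameter, and the sample paths are assumptions made for illustration. It resolves the file name against a base directory and percent-encodes the result as UTF-8 to build an absolute `file:` URI.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def filename_to_uri(filename: str, base_dir: str) -> str:
    """Build an absolute file: URI from a POSIX file name.

    base_dir (hypothetical parameter) is only used when filename is relative.
    """
    path = PurePosixPath(base_dir) / filename
    if not path.is_absolute():
        raise ValueError("base_dir must be an absolute path")
    # quote() encodes non-ASCII characters as UTF-8 %HH escapes and keeps '/'.
    return "file://" + quote(str(path))


# Hypothetical sample values.
print(filename_to_uri("été/café.xml", "/home/user"))
```

An absolute file name is passed through unchanged apart from the escaping, because joining an absolute path onto `base_dir` discards the base.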

Best regards and thanks again,
Alexandre Hoïde

1. http://exslt.org/str/functions/encode-uri/index.html
2. https://metacpan.org/pod/URI
   (on GNU Guix the package is `perl-uri`)

--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--
