Re: [xsl] Using XSLT to process a directory of mixed files

2019-05-07 21:54:27
If you use the map-producing version of the Saxon collection extension, 
you'll get one map for each file, which gives you more metadata. See the Saxon 
documentation for details: 
http://saxonica.com/documentation/index.html#!sourcedocs/collections

tl;dr: add metadata=yes to the collection URI's query parameters to get the 
metadata map(s) for the selected resources. You can use e.g. map:keys() to 
inspect the map entries to see what metadata you have.
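
For what it's worth, here is a minimal, untested sketch of that approach, assuming Saxon 9.8 or later and an XSLT 3.0 stylesheet so that the map functions are available. It reuses the directory path from the original post. The variable name "resources" is my own, and the 'name' key is an assumption about what the metadata map contains, so check the map:keys() output against the Saxon documentation for your version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    version="3.0">

    <xsl:output method="text"/>

    <!-- metadata=yes: ask Saxon for one metadata map per resource
         instead of the resource itself -->
    <xsl:variable name="resources"
        select="collection('/Users/danvint/pubsrc-other/formatting-sample?select=*.*;recurse=yes;metadata=yes')"/>

    <xsl:template match="/">
        <xsl:for-each select="$resources">
            <!-- 'name' is an assumed key; map:get yields nothing if it is absent -->
            <xsl:value-of select="map:get(., 'name')"/>
            <xsl:text> | keys: </xsl:text>
            <!-- List the keys this Saxon version actually provides -->
            <xsl:value-of select="map:keys(.)" separator=", "/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

As I understand it, recent Saxon releases also return non-XML resources from collection() as atomic values rather than documents. That would explain the unquoted blobs in your output, since your match="/" template only fires for document nodes. Working from the metadata maps avoids loading those files at all. If the map turns out to include a content-type entry (again, something to verify from the key list), that would be one way to separate the DITA files from everything else.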

Cheers,

E.
--
Eliot Kimber
http://contrext.com
 

On 5/7/19, 9:39 PM, "dvint(_at_)dvint(_dot_)com" 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

    I'm trying to use collection() to process all files in a directory. The 
directory may contain text, PDF, and image files in addition to my DITA files. I've 
created this little test:
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        version="2.0">
        
        <xsl:variable name="fileSet"
            select="collection('/Users/danvint/pubsrc-other/formatting-sample?select=*.*;recurse=yes')"/>
        <xsl:template match="/">
            <xsl:apply-templates select="$fileSet" mode="collectionprocessing"/>
            
        </xsl:template>
        
        <xsl:template match="/" mode="collectionprocessing">
            '<xsl:value-of select="document-uri()"/>' <xsl:value-of select="doc-available(document-uri())"/>
        </xsl:template>
    </xsl:stylesheet>
    
    
    It seems to do what I expect for XML files with results like this
    
           
            'file:/Users/danvint/pubsrc-other/formatting-sample/glossentry-adapter.dita' true
            'file:/Users/danvint/pubsrc-other/formatting-sample/conaction/reuse-push-ds-config-tool.dita' true
            'file:/Users/danvint/pubsrc-other/formatting-sample/conaction/reuse-update-server.dita' true
            'file:/Users/danvint/pubsrc-other/formatting-sample/submap-ping_id_examples.ditamap' true
            'file:/Users/danvint/pubsrc-other/formatting-sample/concept_PDabouttheexplodedindexformat.dita' true
    
    
    
    But then I get some odd things. Judging by the output, it looks like I hit 
a binary file of some sort, even though I was just trying to get the file names 
in this script:
    
            
'file:/Users/danvint/pubsrc-other/formatting-sample/concept_PAWeb_Access_Management_Agent_Deployment.dita'
 
trueAAAAAUJ1ZDEAABAAAAAIAAAAEAAAAAIJAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAABAAAQAHNwYmxvYgAAAPZicAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    .... lots of lines here similar to above
    
mvQrxFWXHxD6hgAEIAABCGwnAXuvGsvOvVhNBYKutU2nnqv2YZ2rz04qQ7Rm8AoBCEAAAhCAAATemQBq5p0R0gAEIAABCEAAAhA4J4C4OmfBEQQgAAEIQAACEHhnAoird0ZIAxCAAAQgAAEIQOCcAOLqnAVHEIAABCAAAQhA4J0JIK7eGSENQAACEIAABCAAgXMCiKtzFhxBAAIQgAAEIACBdyaAuHpnhDQAAQhAAAIQgAAEzgkgrs5ZcAQBCEAAAhCAAATemcD/B/Gl121mZIjuAAAAAElFTkSuQmCC
            
            'file:/Users/danvint/pubsrc-other/formatting-sample/gloss_PFadminGlossary.dita' true
    
    
    
    I don't know what this chunk of content is. Then there is this odd bit:
    
            
'file:/Users/danvint/pubsrc-other/formatting-sample/submap_2-notoc.ditamap' true
            
'file:/Users/danvint/pubsrc-other/formatting-sample/glossentry-openid.dita' 
truesub addTaxonomy {   my $inname = $_[0];     my $tempname = $_[0] .  ".new"; 
my $taxonomy = $_[1];           open my $in,  '&lt;:encoding(UTF-8)',  $inname  
    or die "Can't read old file: $inname!";     open my $temp, 
'&gt;:encoding(UTF-8)', $tempname or die "Can't write new file: $tempname!";    
         while( &lt;$in&gt; )        {               
s/(&lt;head&gt;)/&lt;head\&gt;\n$taxonomy\n/g;                              
print $temp $_;     }            close $temp;            close $in;             
# Replace inout file with temp, remove temp     rename "./" . $tempname, "./" . 
$inname or die "Can't move file $tempname to $inname";  }
            
            'file:/Users/danvint/pubsrc-other/formatting-sample/submap-knownissues.ditamap' true
            'file:/Users/danvint/pubsrc-other/formatting-sample/concept_PAPort_Requirements.dita' true
    
    
    
    These blobs of odd stuff don't follow the pattern of quotes around the file 
name, followed by the test that I thought would tell me whether it was an XML 
file or not. There is no true/false provided for them either.
    
    What I want to build is a list of files (as a shell script) that would copy 
these other files into my processed folder, where I will be writing the results 
of other work against the DITA files.
    
    
    
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--
