I realize this is the XSL list, and don't get me wrong, I *love*
XSLT. And while I'm singing XSLT's (and thus XPath's) praises, this
particular task looks like a fun one to attack with Hans-Jürgen's
FOXpath (which is an extension of XPath to handle the file
system).[1]
But that said, this strikes me as a task better handled by your shell
than your XSLT engine, no? In bash, e.g.,
$ fgrep -f filenames_from_directory_listing.txt dir1/*.xml dir2/*.xml
gives you the answer, as it were, but not in the format you want.
I think to get the results you want (the phrase "[filename] was found
in [filepath]") you have to issue the fgrep command once for each
search term, instead of all-at-once. E.g., I think the following will
do the trick.
$ for fn in `cat filenames_from_directory_listing.txt` ; do fgrep -l -e $fn \
    dir1/*.xml dir2/*.xml | perl -pe "s,^.*\$,$fn was found in \$&,;" ; done
These methods presume that none of the names in
filenames_from_directory_listing.txt contain any whitespace.
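If some of the names might contain whitespace, a line-by-line read is one
way around that presumption. What follows is only a sketch: the function
name report_matches is made up for illustration, it assumes one name per
line in the listing file, and it uses `grep -F` (the modern spelling of
`fgrep`) with `printf` in place of the perl step.

```shell
#!/usr/bin/env bash
# report_matches LISTFILE XMLFILE...
# (hypothetical helper, not part of any tool mentioned in this thread)
# Prints "<name> was found in <path>" for every listed name that occurs
# as plain text in one of the given files.
report_matches() {
  local list=$1; shift
  local fn path
  # IFS= and -r keep embedded spaces and backslashes in each name intact;
  # the "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r fn || [ -n "$fn" ]; do
    [ -n "$fn" ] || continue   # skip blank lines
    # -F: treat the name as a fixed string; -l: list matching files only.
    # stdin is redirected so grep can never swallow the name list.
    grep -F -l -e "$fn" -- "$@" < /dev/null 2> /dev/null |
      while IFS= read -r path; do
        printf '%s was found in %s\n' "$fn" "$path"
      done
  done < "$list"
}
```

It would be called along the lines of
`report_matches filenames_from_directory_listing.txt dir1/*.xml dir2/*.xml`.
A listing saved with Windows line endings would need its carriage returns
stripped first; the sketch does not do that.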
And, of course, one thing that makes this nice is that just by using
`egrep` instead of `fgrep` you can search for regular expressions,
e.g., "meeting_schema\.(rn[cg]|xsd?|wxs|odd|dtd|(iso)?sch)". :-)
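To get a feel for what that expression accepts before running it over real
files, something like the following might do. It is a sketch only: the
variable name pattern and the function name matches_schema_name are
invented here, and `grep -E` is used as the modern spelling of `egrep`.

```shell
#!/usr/bin/env bash
# The regular expression from the message above, kept in single quotes so
# the shell leaves the backslash and parentheses alone.
pattern='meeting_schema\.(rn[cg]|xsd?|wxs|odd|dtd|(iso)?sch)'

# matches_schema_name STRING  (hypothetical helper)
# Succeeds (exit status 0) if STRING contains a match for $pattern.
matches_schema_name() {
  # -E: extended regular expressions; -q: no output, exit status only
  printf '%s\n' "$1" | grep -E -q -e "$pattern"
}
```

For example, `matches_schema_name 'see meeting_schema.isosch'` succeeds,
while `matches_schema_name 'meeting_schema.txt'` does not. The expression
is unanchored, so it also matches inside longer names such as
meeting_schema.xslt. Against the real files the same pattern would be used
as `grep -E -l -e "$pattern" dir1/*.xml dir2/*.xml`.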
Notes
-----
[1] See
https://www.balisage.net/Proceedings/vol17/html/Rennau01/BalisageVol17-Rennau01.html
Hi this is my first post here - looking for help - apologies if
there's something I've overlooked!
I have a tokenized variable that contains a list of filenames from a
.txt of a directory listing. I want to look for those filenames in
a number of xml files in a number of subdirectories. If the
filename is found, I want to output that "filename" was found in
"xmlfile".
There are a lot of xml directories and they are not static. Same
with xml files. The filenames are not tagged in the xml, so I'm
just looking for their plain text occurrence in the file.
Any help would be appreciated.
to make the examples easier - I want to use
$filenames_to_find (tokenized list of filenames from a .txt
directory listing)
to search against
dir1/*.xml
dir2/*.xml
with the output being
filename was found in xmlfilename
I'm using an academic version of Oxygen XML so I think I have Saxon
through that and I have the standalone Saxon file for running this
from the command line.
I've gotten this far, but it doesn't work. I know it's broken, but
I don't know how to fix it!
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:h="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="xs"
version="3.0"
expand-text="yes"
>
<xsl:variable name="filenames_from_directory_listing"
as="xs:string"
select="unparsed-text('filenames_from_directory_listing.txt')"/>
<xsl:variable name="filenames_to_find"
select="tokenize($filenames_from_directory_listing, '\s+')"/>
<xsl:template match="/">
<xsl:for-each select="collection('.?select=*.xml;recurse=yes')"/>
<xsl:variable name="xml_filenames" select="."/>
<xsl:for-each select="$filenames_to_find">
<xsl:if test="(contains($t, .))">
<xsl:message>{document-uri($xml_filenames)} contains {.}</xsl:message>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Any suggestions? Clearly I am an XSL novice. Thanks for your patience.
--
Syd Bauman, NRP
Senior XML Programmer/Analyst
Northeastern University Women Writers Project
s(_dot_)bauman(_at_)northeastern(_dot_)edu or
Syd_Bauman(_at_)alumni(_dot_)Brown(_dot_)edu