Re: [xsl] generating Office Open XML parts using xslt

2014-07-28 11:01:16

On 28.7.2014 15:14, Wendell Piez wapiez(_at_)wendellpiez(_dot_)com wrote:
> This is fantastic ... and brings up the related question -- how
> about going the other way, reading data out of XLSX format?

Well, this is actually pretty easy if you use a Java-based XSLT engine.
In Java you can prefix a URI with jar: and get direct access to files
stored inside a ZIP file; the path inside the archive follows after !/
(and OOXML files are just ZIP files with additional metadata). Several
lookups are necessary to find the proper parts in the OPC package, but
it's perfectly doable. I don't have an XLSX example at hand, but please
find below an example that reads some statistical data from a DOCX file.
With XLSX you can do a similar thing. The code should be understandable
even without its comments.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"
                xmlns:r="http://schemas.openxmlformats.org/package/2006/relationships"
                xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
                xmlns:dc="http://purl.org/dc/elements/1.1/"
                exclude-result-prefixes="r w dc">

  <!-- Parameter for passing the address of the document we want to process -->
  <xsl:param name="url">file:../wordprocessingml/zahlavi.docx</xsl:param>

  <!-- Variables representing the individual parts inside the OOXML file -->
  <!-- The jar: scheme allows transparent access to ZIP/JAR archives -->
  <xsl:variable name="rels"
                select="doc(concat('jar:', $url, '!/_rels/.rels'))"/>

  <!-- URI of the main part in the package -->
  <xsl:variable name="mainPartUri"
                select="$rels/r:Relationships/r:Relationship[@Type =
'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument']/@Target"/>

  <!-- Document of the main part -->
  <xsl:variable name="doc"
                select="doc(concat('jar:', $url, '!/', $mainPartUri))"/>

  <!-- URI of the metadata part -->
  <xsl:variable name="metaPartUri"
                select="$rels/r:Relationships/r:Relationship[@Type =
'http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties']/@Target"/>

  <!-- Document with the metadata -->
  <xsl:variable name="meta"
                select="doc(concat('jar:', $url, '!/', $metaPartUri))"/>


  <!-- Template where the transformation starts -->
  <xsl:template name="stat">
    <html>
      <head>
        <title>Document statistics:
          <xsl:value-of select="$meta/*/dc:title"/>
        </title>
      </head>
      <body>
        Document title: <xsl:value-of select="$meta/*/dc:title"/><br/>
        Document author: <xsl:value-of select="$meta/*/dc:creator"/><br/>
        Number of paragraphs: <xsl:value-of select="count($doc//w:p)"/><br/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
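
For XLSX the same lookups should carry over almost unchanged. What follows is only a minimal sketch, not something tested against real workbooks: it assumes a Java-based XSLT 2.0 processor that resolves jar: URIs, and the default file name example.xlsx, the template name sheets, and the prefix s are made up for illustration. The SpreadsheetML namespace and the workbook/sheets/sheet structure come from the OOXML specification.

```xslt
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"
                xmlns:r="http://schemas.openxmlformats.org/package/2006/relationships"
                xmlns:s="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
                exclude-result-prefixes="r s">

  <!-- Hypothetical default; pass the URL of a real workbook as a parameter -->
  <xsl:param name="url">file:../spreadsheetml/example.xlsx</xsl:param>

  <!-- Package-level relationships, read directly from the ZIP via jar: -->
  <xsl:variable name="rels"
                select="doc(concat('jar:', $url, '!/_rels/.rels'))"/>

  <!-- URI of the main part; for a workbook this is usually xl/workbook.xml -->
  <xsl:variable name="mainPartUri"
                select="$rels/r:Relationships/r:Relationship[@Type =
'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument']/@Target"/>

  <!-- Document of the main part (the workbook) -->
  <xsl:variable name="workbook"
                select="doc(concat('jar:', $url, '!/', $mainPartUri))"/>

  <!-- Initial template: lists the names of all sheets in the workbook -->
  <xsl:template name="sheets">
    <sheets count="{count($workbook/s:workbook/s:sheets/s:sheet)}">
      <xsl:for-each select="$workbook/s:workbook/s:sheets/s:sheet">
        <sheet name="{@name}"/>
      </xsl:for-each>
    </sheets>
  </xsl:template>

</xsl:stylesheet>
```

Like the DOCX stylesheet, this one has no source document and starts from a named template; with Saxon, for instance, the initial template can be selected with the -it option. Reading actual cell values would need further lookups (worksheet parts and the shared strings part), which this sketch leaves out.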


Of course, it would be nice to have a set of XPath functions providing
an easier API for accessing all the data in documents.
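
One possible shape for such a function, as a rough sketch only: the opc prefix, its namespace URI http://example.org/ns/opc, and the function name opc:part are all hypothetical and not part of any existing library. It simply wraps the two lookups used in the stylesheet (read _rels/.rels, then load the target part) and again assumes a processor that resolves jar: URIs.

```xslt
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:r="http://schemas.openxmlformats.org/package/2006/relationships"
                xmlns:opc="http://example.org/ns/opc"
                exclude-result-prefixes="xs r opc">

  <!-- Hypothetical helper: returns the part that the package-level
       relationships file links to with the given relationship type,
       or an empty sequence if there is no such relationship -->
  <xsl:function name="opc:part" as="document-node()?">
    <xsl:param name="package" as="xs:string"/>
    <xsl:param name="type" as="xs:string"/>
    <xsl:variable name="rels"
                  select="doc(concat('jar:', $package, '!/_rels/.rels'))"/>
    <!-- Take the first matching relationship only -->
    <xsl:variable name="target"
                  select="$rels/r:Relationships/r:Relationship[@Type = $type][1]/@Target"/>
    <xsl:sequence select="if ($target)
                          then doc(concat('jar:', $package, '!/', $target))
                          else ()"/>
  </xsl:function>

</xsl:stylesheet>
```

With this in place, the main part could be fetched with a call such as opc:part($url, 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument'). Parts referenced from part-level relationship files (for example word/_rels/document.xml.rels) are not covered by this sketch.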

                                Jirka

-- 
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirka(_at_)kosek(_dot_)cz      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
------------------------------------------------------------------
    Bringing you XML Prague conference    http://xmlprague.cz
------------------------------------------------------------------