xsl-list

Re: XSL for WordML -> specific HTML

2005-05-06 08:08:52
Hi, Stephen,

Those old stylesheets were just experiments on my part, so I don't still 
have them. Sorry about that. I never got all that far anyway, since I 
didn't have a client's need pushing me. I got to the point of turning a 
simple document into an HTML or PDF file but not to the point of handling 
tables. Send along samples of the other things you need to handle (tables, 
for example), and I'll be happy to help you figure out how to handle them.
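
In the meantime, here is a rough, untested starting point for tables. It assumes the usual WordML nesting of w:tbl, w:tr, and w:tc with paragraphs inside each cell, and it ignores merged cells and all table formatting. The class name is only a placeholder, and the namespace URI is the one used elsewhere in this thread:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml"
  exclude-result-prefixes="w">

  <!-- Assumption: a table is a w:tbl element containing w:tr rows -->
  <xsl:template match="w:tbl">
    <table class="wordtable">
      <xsl:apply-templates select="w:tr"/>
    </table>
  </xsl:template>

  <!-- One HTML row per w:tr; only the cells are processed -->
  <xsl:template match="w:tr">
    <tr><xsl:apply-templates select="w:tc"/></tr>
  </xsl:template>

  <!-- Selecting w:p skips the cell properties. With no w:p template
       in this sketch, the built-in rules emit the paragraph text. -->
  <xsl:template match="w:tc">
    <td><xsl:apply-templates select="w:p"/></td>
  </xsl:template>

</xsl:stylesheet>
```

Dropped into a larger stylesheet, the w:p templates there would take over inside each cell, so the cell text would be wrapped in the same divs as body text.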

Also, I didn't read your example closely enough, so I missed that w:pPr 
and w:r are siblings. That's why the templates I wrote picked up nodes 
they shouldn't. Sorry some more.

Here's a corrected stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml"
  xmlns:wx="http://schemas.microsoft.com/office/word/2003/2/auxHint"
  exclude-result-prefixes="w wx">

  <xsl:template match="wx:sect">
    <html>
      <head>
        <title>WML to HTML Test</title>
        <link rel="StyleSheet" type="text/css" href="yourstyles.css" />
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="wx:sub-section">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="w:p[w:pPr/w:pStyle/@w:val='Heading1']">
    <div class="heading1"><xsl:apply-templates select="w:r/w:t"/></div>
  </xsl:template>

  <xsl:template match="w:p[w:pPr/w:pStyle/@w:val='Heading2']">
    <div class="heading2"><xsl:apply-templates select="w:r/w:t"/></div>
  </xsl:template>

  <xsl:template match="w:p[not(w:pPr)]">
    <div class="paragraph"><xsl:apply-templates select="w:r/w:t"/></div>
  </xsl:template>

</xsl:stylesheet>

When run against the following sample (your sample fixed up and extended 
to have a root element):

<wx:sect xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml"
         xmlns:wx="http://schemas.microsoft.com/office/word/2003/2/auxHint">
  <wx:sub-section>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="Heading1"/>
      </w:pPr>
      <w:r>
        <w:t>Hello 1</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Some random normal text 1</w:t>
      </w:r>
    </w:p>
  </wx:sub-section>
  <wx:sub-section>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="Heading2"/>
      </w:pPr>
      <w:r>
        <w:t>Hello 2</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Some random normal text 2</w:t>
      </w:r>
    </w:p>
  </wx:sub-section>
</wx:sect>

It produced (edited for spacing):

<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>WML to HTML Test</title>
    <link rel="StyleSheet" type="text/css" href="yourstyles.css">
  </head>
  <body>
    <div class="heading1">Hello 1</div>
    <div class="paragraph">Some random normal text 1</div>
    <div class="heading2">Hello 2</div>
    <div class="paragraph">Some random normal text 2</div>
  </body>
</html>

Tested with Saxon 8.4.
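
One caveat, offered as an untested sketch rather than part of the run above: the last template matches only paragraphs with no w:pPr at all, so a body paragraph that carries a w:pPr for some other reason (alignment, say) would fall through to the built-in rules and lose its div. Assuming that can happen in your documents, a plain w:p fallback should cover it. In XSLT 1.0 a pattern with a predicate gets a default priority of 0.5 and a bare element name gets 0, so the heading templates still win for headings:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml"
  exclude-result-prefixes="w">

  <!-- Predicate pattern: default priority 0.5, so it beats the fallback -->
  <xsl:template match="w:p[w:pPr/w:pStyle/@w:val='Heading1']">
    <div class="heading1"><xsl:apply-templates select="w:r/w:t"/></div>
  </xsl:template>

  <xsl:template match="w:p[w:pPr/w:pStyle/@w:val='Heading2']">
    <div class="heading2"><xsl:apply-templates select="w:r/w:t"/></div>
  </xsl:template>

  <!-- Fallback: default priority 0, used for every other paragraph,
       whether or not it has a w:pPr child -->
  <xsl:template match="w:p">
    <div class="paragraph"><xsl:apply-templates select="w:r/w:t"/></div>
  </xsl:template>

</xsl:stylesheet>
```

The same predicate idea covers the "match unless" question from earlier in the thread: w:p[not(w:pPr/w:pStyle)]/w:r/w:t matches the text only in paragraphs that have no style element.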

HTH

Jay Bryant
Bryant Communication Services
(presently consulting at Synergistic Solution Technologies)




Stephen <azrael(_at_)azrael-uk(_dot_)f2s(_dot_)com> 
05/06/2005 09:29 AM
Please respond to
xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com


To
xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
cc

Subject
Re: [xsl] XSL for WordML -> specific HTML






Thanks for getting back to me Jay... it's so daunting to start doing 
something so totally new - and feels easier to know others have been 
there and might be able to poke you along in the right direction.

The XSL I have written so far is very basic.. and isn't working fully on 
my word document.. and so it is hard to spot what exactly the problem 
is. Sometimes I make an intuitive change - only to get a freaky result 
that makes me doubt how intuitive the change was ;)

Basically what I am looking to do is write an XSL that outputs just the 
content as HTML, inside divs whose class is the appropriate style. 
I'm not looking to pick up font weights/size and lots of other style 
data word stores. Also I want to grab word tables and output them as 
html tables.. in my head it all seems simple.. but the format of WordML 
makes it all very very convoluted.

Do you have any 'simple' XSL documents that do this sort of thing that I 
could study?

JBryant(_at_)s-s-t(_dot_)com wrote:
Hi, Stephen,

I've done a little WordML to XHTML and WordML to FO, so maybe I can help 
a bit.

Your heading templates look fine to me. They'll get you the values of 
the headings without anything you don't want. I suppose you think they're 
messy because you don't want to apply multiple templates to the same 
element. However, your current solution works just as well as matching 
w:pPr and applying logic within the template to figure out which heading 
level it is. Personally, I find having one template per type of result 
node to be just as readable as one template per type of source node.

To get the content of the paragraphs, add this to your heading 
templates:

<xsl:template match="w:p/w:r/w:t">
  <p><xsl:apply-templates/></p>
</xsl:template>

A little uncanny - but I used something like that, and it duplicates 
the heading level text for me: the text is picked out by the specific 
template match and then picked out again by the more generic one. 
Unless there's an XPath for 'match a w:p/w:r/w:t unless the w:p bit 
contains a w:pPr/w:pStyle node'?

I can attach a small sample of my xml doc - and xsl.. if that'll help 
explain what I have got, and what I am getting? (Or I can dump them on a 
website somewhere)

I used apply-templates there because you may have format elements (bold, 
etc.) within the paragraph. I used the three level match to make sure 
you get just body content and not the heading content, too (as matching 
just w:t or w:r/w:t would do).

HTH

Jay Bryant
Bryant Communication Services
(presently consulting at Synergistic Solution Technologies)




Stephen <azrael(_at_)azrael-uk(_dot_)f2s(_dot_)com> 
05/05/2005 09:34 AM
Please respond to
xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com


To
xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
cc

Subject
[xsl] XSL for WordML -> specific HTML






I have a WordML based xml document which contains content along the 
lines of:

<wx:sub-section>
                 <w:p>
                    <w:pPr>
                       <w:pStyle w:val="Heading1"/>
                    </w:pPr>
                    <w:r>
                       <w:t>Hello 1</w:t>
                    </w:r>
                 </w:p>
                 <w:p>
                    <w:r>
                       <w:t>Some random normal text 1</w:t>
                    </w:r>
                 </w:p>
                 <wx:sub-section>
                     <w:p>
                        <w:pPr>
                           <w:pStyle w:val="Heading2"/>
                        </w:pPr>
                        <w:r>
                           <w:t>Hello 2</w:t>
                        </w:r>
                    </w:p>
                    <w:p>
                       <w:r>
                          <w:t>Some random normal text 2</w:t>
                       </w:r>
                    </w:p>

and I want to output that as:

<div style="level1">Hello 1</div>
<p>Some random normal text 1</p>

<div style="level2">Hello 2</div>
<p>Some random normal text 2</p>

Obviously I may have headings all over the document, so I want something 
generic that will pick them all out nicely.

Currently I have a rather messy:

<xsl:template match="w:pStyle[@w:val='Heading1']">
  <div class="section_L1">
    <xsl:value-of select="../../w:r/w:t/text()"/>
  </div>
</xsl:template>

<xsl:template match="w:pStyle[@w:val='Heading2']">
  <div class="section_L2">
    <xsl:value-of select="../../w:r/w:t/text()"/>
  </div>
</xsl:template>

that outputs:

<div class="section_L1"></div><div class="section_L2"></div><div 
class="section_L1">Hello 1</div>Some random normal text 1<div 
class="section_L2">Hello 2</div>Some random normal text 2

Anyone have any useful ideas?




-- 
    Azrael

            ("\''/").___..--'''"-._
            `0_ O  )   `-.  (     ).`-.__.`)
            (_Y_.)'  ._   )  `._ `. ``-..-'
          _..`--'_..-_/  /--'_.' .'
         ((i).-''  ((i).'  (((.-'

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



