xsl-list

Re: [xsl] Flat to Structured: Handling List Items with Subordinate Paragraphs

2009-05-26 17:17:51

On May 26, 2009, at 5:11 PM, Robert Koberg wrote:

Hi,

(I've gone through this many times.) Usually, I find the easiest thing to do is to open the document in Open Office and export it as XHTML. That way you get the structure you want, and you can then whittle away the rest of the junk until it conforms to some schema, maybe splitting out content pieces based on the H1s (we sometimes get whole websites written in Word).

I might not have been clear - I meant that we use XSL to remove the unnecessary markup. (Don't want to get yelled at :) )

Start out with the identity template and any obvious matches. Remove Ps that contain only whitespace. Remove pretty much all attributes. Remove many unnecessary SPANs. It doesn't take long: edit the XSL, run the transform, check validity, rinse, wash, repeat.
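
For anyone who wants a starting point, here is a minimal sketch of that kind of cleanup stylesheet. It is an illustration, not the exact one we use. It assumes XSLT 1.0, input in the XHTML namespace (the `h` prefix is my choice), and a keep-list of `href`, `src` and `alt` attributes. It also unwraps every SPAN, whereas in practice you would probably keep the few that carry meaning.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml">

  <!-- Identity template: copy every node and attribute unless a more
       specific template below matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop paragraphs that have no child elements and whose text is
       empty or whitespace only. -->
  <xsl:template match="h:p[not(*) and normalize-space(.) = '']"/>

  <!-- Drop every attribute except those on the keep-list
       (the keep-list is an assumption; adjust it to your schema). -->
  <xsl:template match="@*[not(name() = 'href' or name() = 'src' or name() = 'alt')]"/>

  <!-- Unwrap SPANs: discard the element but keep processing its content. -->
  <xsl:template match="h:span">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

</xsl:stylesheet>
```

The specific templates win over the identity template through the default priority rules, so each new cleanup rule can be added as one more template without touching the rest.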

best,
-Rob



Another thing we do is just paste the Word content into our web-based editor - Xopus - and it does the work of converting it to the current XML Schema. It does a really good job, but there is usually some cleanup left, which is done by the author.

best,
-Rob

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--