Re: [xsl] CSS to XML

2013-03-30 14:59:12
Hi Dorothy,

You might want to try our css expander.

It’s a three-step process:

– Transform the XHTML and its CSS, be it linked, included as style element or style attribute, into an XML representation of the CSS.

– Transform this representation into an XSLT stylesheet, where XSLT matching patterns correspond to CSS selectors and XSLT priority attributes correspond to CSS precedence rules.

– Apply this stylesheet to the original XHTML document.
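To make the first two steps a bit more concrete, here is a minimal Python sketch of the idea. It is not the actual css-parser.xsl / css2xsl.xsl code (which is XSLT). It parses simple rules with a regular expression and maps a selector such as img.frame-3 to an XSLT match pattern plus a priority derived from CSS specificity. The function names, the restriction to element/class/id selectors, and the way specificity is flattened into a single number are all assumptions made for illustration.

```python
import re

# "selector { declarations }" blocks; comments, at-rules and nesting are not handled.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")
# One simple selector: optional element name (or *), then .class / #id parts.
SIMPLE_RE = re.compile(r"^([A-Za-z][\w-]*|\*)?((?:[.#][\w-]+)*)$")


def parse_css(css_text):
    """Return a list of (selector, {property: value}) pairs."""
    rules = []
    for selector, body in RULE_RE.findall(css_text):
        decls = {}
        for decl in body.split(";"):
            if ":" in decl:
                name, value = decl.split(":", 1)
                decls[name.strip()] = value.strip()
        rules.append((selector.strip(), decls))
    return rules


def to_match_pattern(selector):
    """Map a simple selector to (XSLT match pattern, CSS specificity triple)."""
    m = SIMPLE_RE.match(selector)
    if not selector or not m:
        raise ValueError(f"unsupported selector: {selector!r}")
    element = m.group(1) or "*"
    ids = re.findall(r"#([\w-]+)", m.group(2))
    classes = re.findall(r"\.([\w-]+)", m.group(2))
    predicates = [f"[@id = '{i}']" for i in ids]
    # Token-wise class test, so that 'frame-3' does not match 'frame-30'.
    predicates += [
        f"[contains(concat(' ', normalize-space(@class), ' '), ' {c} ')]"
        for c in classes
    ]
    name_test = "*" if element == "*" else f"*:{element}"
    specificity = (len(ids), len(classes), 0 if element == "*" else 1)
    return name_test + "".join(predicates), specificity


def priority(specificity):
    """Flatten (ids, classes, elements) into one number for a priority attribute.

    Assumes fewer than ten of each component per selector.
    """
    ids, classes, elements = specificity
    return ids * 100 + classes * 10 + elements


if __name__ == "__main__":
    css = "img.frame-3 {\n     height:448px;\n     width:339px;\n}"
    for selector, decls in parse_css(css):
        pattern, spec = to_match_pattern(selector)
        print(pattern, priority(spec), decls)
```

A generated template would then use the pattern in its match attribute and the number in its priority attribute, so that a more specific CSS rule wins over a less specific one, as in the cascade.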

You can use it as an XProc step. If you check out this sample project with svn, https://subversion.le-tex.de/common/sandbox/css_expand_standalone/trunk/, you can invoke it as described here (replacing calabash.sh with calabash.bat if you’re on Windows): https://subversion.le-tex.de/common/sandbox/css_expand_standalone/trunk/README.txt

Or just check out this directory: https://subversion.le-tex.de/common/css-expand/xslt/ and use it like this (assuming that you have a front-end script for Saxon, called saxon):

saxon -xsl:css-parser.xsl -s:/path/to/file.xhtml -o:css.xml
saxon -xsl:css2xsl.xsl -s:css.xml -o:expand.xsl path-constraint='[self::*:img]'
saxon -xsl:expand.xsl -s:/path/to/file.xhtml -o:expanded.xhtml

You need the path-constraint parameter on the second step only if you want to restrict expansion to the img element.

On the same step, you may specify another parameter, prop-constraint. Example: prop-constraint="width max-width".
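If you would rather drive the three Saxon calls from a script, something like the following Python sketch could work. It assumes the same saxon front-end script on your PATH and the stylesheets in the current directory. build_commands and run_all are made-up helper names, and passing both parameters in one call is simply the combination of the two options described above.

```python
import shlex
import subprocess


def build_commands(xhtml, path_constraint=None, prop_constraint=None):
    """Return the three Saxon invocations as argument lists."""
    step2 = ["saxon", "-xsl:css2xsl.xsl", "-s:css.xml", "-o:expand.xsl"]
    # Stylesheet parameters are passed as name=value after the options.
    if path_constraint:
        step2.append(f"path-constraint={path_constraint}")
    if prop_constraint:
        step2.append(f"prop-constraint={prop_constraint}")
    return [
        ["saxon", "-xsl:css-parser.xsl", f"-s:{xhtml}", "-o:css.xml"],
        step2,
        ["saxon", "-xsl:expand.xsl", f"-s:{xhtml}", "-o:expanded.xhtml"],
    ]


def run_all(commands):
    """Run the steps in order and stop at the first failure."""
    for cmd in commands:
        print(shlex.join(cmd))  # show the equivalent shell command line
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands("/path/to/file.xhtml",
                           path_constraint="[self::*:img]",
                           prop_constraint="width max-width"))
```

Because the arguments are passed as a list, no shell is involved, so the brackets and the space in the parameter values need no extra quoting.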

You need to further transform the css:* attributes in the expanded output to match your needs.
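As an illustration of that last step, here is a small Python sketch that turns pixel-valued css:height / css:width attributes into plain height / width attributes. The namespace URI, the sample input, and the helper name css_attrs_to_plain are assumptions; please check what the expanded output really looks like before relying on this. In an XSLT pipeline you would of course do the same thing with a couple of templates.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace of the css:* attributes; verify against your expanded output.
CSS_NS = "http://www.w3.org/1996/css"
PX_RE = re.compile(r"^(\d+)px$")


def css_attrs_to_plain(elem, props=("height", "width")):
    """Copy selected css:* pixel values to plain attributes; drop all css:* attributes."""
    prefix = "{" + CSS_NS + "}"
    for name in list(elem.attrib):
        if not name.startswith(prefix):
            continue
        value = elem.attrib.pop(name)
        local = name[len(prefix):]
        m = PX_RE.match(value)
        if local in props and m:
            elem.set(local, m.group(1))  # '448px' becomes '448'
    return elem


# Hypothetical expanded element, modelled on the img from the question.
expanded = (
    '<img xmlns:css="http://www.w3.org/1996/css" class="frame-3" '
    'src="image/file.jpeg" alt="image" css:height="448px" css:width="339px"/>'
)
img = css_attrs_to_plain(ET.fromstring(expanded))
img.tag = "image"        # rename img to image, as in the desired output
del img.attrib["class"]  # the class is no longer needed once the values are copied
print(ET.tostring(img, encoding="unicode"))
```

Values in other units (em, %, ...) are silently dropped by this sketch; whether to convert or keep them depends on your target format.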

There is also a Relax NG schema for CSS as XML attributes: https://github.com/gimsieke/CSSa

There is currently no combined XHTML+CSSa schema, though. I should create one, because it’s just cool to be able to validate the CSS property values, as we experience daily when validating DocBook+CSSa with our Hub schema, https://github.com/gimsieke/Hub, deployed here: http://www.le-tex.de/resource/schema/hub/1.1/hub.rng

Since you are also using InDesign, you might want to try our IDML→Hub XML converter (https://subversion.le-tex.de/idmltools/trunk/idml2xml/). Just yesterday, I implemented nested styles (i.e., their resolution to spans with character styles).

Gerrit



On 30.03.2013 20:03, Dorothy Hoskins wrote:
Hi, I have an interesting problem in that I am trying to figure out
how to load and process a CSS file to grab content from CSS class
definitions and poke them into XML files.
In the source XML, which is scraped from XHTML pages, I find images
with CSS classes:
<img class="frame-3" src="image/file.jpeg" alt="image" />

In the CSS of the ePub, I find the dimension information that I want
for the image:
img.frame-3 {
     height:448px;
     width:339px;
}

My desired XML output is <image height="448" width="339"
src="image/file.jpeg" alt="image" />

I have the idea of grabbing the CSS and processing the CSS text to
achieve something like this:
<css>
<class element="img" name="frame-3">
<attribute name="height" value="448"/><!-- px assumed in XHTML -->
<attribute name="width" value="339"/>
</class>
</css>

I know I can handle everything else I want to do once I get the CSS
into an XML structure. The commonalities of the CSS text are that a
line which contains "{" has the information I want for the
class/@element and class/@name. The subsequent lines until the "}"
occurs have the content that I want to process into the attribute/name
and attribute/value. It seems like regex is the way to go but I don't
know how to start. Do I load the CSS file into a variable as
xs:string? Process it as unparsed-text? If anyone knows a good example
of creating such structure from a text input in the archives or
online, please point me in the right direction.
Thanks, Dorothy

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit(_dot_)imsieke(_at_)le-tex(_dot_)de, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler
