I am using Saxon 9.8.0.12 in Oxygen, with stylesheet version="2.0".
The problem domain is a dictionary that was created in Word and is marked up
with only <p>, <span>, <b>, and <i> elements, along with some color added to
some spans.
In plain text it looks like:
#-a (dem. adj. of proximity)
variant of -ad
#a-1(+a.f./i.a. verb)
1. so, in order that perhaps <D2>
2. (particle introducing a.f., indicating `near future' or `future
possibility') <Asp1.19> <D2>
Variant Forms:
ad-(+a.f./i.a. verb) (in 1st person singular and third person plural)
1. so, in order that perhaps
2. (particle introducing a.f., indicating `near future' or `future possibility')
riɣ ad-ftuɣ I want to go.
ira a-t-iẓr He wants to see it.
a-ka-(+a.f.) if only
a(d)-ur-(+a.f./i.a.)
so, lest, in order that perhaps not
(also introduces neg. imp.: "Do not...")
a-ur-imil-(+a.f./i.a.)
perhaps, in order that, in the hope that; lest, maybe it would happen that
ad-ukʷan- (+a.f./i.a.)
1. when, as soon as <Asp1.24> <Na3.10.6>
2. just, repeatedly <Na3.16.2>
ad-ur- (+a.f./i.a.)
so, lest, in order that perhaps not
(also introduces neg. imp.: "Do not...")
Variant Forms:
ad-
I want to turn this into a flat file suitable for import into a dictionary
processing program called FLEx.
Something like:
\lx -a
\gi (dem. adj. of proximity)
\vao -ad
\lx a-
\hm 1
\co (+a.f./i.a. verb)
\sn 1
\de so, in order that perhaps
\so <D2>
\sn 2
\gi (particle introducing a.f., indicating `near future' or `future
possibility')
\so <Asp1.19>
\so <D2>
\sh Variant Forms:
\va ad-
\co (+a.f./i.a. verb)
\gi (in 1st person singular and third person plural)
\sn 1
\de so, in order that perhaps
\sn 2
\gi (particle introducing a.f., indicating `near future' or `future
possibility')
\xv riɣ ad-ftuɣ
\xe I want to go.
\xv ira a-t-iẓr
\xe He wants to see it.
\va a-ka-
\co (+a.f.) if only
\va a(d)-ur-
\co (+a.f./i.a.)
\de so, lest, in order that perhaps not
\gid (also introduces neg. imp.: "Do not...")
\va a-ur-imil-
\co (+a.f./i.a.)
\de perhaps, in order that, in the hope that; lest, maybe it would happen that
\va ad-ukʷan-
\co (+a.f./i.a.)
\sn 1
\de when, as soon as
\so <Asp1.24>
\so <Na3.10.6>
\sn 2
\de just, repeatedly
\so <Na3.16.2>
I have processed the HTML output from Word into the following snippet:
\entry_number 00001
\lx -a
\vernacular FALSE
\grammatical_info dem. adj. of proximity)
\variant_of -ad
\entry_number 00002
\lx a-
\hm 1
\vernacular FALSE
\co (+|ga a.f.|r |ga i.a.|r verb)
\senseStart 1
\definition so, in order that perhaps
\source D2
\senseStart 2
\grammatical_info particle introducing |ga a.f.|r , indicating `near future' or
`future possibility')
\source Asp1.19
\source D2
\sectionHead Variant Forms:
\variant ad-
\co (+|ga a.f.|r |ga i.a.|r verb
\grammatical_info in 1|sup st|r person singular and third person plural)
\senseStart 1
\definition so, in order that perhaps
\senseStart 2
\grammatical_info particle introducing |ga a.f.|r , indicating `near future' or
`future possibility')
<<<<<< everything above is correct

What I get:
\example riɣI want to go.
\example iraHe wants to see it.

What I am looking for (the words after the first one, such as ad-ftuɣ, are
what I am missing):
\example riɣ ad-ftuɣ
\translation I want to go.
\example ira a-t-iẓr
\translation He wants to see it.
The exact slash codes are not important. Getting ALL the data across is.
So far the only class I have added is Arial: instead of <span
style="font-family:"Arial",sans-serif" lang="EN-GB"> the input now has <span
class="Arial">.
I am starting from this snippet of HTML:
<p> ...
<span class="Arial">verb) (in 1<sup>st</sup>person singular and
third person plural)
<br />1. so, in order that perhaps
<br />2. (<i>particle introducing a.f., indicating `near future' or
`future possibility'</i>)
<br />
</span>
<span class="MsoHyperlink">
<b>
<span lang="EN-GB">riɣ</span>
</b>
</span>
<b>
<span lang="EN-GB">ad-</span>
<span class="MsoHyperlink">
<span lang="EN-GB">ftuɣ</span>
</span>
</b>
<span class="Arial">I want to go.<br />
.....
</p>
My guess so far is to match the <br/> and then collect the <b> words that
follow it, but stop at the next <span class="Arial">, which turns into
\translation.
<xsl:template match="html:br">
  <xsl:element name="span">
    <xsl:attribute name="class">example</xsl:attribute>
    <!-- this gives too many <b> elements -->
    <xsl:value-of select="following::html:b"/>
  </xsl:element>
</xsl:template>
I hold the slash code in the class attribute until the last step. That way I
can continue working on the file in XML.
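To illustrate what I mean by the last step, here is a minimal, untested sketch
of a final pass that turns each classed span into one slash-code line. It
assumes the intermediate spans are in no namespace and are not nested inside
one another; neither assumption has been checked against the full file.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Each classed span becomes one line: backslash, class name, space,
       whitespace-normalized content, newline.
       Assumption: spans are in no namespace and are not nested. -->
  <xsl:template match="span[@class]">
    <xsl:value-of
        select="concat('\', @class, ' ', normalize-space(.), '&#10;')"/>
  </xsl:template>

  <!-- Suppress stray text outside the classed spans; text inside them is
       already emitted by the template above. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

With this, <span class="example">riɣ ad-ftuɣ</span> would come out as the line
\example riɣ ad-ftuɣ.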
How do I restrict <xsl:value-of select="following::html:b"/> to just the <b>
elements that come before the next Arial span, that is, before
<span class="Arial">I want to go.<br /> ?
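One possible way to express that restriction, as an untested sketch rather
than a confirmed fix: find the next <span class="Arial"> on the following
axis, then keep only the <b> elements that precede it in document order, using
the XPath 2.0 << operator. Several things here are assumptions: that the html
prefix is bound to the XHTML namespace, that an example and its translation
sit in the same <p>, that only the last <br/> in its parent should start an
example (otherwise every <br/> in the first span would emit the same words),
and that each <b> holds one word, so the words can be joined with a space.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="html">

  <!-- Assumption: whitespace-only text nodes (pretty-printing) carry no
       meaning, so the string value of each <b> is just its word. -->
  <xsl:strip-space elements="*"/>

  <!-- Assumption: only the last <br/> in its parent starts an example. -->
  <xsl:template match="html:br[last()]">
    <!-- The paragraph this <br/> belongs to. -->
    <xsl:variable name="para" select="ancestor::html:p[1]"/>
    <!-- The next Arial span; this is the one that becomes \translation. -->
    <xsl:variable name="stop"
        select="following::html:span[@class = 'Arial'][1]"/>
    <!-- Following <b> elements in the same paragraph that come before $stop
         in document order (or all of them if there is no $stop). -->
    <xsl:variable name="bold"
        select="following::html:b[ancestor::html:p[1] is $para]
                                 [empty($stop) or . &lt;&lt; $stop]"/>
    <xsl:if test="exists($bold)">
      <xsl:element name="span">
        <xsl:attribute name="class">example</xsl:attribute>
        <!-- Join the string values of the selected <b> elements. -->
        <xsl:value-of select="$bold" separator=" "/>
      </xsl:element>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

If a single word can be split across two <b> elements, the space separator
would be wrong and the joining rule would need to change.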
Thank you
Jim Albright
704-562-1529 unlimited cell
Wycliffe Bible Translators
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list