In doing a transcription of some psalms, someone marked up as separate
lines instances where the editor of the print version had wrapped (and
indented) a line.
What I want to do is pass the file through a stylesheet and merge any
lines with 3 or fewer words into the line before.
If the source file looks something like this:
----
<?xml version="1.0" encoding="UTF-8"?>
<div type="psalm" n="5">
<lg n="5:1">
<l n="1"><w>Myne</w> <w>wordes</w>, <w>lauerd</w>,
<w>with</w> <w>eres</w></l>
<l n="2"><w>byse;</w></l>
<l n="3"><w>Vnderstande</w> <w><c
type="thorn">þ</c>e</w> <w>crie</w> <w>ofe</w> <w>me</w>.</l>
</lg>
<lg n="5:2">
<l n="1"><w>Bihald</w> <w>vnto</w> <w>my</w> <w>bede</w>
<w>steuene</w>,</l>
<l n="2"><w>Mi</w> <w>kynge</w> <w>and</w> <w>my</w>
<w>god</w> <w>ofe</w> <w>heuene</w>.</l>
</lg>
<lg n="5:3">
<l n="1"><w>For</w> <w>to</w> <w><c
type="thorn">þ</c>e</w>, <w>lauerd</w>, <w>bidde</w> <w>sal</w>
.<w>I</w>.<w>;</w></l>
<l n="2"><w>Mi</w> <w>steuene</w> <w>sal</w> <w>tou</w>
<w>here</w> <w>erli</w>.</l>
</lg>
<lg n="5:4">
<l n="1"><w>Erli</w> <w>sal</w> .<w>I</w>. <w>to</w> <w><c
type="thorn">þ</c>e</w> <w>se</w> <w>and</w> <w>stande;</w></l>
<l n="2"><w>For</w> <w>noght</w> <w>god</w> <w>artou</w>
<w>wiknes</w> <w>willande</w>,</l>
</lg>
<lg n="5:5">
<l n="1"><w>Ne</w> <w>wone</w> <w>sal</w> <w>lither</w>
<w>biside</w> <w><c type="thorn">þ</c>e</w>
,</l>
<l n="2"><w>Ne</w> <w>vnrightwise</w> <w>bifor</w> <w><c
type="thorn">þ</c>in</w> <w>eyhen</w> <w>be</w>.</l>
</lg>
<lg n="5:6">
<l n="1"><w><c type="THORN">Þ</c>ou</w> <w>hated</w>
<w>al</w> <w><c type="thorn">þ</c>at</w> <w>wirkes</w>
<w>wiknesse;</w></l>
<l n="2"><w><c type="THORN">Þ</c>at</w> <w>lighe</w>
<w>spekes</w> <w>leses</w> <w>tou</w> <w>mare</w> <w>and</w></l>
<l n="3"><w>lesse</w>,</l>
</lg>
(etc.)
</div>
-----
Now, the way I'm doing it, which *seems* to work, is:
-----
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="/"><xsl:apply-templates/></xsl:template>
<xsl:template match="node()|@*" priority="-1">
<xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
</xsl:template>
<xsl:template match="lg">
<lg n="{@n}">
<xsl:for-each select="l">
<xsl:choose>
<xsl:when test="count(w) > 3">
<xsl:variable name="lineNum"><xsl:number
count="l[count(w) > 3]" from="lg"/></xsl:variable>
<l n="{$lineNum}">
<xsl:apply-templates />
<xsl:if test="following-sibling::l[1][count(w) &lt; 4]">
<xsl:apply-templates select="following-sibling::l[1]"/>
</xsl:if>
</l>
</xsl:when>
<xsl:otherwise/>
</xsl:choose>
</xsl:for-each>
</lg>
</xsl:template>
<xsl:template match="l[count(w) >3]">
<xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
</xsl:template>
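<!-- Short lines (3 or fewer words): emit only their content, so that it
     can be appended to the preceding line without a nested <l>. -->
<xsl:template match="l[count(w) &lt; 4]">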
<xsl:apply-templates />
</xsl:template>
</xsl:stylesheet>
-----
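For comparison, the same merge could perhaps be written with XSLT 2.0
grouping. This is only an untested sketch under assumptions: it treats
every l with more than 3 direct w children as the start of a new line,
renumbers the merged lines from the group position, and inserts a single
space between the joined pieces (that separator is my own choice, not
something in the source).
-----
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="node()|@*">
    <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
  </xsl:template>

  <xsl:template match="lg">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Each group starts at a line with more than 3 words and
           collects the short lines that follow it. -->
      <xsl:for-each-group select="l" group-starting-with="l[count(w) &gt; 3]">
        <!-- Here position() is the number of the current group. -->
        <l n="{position()}">
          <xsl:for-each select="current-group()">
            <!-- Assumed separator between merged pieces. -->
            <xsl:if test="position() &gt; 1"><xsl:text> </xsl:text></xsl:if>
            <xsl:apply-templates select="node()"/>
          </xsl:for-each>
        </l>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
-----
Unlike the for-each version, this would keep a run of several short
lines together with the long line before them, and a short line that
opens a stanza would come out as a line of its own rather than being
dropped.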
I'm just wondering whether this is having any unforeseen side-effects
that I'm not noticing?
In 150 psalms there are only about 20 instances of lg/l's containing
fewer than 4 words which are in fact real lines. The rest should be
merged. I figured it was easier to go and correct these 20 after
automatically fixing the hundreds (a few per psalm) which are wrong.
Is this the best way to do it?
-James
--
James Cummings, Cummings dot James at GMail dot com