On 08/12/2011 21:16, Karlmarx R wrote:
the <II .> and <2 .> have "space" between the dot and previous
letters. If I had texts WITHOUT SPACE, like <II.> o
Well, that's because, unlike <II .>, <II.> is a legal XML start tag, so it
parses that way, for the same reason that <b> in your example parsed as
a tag.
If you know your elements don't have "." in their names, then you could
take a local copy of htmlparse (or xsl:import it) and modify the regexp
that recognises element names so that it does not include ".".
You could change
<xsl:variable name="d:elem"
select="'(\i\c*)'"/>
to
<xsl:variable name="d:elem"
select="'([a-zA-Z][a-zA-Z0-9]*)'"/>
for example, if you only need ASCII letters and digits.
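
To see the difference between the two patterns in isolation, here is a
minimal XSLT 2.0 sketch. It is not part of htmlparse: the variable names
name-default and name-ascii, the "main" template and the ^...$ anchors are
assumptions made for illustration only (htmlparse uses d:elem inside
larger expressions). It just tests three candidate names against each
pattern with matches().

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical stand-ins for the two values of d:elem discussed above -->
  <xsl:variable name="name-default" select="'(\i\c*)'"/>
  <xsl:variable name="name-ascii" select="'([a-zA-Z][a-zA-Z0-9]*)'"/>

  <!-- Run with an initial template, e.g. Saxon's -it:main option -->
  <xsl:template name="main">
    <xsl:for-each select="('b', 'II.', '2.')">
      <!-- Anchor each pattern so the whole candidate name must match -->
      <xsl:value-of select="concat(., ': default=',
          matches(., concat('^', $name-default, '$')),
          ' ascii=',
          matches(., concat('^', $name-ascii, '$')),
          '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the default pattern "II." counts as a name, because "." is a name
character (\c); with the ASCII-only pattern it does not. "2." fails both,
since a name cannot start with a digit. Note that the ASCII-only pattern
also rejects names containing "-", "_" or ":", so it is only suitable if
your element names never use those characters.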
David