xsl-list

RE: [xsl] Saxon 9.4 <bold></bold> Transformed to (newline)</bold> Problem

2013-01-16 12:53:30
On Wed, 2013-01-16 at 17:53 +0000, Raymond Lillibridge wrote:
> Here is an update for this thread.
>
> Due to my need to process the output file further, using Perl (reading
> line by line), I need the paragraph content to not have any newlines
> or white-space introduced.  I also cannot use the @indent="no"
> attribute due to existing post process Perl applications.

Here's an (untested) example getline() that turns <bold>
</bold> into <bold/>.

The technique is to read a line at a time but to read multiple lines
when necessary.

Better yet is not to use line-at-a-time processing at all on a file
format that's not line oriented... however, I do this a lot as part of
converting line-oriented texts into XML.
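
As a rough sketch of that alternative (not part of the original suggestion), the whole input can be read into one string and the substitution applied once, so line boundaries stop mattering. The helper name collapse_empty_bold and the sample data are made up for illustration, and the sketch assumes the file fits in memory and that <bold> carries no attributes:

```perl
use strict;
use warnings;

# Hypothetical helper: collapse whitespace-only <bold> elements
# anywhere in the string, including ones split across lines.
sub collapse_empty_bold {
    my ($xml) = @_;
    $xml =~ s{<bold>\s*</bold>}{<bold/>}g;
    return $xml;
}

# Sample data standing in for the transformed output file.
my $sample = "<p>text <bold>\n</bold> more</p>\n";

# Open the string as an in-memory file so the slurp idiom can be shown;
# with a real file, open it by name (or read <> / STDIN) instead.
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $xml = do { local $/; <$fh> };   # undefined $/ reads everything at once
close $fh;

print collapse_empty_bold($xml);    # prints: <p>text <bold/> more</p>
```

For anything beyond this one pattern, a real XML parser is probably the safer tool than a regular expression.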

Liam

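sub getline
{
    my $line = <>;

    # Test definedness, not truth: a last line of "0" is false but valid.
    return unless defined $line;

    # An unclosed <bold> runs to the end of the line: keep appending
    # lines until the next tag shows up.
    while ($line =~ m{<bold>[^<>]*$}) {
        my $tmp = <>;
        if (!defined $tmp) { # EOF
           die "end of input inside <bold> element! oh dear!";
        }
        $line .= $tmp;
    }
    # Collapse whitespace-only <bold> elements.
    $line =~ s{<bold>\s*</bold>}{<bold/>}g;
    return $line;
}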
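
One possible way to drive such a routine is sketched below. To keep the example self-contained it uses a variant that takes an explicit filehandle; the name getline_from and the in-memory sample input are assumptions for illustration, not part of the routine above:

```perl
use strict;
use warnings;

# Variant of getline() reading from a given filehandle
# (getline_from is a hypothetical name).
sub getline_from {
    my ($fh) = @_;
    my $line = <$fh>;
    return unless defined $line;

    # Join following lines while a <bold> is still open at end of line.
    while ($line =~ m{<bold>[^<>]*$}) {
        my $tmp = <$fh>;
        die "end of input inside <bold> element\n" unless defined $tmp;
        $line .= $tmp;
    }
    $line =~ s{<bold>\s*</bold>}{<bold/>}g;
    return $line;
}

# Sample input held in memory; a real script would open the output file.
my $input = "<p>one <bold>\n</bold> two</p>\n<p>three</p>\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";

my @records;
while (defined(my $line = getline_from($fh))) {
    push @records, $line;   # each record is one logical line
}
close $fh;

print @records;
```

With this input the three physical lines come back as two logical lines, the first containing <bold/>.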



-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org freenode/#xml

