xsl-list

Re: [xsl] Source code formatting

2020-08-19 08:58:00
I have tended to resort to things like

<xsl:variable name="NL" select="codepoints-to-string(10)"/>

to prevent this kind of thing happening.

Michael Kay
Saxonica

On 19 Aug 2020, at 14:02, Willem Van Lishout 
willemvanlishout(_at_)gmail(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi all,

I'm having a little trouble with converting whitespace character references.
This line: 

<xsl:variable name="nl" select="'&#10;'"/> 

Gets converted to:

<xsl:variable name="nl" select="
"/>

This causes the variable to be output as a space when I do <xsl:value-of 
select="$nl"/> (I suppose because the literal line break gets normalized to a 
space when the stylesheet is parsed again).
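
The round trip suspected above can be sketched as follows. This is a minimal illustration in Python (not part of the toolchain in this thread), assuming the standard-library xml.etree.ElementTree module. The un-prefixed element name and the hand-written "naive" output are stand-ins for the stylesheet and for a serializer that writes the line feed literally.

```python
import xml.etree.ElementTree as ET

# Stand-in for the stylesheet line; the character reference survives parsing.
original = '''<variable name="nl" select="'&#10;'"/>'''
first_pass = ET.fromstring(original).get("select")   # apostrophe, line feed, apostrophe

# Hypothetical naive serialization: the line feed is written literally.
naive = '<variable name="nl" select="' + first_pass + '"/>'
second_pass = ET.fromstring(naive).get("select")     # the line feed has become a space

# ElementTree's own serializer writes the line feed as &#10; again,
# so the value is unchanged after another parse.
escaped = ET.tostring(ET.fromstring(original), encoding="unicode")
third_pass = ET.fromstring(escaped).get("select")

print(repr(first_pass), repr(second_pass), repr(third_pass))
```

A serializer that escapes the line feed as a character reference, as ElementTree does here, keeps the value stable across parse and serialize cycles. One that writes it literally loses it on the next parse.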

I wonder what the best way is to solve this problem? It's not important what 
it looks like, but the stylesheet behavior should obviously not change. The 
formatting is done by an XSL stylesheet (which removes whitespace-only 
nodes), which uses the aforementioned patched Xerces parser and then uses a 
custom Saxon serializer to control output indentation and new line settings. 
Is this something I could configure in the serializer?

Thanks,

Willem Van Lishout  
willemvanlishout(_at_)gmail(_dot_)com 



On Thu, Jul 30, 2020 at 10:07 AM Willem Van Lishout 
willemvanlishout(_at_)gmail(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Thanks everyone.
What I did is monkey-patch Xerces to skip the normalization for attributes. I 
still end up with &#10; instead of actual line breaks, but it seems I 
can fix that in XSLT by using a character map.
In my research I found out that the .NET XmlTextReader class allows disabling 
normalization: 
https://docs.microsoft.com/en-us/dotnet/api/system.xml.xmltextreader.normalization?view=netcore-3.1
Perhaps it's useful to somebody...

Willem Van Lishout  
willemvanlishout(_at_)gmail(_dot_)com 



On Wed, Jul 29, 2020 at 1:20 PM Michael Kay mike(_at_)saxonica(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Agreed, the attribute normalization spec is an absolute pain, and being able 
to switch it off would have many benefits and no adverse consequences I can 
foresee.

Michael Kay
Saxonica

On 29 Jul 2020, at 09:27, Pieter Lamers 
pieter(_dot_)lamers(_at_)benjamins(_dot_)nl 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

I have the same problem when storing XSLT documents in eXist-db. Attribute 
normalization also kills whitespace there because the spec says whitespace 
characters in attribute values must be normalized to spaces (and eXist-db 
normalizes as the spec requires). Isn't it about time to change the spec in 
this respect? 

Pieter

On 28/07/2020 23:18, Willem Van Lishout 
willemvanlishout(_at_)gmail(_dot_)com wrote:
Hi list,

Like many of you, I assume, I use a version control system when working on 
XSLT projects. I'm working together with multiple people, and we run the 
code through an XML formatter before checking it in to avoid formatting 
differences showing up in the diffs.

The problem is that, due to attribute value normalization, line breaks in 
attribute values are replaced by spaces during XML parsing. When long XPath 
expressions (which have become very common in XSLT 3, especially with 
higher-order functions) are split across multiple lines, this results in 
huge single-line outputs which are impossible to read.
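
The effect described above can be reproduced outside any XSLT toolchain. The following is a minimal sketch in Python (not something used in this thread), assuming the standard-library xml.etree.ElementTree parser. The element name and the XPath expression are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stylesheet fragment: an XPath expression split over three lines.
source = '''<variable name="total" select="sum(
    for $i in //item
    return $i/@price)"/>'''

element = ET.fromstring(source)
select = element.get("select")

# Attribute value normalization replaces each literal line break with a
# space (the indentation spaces are kept), so the expression comes back
# as a single line.
print(repr(select))
```

Once the document has been parsed, the original line structure is gone, so no serializer setting applied afterwards can bring it back.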

It seems any sort of XML processing will irreversibly transform the 
whitespace, so I have to choose between:
- No formatting
- Formatting using non-XML tools?
- Finding a parser that bends the rules...

Have any of you experienced the same problem and did you find a solution?

Thanks.

Willem Van Lishout  
willemvanlishout(_at_)gmail(_dot_)com 

XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/2854576> (by 
email <>)
-- 
Pieter Lamers
John Benjamins Publishing Company
Postal Address: P.O. Box 36224, 1020 ME AMSTERDAM, The Netherlands
Visiting Address: Klaprozenweg 75G, 1033 NN AMSTERDAM, The Netherlands
Warehouse: Kelvinstraat 11-13, 1446 TK PURMEREND, The Netherlands
tel: +31 20 630 4747
web: www.benjamins.com <http://www.benjamins.com/> 