Re: [xsl] Unit tests in XSLT (was: Re: Controlling Debugging Messages)

Hi Eliot,

Good to hear about your approach. It is, indeed, a good idea to unit
test libraries. I am considering doing that: it is straightforward, and
the time investment seems small and worthwhile. Assertions, as Mike
mentioned, will also come in handy.

Thanks for your answer!

Pieter


On 10/15/2018 10:03 PM, Eliot Kimber ekimber(_at_)contrext(_dot_)com wrote:

I have only created unit tests for function packages where the functions A) 
could be tested easily and B) were used widely enough to warrant testing 
outside the context of the specific transform that used them.
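
As a rough illustration of such a function-level test, the sketch below checks
the eg:value-except example function quoted later in this thread, using
xsl:assert. The namespace URI, the test values, and the use of the default
initial template as the test entry point are assumptions made for the example,
not anyone's actual test setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:eg="http://example.com/ns/eg"
    exclude-result-prefixes="xs eg">

  <!-- Function under test (the F&O Appendix D example; namespace URI is made up). -->
  <xsl:function name="eg:value-except" as="xs:anyAtomicType*">
    <xsl:param name="arg1" as="xs:anyAtomicType*"/>
    <xsl:param name="arg2" as="xs:anyAtomicType*"/>
    <xsl:sequence select="distinct-values($arg1[not(. = $arg2)])"/>
  </xsl:function>

  <!-- Hypothetical test entry point: invoke the stylesheet via its initial template. -->
  <xsl:template name="xsl:initial-template">
    <xsl:variable name="actual" as="xs:anyAtomicType*"
        select="eg:value-except((1, 2, 2, 3, 3), (2))"/>
    <!-- Result order is arbitrary, so compare as a set rather than with deep-equal(). -->
    <xsl:assert test="count($actual) eq 2 and (every $v in (1, 3) satisfies $v = $actual)">
      eg:value-except: expected the values 1 and 3, in any order
    </xsl:assert>
    <!-- Only means the template ran; the assertion is skipped if assertions are disabled. -->
    <xsl:message>+ [TEST] eg:value-except checks completed</xsl:message>
  </xsl:template>
</xsl:stylesheet>
```

Assertions are only evaluated when the processor has them enabled (-ea on the
Saxon command line, as mentioned further down the thread), so a run without
that option will not actually check anything.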

In the context of documentation-type transforms, as you say, it is usually 
prohibitive to maintain the data that drives the tests because it is either 
highly sensitive to details of the transform (e.g., how attributes might be 
serialized or how formatting ends up being expressed or whatever) or the 
input varies so much or both. For example, trying to implement a complex 
publishing transform for documents where the markup details are being worked 
out and are driven, in part, by what we learn from implementing the
publishing processes.

The DITA Open Toolkit project does maintain a set of tests that compare 
expected result docs to actual results and that works pretty well but that's 
also a very controlled environment where the input, in particular, is highly 
controlled and the output result is stable (in particular, generating HTML 
from DITA source, where there's much less variation in the generated result 
than there might be with other transformation targets). But those tests are
also time consuming to maintain and they definitely do not represent a full 
test suite over the functionality of even just the HTML5 transform.

In my experience doing primarily publishing processing, it's more efficient 
to use inspection and special-purpose test documents that can be easily 
verified by inspection and then let normal documentation review processes 
highlight problems.

In rare cases where the data is both complex and errors may not be caught by 
normal inspection I will implement a post-transform evaluation process, for 
example, ensuring all elements of a specific type in the input are reflected 
in the output and not duplicated (assuming duplication would be an error).
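
A minimal sketch of such a post-transform check, under stated assumptions: the
input marks the elements of interest as fig elements with id attributes, and
the transform copies each id into a data-source-id attribute in the result.
Both conventions, and the result-uri parameter name, are hypothetical and only
serve the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Run against the transform's INPUT document, passing the location of the
     generated OUTPUT document in the (hypothetical) result-uri parameter. -->
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:param name="result-uri" as="xs:string" required="yes"/>

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="result" as="document-node()" select="doc($result-uri)"/>
    <!-- Assumed convention: each input fig/@id is echoed as @data-source-id. -->
    <xsl:variable name="result-ids" as="xs:string*"
        select="$result//@data-source-id ! string(.)"/>
    <xsl:for-each select="//fig/@id">
      <!-- Number of times this input id shows up in the result. -->
      <xsl:variable name="n" as="xs:integer"
          select="count($result-ids[. = current()])"/>
      <!-- Exactly one occurrence is expected: 0 means dropped, 2 or more means duplicated. -->
      <xsl:if test="$n ne 1">
        <xsl:message>- [WARN] fig '<xsl:value-of select="."/>' occurs <xsl:value-of select="$n"/> time(s) in the result</xsl:message>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Passing an absolute URI for result-uri avoids surprises, since a relative one
is resolved against the stylesheet's base URI rather than the input document's.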

Most projects I'm involved with tend to be under-resourced and time 
constrained, so it's hard to build in the cost of more formal testing to the 
project bids, even if the project might benefit in the longer term. That is, 
I seldom have the luxury of applying the level of software engineering 
attention that I would like to bring to my XSLT development.

Cheers,

Eliot

--
Eliot Kimber
http://contrext.com
 

On 10/15/18, 2:41 PM, "Pieter Masereeuw pieter(_at_)masereeuw(_dot_)nl" 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

    Talking about debugging, I recently attended a webinar by Oxygen about
    unit tests for XSLT (and Schematron). After watching it, I felt guilty
    that my XSLT development practice is still not test-driven. Despite
    what was said in the webinar, it seems too much trouble, and the thing
    that most often requires modification of stylesheets is changes or
    hitherto unforeseen constructs in the format of the input XML.
    
    It would be interesting to know if there are readers on this list who
    actually make use of unit tests during XSLT development and what their
    experiences are.
    
    Pieter Masereeuw
    
    
    On 10/15/2018 07:28 PM, Eliot Kimber ekimber(_at_)contrext(_dot_)com 
wrote:
    > Yes, in general you want debug messages to be turned off, which is why 
the doDebug parameter default is "false()" in my code.
    >
    > I didn't think of use-when="$DEBUG or true()"--that would work for a
    > lot of my cases, but it doesn't handle the case where I want to turn on
    > debugging for all the templates that will get called in the course of
    > handling some specific input.
    >
    > So that suggests that my dynamic approach is what I need generally.
    >
    > Once the code is in place and working, it would be easier to set up
    > a set of debugging control variables that reflect different cases or code
    > paths I know I might need to debug in the future. During development that
    > doesn't really work because the code is in flux and you don't
    > necessarily know what will be of interest and what won't.
    >
    > It might work to have per-module static debug controls. I've moved 
generally to using more smaller modules, usually one per distinct mode or set 
of related modes, and that would make it more natural to have global 
debugging for those modes. I'll have to think about that more. 
    >
    > In practice my debugging pattern isn't a burden--it's something I do 
whenever I set up a new template or add an apply-templates or next-match or 
call-template but it's always felt like there should be a simpler way. 
    >
    > Cheers,
    >
    > Eliot
    > --
    > Eliot Kimber
    > http://contrext.com
    >  
    >
    > On 10/15/18, 12:11 PM, "Michael Kay mike(_at_)saxonica(_dot_)com" 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
    >
    >     With static variables you can of course have multiple switches but 
they will be statically scoped rather than dynamically scoped. You could use 
multiple variables or you could use flags within a single variable 
(use-when="contains($DEBUG_FLAGS, 'g')").
    >     
    >     I have to confess I'm not usually that organized. I tend to have a 
single variable $DEBUG which is false, and then switch on individual debug 
lines using use-when="$DEBUG or true()". I tend to find that debug statements 
are rarely useful once you've solved the bug that they were invented for; 
except in rare cases where you persistently have problems with some 
particular intermediate result passed across a key interface in your 
application - in which case there may be better approaches than xsl:message 
to monitoring what's passed across that boundary.
    >     
    >     But I wouldn't recommend anyone to be as disorganised as me.
    >     
    >     Michael Kay
    >     Saxonica
    >     
    >     > On 15 Oct 2018, at 17:05, Eliot Kimber 
ekimber(_at_)contrext(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
    >     > 
    >     > I was just about to post about this.
    >     > 
    >     > In my XSLT 2 code I have historically used this pattern:
    >     > 
    >     > <xsl:template match="foo">
    >     >  <xsl:param name="doDebug" as="xs:boolean" tunnel="yes"
    >     >    select="false()"/>
    >     > 
    >     >  <xsl:if test="$doDebug">
    >     >   <xsl:message>+ [DEBUG] Handling <xsl:value-of
    >     >     select="concat(name(..), '/', name(.))"/>...</xsl:message>
    >     >  </xsl:if>
    >     > 
    >     >  <xsl:apply-templates>
    >     >    <xsl:with-param name="doDebug" as="xs:boolean" tunnel="yes"
    >     >      select="$doDebug"/>
    >     >  </xsl:apply-templates>
    >     > </xsl:template>
    >     > 
    >     > This allows me to selectively turn debugging on and off in
    >     > specific parts of the code but does require this somewhat
    >     > heavyweight code.
    >     > 
    >     > With @use-when, can I get the same level of local control?
    >     > 
    >     > That is, with the above, I can add:
    >     > 
    >     > <xsl:variable name="doDebug" as="xs:boolean" select="true()"/>
    >     > 
    >     > In any block to turn debugging on just there. 
    >     > 
    >     > If I understand the implications of static variables allowed in 
@use-when, the debugging switch is globally all-or-nothing, or at least 
global within a given package.
    >     > 
    >     > Is that correct?
    >     > 
    >     > If that is correct, is there a better way to do the selective, 
dynamically-controlled debug messaging shown above?
    >     > 
    >     > Cheers,
    >     > 
    >     > E.
    >     > 
    >     > --
    >     > Eliot Kimber
    >     > http://contrext.com
    >     > 
    >     > 
    >     > On 10/15/18, 9:08 AM, "Michael Kay mike(_at_)saxonica(_dot_)com" 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
    >     > 
    >     >    These days you can do
    >     > 
    >     >    <xsl:message use-when="$DEBUG" ....>
    >     > 
    >     >    with $DEBUG defined as a static parameter. 
    >     > 
    >     >    <xsl:param name="DEBUG" as="xs:boolean" static="true" 
select="false()"/>
    >     > 
    >     >    No need for the run-time check with xsl:if.
    >     > 
    >     >    You can also use xsl:assert to define assertions. In Saxon, 
assertion checking can be enabled from the command line using -ea.
    >     > 
    >     >    Michael Kay
    >     >    Saxonica 
    >     > 
    >     >> On 15 Oct 2018, at 14:54, Dave Pawson 
dave(_dot_)pawson(_at_)gmail(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
    >     >> 
    >     >> Might even surround it with
    >     >> <xsl:if test="$debug">
    >     >> 
    >     >> To ease insertion / removal when testing?
    >     >> 
    >     >> HTH
    >     >> On Mon, 15 Oct 2018 at 14:31, Wendell Piez 
wapiez(_at_)wendellpiez(_dot_)com
    >     >> <xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
    >     >>> 
    >     >>> Eliot writes:
    >     >>> 
    >     >>>> I also depend heavily on using messages to test my assumptions.
    >     >>> 
    >     >>>> For example, I might do something like:
    >     >>> 
    >     >>>> <xsl:message>+ [DEBUG] jpeg_few={$jpeg_few => string-join(', ')}</xsl:message>
    >     >>>> <xsl:message>+ [DEBUG] jpeg_many={$jpeg_many => string-join(', ')}</xsl:message>
    >     >>> 
    >     >>> This is a key technique when developing XSLT. The language is 
designed
    >     >>> to "fail gracefully" most of the time -- which puts the burden 
on the
    >     >>> programmer to ensure things don't fail catastrophically. :-)
    >     >>> 
    >     >>> Cheers, Wendell
    >     >>> 
    >     >>> On Sun, Oct 14, 2018 at 7:10 PM Eliot Kimber 
ekimber(_at_)contrext(_dot_)com
    >     >>> <xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> 
wrote:
    >     >>>> 
    >     >>>> Looking at the XPath 3 Functions and Operators specification 
and searching on "intersect" (hoping to also find "disjoint") I find this 
discussion:
    >     >>>> 
    >     >>>> D.4.2.3 eg:value-except
    >     >>>> eg:value-except(        $arg1    as xs:anyAtomicType*,
    >     >>>> $arg2    as xs:anyAtomicType*) as xs:anyAtomicType*
    >     >>>> This function returns a sequence containing all the distinct 
items that appear in $arg1 but not in $arg2, in an arbitrary order.
    >     >>>> 
    >     >>>> XSLT implementation
    >     >>>> 
    >     >>>> <xsl:function name="eg:value-except" as="xs:anyAtomicType*">
    >     >>>> <xsl:param name="arg1" as="xs:anyAtomicType*"/>
    >     >>>> <xsl:param name="arg2" as="xs:anyAtomicType*"/>
    >     >>>> <xsl:sequence
    >     >>>>    select="fn:distinct-values($arg1[not(.=$arg2)])"/>
    >     >>>> </xsl:function>
    >     >>>> 
    >     >>>> This is in https://www.w3.org/TR/xpath-functions-31/#other-functions (Appendix D).
    >     >>>> 
    >     >>>> So basically
    >     >>>> 
    >     >>>> distinct-values($jpeg_few[not(. = $jpeg_many)])
    >     >>>> 
    >     >>>> Should give you the answer you seek.
    >     >>>> 
    >     >>>> I agree with Mike that being obsessive about putting data 
types on all variables and function return values (and templates when the 
templates should return atomic types or specific element types) will help a 
lot.
    >     >>>> 
    >     >>>> If your code is working without types but failing with them it 
means your code is "working" but probably not for the reasons you think.
    >     >>>> 
    >     >>>> Working carefully through the stages of the expressions by
    >     >>>> setting each intermediate result into a variable will help a lot.
    >     >>>> 
    >     >>>> I also depend heavily on using messages to test my assumptions.
    >     >>>> 
    >     >>>> For example, I might do something like:
    >     >>>> 
    >     >>>> <xsl:message>+ [DEBUG] jpeg_few={$jpeg_few => string-join(', ')}</xsl:message>
    >     >>>> <xsl:message>+ [DEBUG] jpeg_many={$jpeg_many => string-join(', ')}</xsl:message>
    >     >>>> 
    >     >>>> Or if those lists are very long, use count() or get the first 
n items or whatever to make it clear that you're working with the values you 
think you are.
    >     >>>> 
    >     >>>> Also, remember that <xsl:value-of> ({} in string result 
contexts) is different from <xsl:sequence>, which returns the actual value, 
not a string representation.
    >     >>>> 
    >     >>>> For example, given a variable that is an attribute node,
    >     >>>> value-of will return the string value of the attribute but
    >     >>>> xsl:sequence will return the attribute node and Saxon will
    >     >>>> serialize it as <attribute name="foo" value="bar"> (or something
    >     >>>> similar to that).
    >     >>>> 
    >     >>>> It's easy to accidentally create a sequence of attributes when
    >     >>>> what you wanted was a sequence of strings (or vice versa) and using
    >     >>>> xsl:value-of can obscure that mistake.
    >     >>>> 
    >     >>>> I've also started using the explicit casting of values that XQuery
    >     >>>> requires, even though XSLT usually lets you get away with implicit
    >     >>>> casting, because it makes it clearer to me what my intent was (and
    >     >>>> makes it easier to copy XPath expressions into XQuery, if that's
    >     >>>> something you need to do).
    >     >>>> 
    >     >>>> Cheers,
    >     >>>> 
    >     >>>> Eliot
    >     >>>> --
    >     >>>> Eliot Kimber
    >     >>>> http://contrext.com
    >     >>>> 
    >     >>>> 
    >     >>>> On 10/14/18, 3:53 PM, "Dave Lang 
emaildavelang(_at_)gmail(_dot_)com" 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
    >     >>>> 
    >     >>>>> That error can only come from an expression that calls 
tokenize(). It's therefore clearly not your declaration of 
jpgs_in_xml_not_directories that's at fault.
    >     >>>> 
    >     >>>>   Fair enough - but when I run the transformation without that 
declaration
    >     >>>>   everything works fine. Is there something I can do to the 
variables that
    >     >>>>   are included in it to make the declaration work?
    >     >>>> 
    >     >>>> 
    >     >>>> 
    >     >>>> 
    >     >>> 
    >     >>> 
    >     >>> 
    >     >>> --
    >     >>> Wendell Piez | http://www.wendellpiez.com
    >     >>> XML | XSLT | electronic publishing
    >     >>> Eat Your Vegetables
    >     >>> _____oo_________o_o___ooooo____ooooooo_^
    >     >>> 
    >     >> 
    >     >> 
    >     >> 
    >     >> -- 
    >     >> Dave Pawson
    >     >> XSLT XSL-FO FAQ.
    >     >> Docbook FAQ.
    >     >> 
    >     > 
    >     > 
    >     > 
    >     > 
    >     
    >     
    >     
    > 
    >

--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--
