Re: [xsl] Unit tests in XSLT (was: Re: Controlling Debugging Messages)
2018-10-15 15:52:22
Hi Eliot,
Good to hear about your approach. It is, indeed, a good idea to unit
test libraries. I am considering doing that: it is straightforward, and the
time investment seems small and worthwhile. Assertions, as Mike mentioned,
will also come in handy.
Thanks for your answer!
Pieter
On 10/15/2018 10:03 PM, Eliot Kimber ekimber(_at_)contrext(_dot_)com wrote:
I have only created unit tests for function packages where the functions A)
could be tested easily and B) were used widely enough to warrant testing
outside the context of the specific transform that used them.
In the context of documentation-type transforms, as you say, it is usually
prohibitive to maintain the data that drives the tests because it is either
highly sensitive to details of the transform (e.g., how attributes might be
serialized or how formatting ends up being expressed or whatever) or the
input varies so much or both. For example, consider implementing a complex
publishing transform for documents where the markup details are still being
worked out and are driven, in part, by what we learn from implementing the
publishing processes.
The DITA Open Toolkit project does maintain a set of tests that compare
expected result docs to actual results and that works pretty well but that's
also a very controlled environment where the input, in particular, is highly
controlled and the output result is stable (in particular, generating HTML
from DITA source, where there's much less variation in the generated result
than there might be with other transformation targets). But those tests are
also time-consuming to maintain and they definitely do not represent a full
test suite over the functionality of even just the HTML5 transform.
In my experience doing primarily publishing processing, it's more efficient
to use inspection and special-purpose test documents that can be easily
verified by inspection and then let normal documentation review processes
highlight problems.
In rare cases where the data is both complex and errors may not be caught by
normal inspection I will implement a post-transform evaluation process, for
example, ensuring all elements of a specific type in the input are reflected
in the output and not duplicated (assuming duplication would be an error).
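
One possible shape for such a post-transform check, as a minimal sketch only. It assumes hypothetical fig elements whose id attributes the transform carries through unchanged, and a hypothetical resultUri parameter pointing at the transform output; none of these names come from the thread. It is run against the original input document:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    expand-text="yes">

  <!-- Hypothetical parameter: URI of the transform result to check. -->
  <xsl:param name="resultUri" as="xs:string" required="yes"/>

  <xsl:template match="/">
    <xsl:variable name="result" as="document-node()" select="doc($resultUri)"/>
    <!-- Assumption: each input fig should appear exactly once in the result. -->
    <xsl:for-each select="//fig[@id]">
      <xsl:variable name="id" as="xs:string" select="string(@id)"/>
      <xsl:variable name="hits" as="xs:integer"
                    select="count($result//*[@id = $id])"/>
      <xsl:if test="$hits ne 1">
        <xsl:message>- [WARN] fig id="{$id}" appears {$hits} times in the result (expected 1)</xsl:message>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

A count of zero flags a dropped element and a count above one flags a duplicate, which matches the two failure cases described above.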
Most projects I'm involved with tend to be under-resourced and time
constrained, so it's hard to build in the cost of more formal testing to the
project bids, even if the project might benefit in the longer term. That is,
I seldom have the luxury of applying the level of software engineering
attention that I would like to bring to my XSLT development.
Cheers,
Eliot
--
Eliot Kimber
http://contrext.com
On 10/15/18, 2:41 PM, "Pieter Masereeuw pieter(_at_)masereeuw(_dot_)nl"
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Talking about debugging, I recently attended a webinar by Oxygen about
unit tests for XSLT (and Schematron). After watching it, I felt guilty
that my XSLT development practice is still not test-driven. Despite
what was said in the webinar, it seems too much trouble, and the thing
that most often requires modification of stylesheets is changes or
hitherto unforeseen constructs in the format of the input XML.
It would be interesting to know if there are readers on this list who
actually make use of unit tests during XSLT development and what their
experiences are.
Pieter Masereeuw
On 10/15/2018 07:28 PM, Eliot Kimber ekimber(_at_)contrext(_dot_)com
wrote:
> Yes, in general you want debug messages to be turned off, which is why
the doDebug parameter default is "false()" in my code.
>
> I didn't think of use-when="$DEBUG or true()"--that would work for a
lot of my cases but it doesn't handle the case where I want to turn on
debugging for all the templates that will get called in the course of handling
some specific input.
>
> So that suggests that my dynamic approach is what I need generally.
>
> Once the code is in place and working, it would be easier to set up
a set of debugging control variables that reflect different cases or code
paths I know I might need to debug in the future. During development that
doesn't really work because the code is in flux and you don't
necessarily know what will be of interest and what won't.
>
> It might work to have per-module static debug controls. I've moved
generally to using more smaller modules, usually one per distinct mode or set
of related modes, and that would make it more natural to have global
debugging for those modes. I'll have to think about that more.
>
> In practice my debugging pattern isn't a burden--it's something I do
whenever I set up a new template or add an apply-templates or next-match or
call-template but it's always felt like there should be a simpler way.
>
> Cheers,
>
> Eliot
> --
> Eliot Kimber
> http://contrext.com
>
>
> On 10/15/18, 12:11 PM, "Michael Kay mike(_at_)saxonica(_dot_)com"
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
>
> With static variables you can of course have multiple switches but
they will be statically scoped rather than dynamically scoped. You could use
multiple variables or you could use flags within a single variable
(use-when="contains($DEBUG_FLAGS, 'g')").
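
A minimal sketch of the single-variable flag approach described above. The DEBUG_FLAGS name and the contains() test come from the message; the element names, the attribute names, and the 'k' flag are hypothetical and only there for illustration:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    expand-text="yes">

  <!-- Static parameter: each letter switches on one group of messages.
       The default (empty string) compiles every debug message out. -->
  <xsl:param name="DEBUG_FLAGS" as="xs:string" static="yes" select="''"/>

  <xsl:template match="graphic">
    <!-- Present in the compiled stylesheet only if DEBUG_FLAGS contains 'g'. -->
    <xsl:message use-when="contains($DEBUG_FLAGS, 'g')">+ [DEBUG] graphic: {@href}</xsl:message>
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="key">
    <!-- Present in the compiled stylesheet only if DEBUG_FLAGS contains 'k'. -->
    <xsl:message use-when="contains($DEBUG_FLAGS, 'k')">+ [DEBUG] key: {@name}</xsl:message>
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Because use-when is evaluated statically, the flags are fixed when the stylesheet is compiled; they cannot be changed for one branch of the input at run time.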
>
> I have to confess I'm not usually that organized. I tend to have a
single variable $DEBUG which is false, and then switch on individual debug
lines using use-when="$DEBUG or true()". I tend to find that debug statements
are rarely useful once you've solved the bug that they were invented for;
except in rare cases where you persistently have problems with some
particular intermediate result passed across a key interface in your
application - in which case there may be better approaches than xsl:message
to monitoring what's passed across that boundary.
>
> But I wouldn't recommend anyone to be as disorganised as me.
>
> Michael Kay
> Saxonica
>
> > On 15 Oct 2018, at 17:05, Eliot Kimber
ekimber(_at_)contrext(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
> >
> > I was just about to post about this.
> >
> > In my XSLT 2 code I have historically used this pattern:
> >
> > <xsl:template match="foo">
> >   <xsl:param name="doDebug" as="xs:boolean" tunnel="yes" select="false()"/>
> >
> >   <xsl:if test="$doDebug">
> >     <xsl:message>+ [DEBUG] Handling <xsl:value-of select="concat(name(..), '/', name(.))"/>...</xsl:message>
> >   </xsl:if>
> >
> >   <xsl:apply-templates>
> >     <xsl:with-param name="doDebug" as="xs:boolean" tunnel="yes" select="$doDebug"/>
> >   </xsl:apply-templates>
> > </xsl:template>
> >
> > This allows me to selectively turn debugging on and off in
specific parts of the code but does require this somewhat heavyweight code.
> >
> > With @use-when, can I get the same level of local control?
> >
> > That is, with the above, I can add:
> >
> > <xsl:variable name="doDebug" as="xs:boolean" select="true()"/>
> >
> > In any block to turn debugging on just there.
> >
> > If I understand the implications of static variables allowed in
@use-when, the debugging switch is globally all-or-nothing, or at least
global within a given package.
> >
> > Is that correct?
> >
> > If that is correct, is there a better way to do the selective,
dynamically-controlled debug messaging shown above?
> >
> > Cheers,
> >
> > E.
> >
> > --
> > Eliot Kimber
> > http://contrext.com
> >
> >
> > On 10/15/18, 9:08 AM, "Michael Kay mike(_at_)saxonica(_dot_)com"
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
> >
> > These days you can do
> >
> > <xsl:message use-when="$DEBUG" ....>
> >
> > with $DEBUG defined as a static parameter.
> >
> > <xsl:param name="DEBUG" as="xs:boolean" static="true"
select="false()"/>
> >
> > No need for the run-time check with xsl:if.
> >
> > You can also use xsl:assert to define assertions. In Saxon,
assertion checking can be enabled from the command line using -ea.
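
A minimal sketch of an xsl:assert, assuming a hypothetical rule that every topic element must carry an id attribute (the element name and the rule are invented for illustration):

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes">

  <xsl:template match="topic">
    <!-- Checked only when the processor has assertions enabled;
         otherwise the instruction is ignored. -->
    <xsl:assert test="exists(@id)">topic without @id at {path(.)}</xsl:assert>
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

With assertions enabled, a topic that lacks an id raises a dynamic error carrying the message text; with assertions disabled the template simply copies the element.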
> >
> > Michael Kay
> > Saxonica
> >
> >> On 15 Oct 2018, at 14:54, Dave Pawson
dave(_dot_)pawson(_at_)gmail(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
> >>
> >> Might even surround it with
> >> <xsl:if test="$debug">
> >>
> >> To ease insertion / removal when testing?
> >>
> >> HTH
> >> On Mon, 15 Oct 2018 at 14:31, Wendell Piez
wapiez(_at_)wendellpiez(_dot_)com
> >> <xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
> >>>
> >>> Eliot writes:
> >>>
> >>>> I also depend heavily on using messages to test my assumptions.
> >>>
> >>>> For example, I might do something like:
> >>>
> >>>> <xsl:message>+ [DEBUG] jpeg_few={$jpeg_few => string-join(', ')}</xsl:message>
> >>>> <xsl:message>+ [DEBUG] jpeg_many={$jpeg_many => string-join(', ')}</xsl:message>
> >>>
> >>> This is a key technique when developing XSLT. The language is
designed
> >>> to "fail gracefully" most of the time -- which puts the burden
on the
> >>> programmer to ensure things don't fail catastrophically. :-)
> >>>
> >>> Cheers, Wendell
> >>>
> >>> On Sun, Oct 14, 2018 at 7:10 PM Eliot Kimber
ekimber(_at_)contrext(_dot_)com
> >>> <xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com>
wrote:
> >>>>
> >>>> Looking at the XPath 3 Functions and Operators specification
and searching on "intersect" (hoping to also find "disjoint") I find this
discussion:
> >>>>
> >>>> D.4.2.3 eg:value-except
> >>>> eg:value-except( $arg1 as xs:anyAtomicType*,
> >>>> $arg2 as xs:anyAtomicType*) as xs:anyAtomicType*
> >>>> This function returns a sequence containing all the distinct
items that appear in $arg1 but not in $arg2, in an arbitrary order.
> >>>>
> >>>> XSLT implementation
> >>>>
> >>>> <xsl:function name="eg:value-except" as="xs:anyAtomicType*">
> >>>> <xsl:param name="arg1" as="xs:anyAtomicType*"/>
> >>>> <xsl:param name="arg2" as="xs:anyAtomicType*"/>
> >>>> <xsl:sequence
> >>>> select="fn:distinct-values($arg1[not(.=$arg2)])"/>
> >>>> </xsl:function>
> >>>>
> >>>> Which is in https://www.w3.org/TR/xpath-functions-31/#other-functions (Appendix D).
> >>>>
> >>>> So basically
> >>>>
> >>>> distinct-values($jpeg_few[not(. = $jpeg_many)])
> >>>>
> >>>> Should give you the answer you seek.
> >>>>
> >>>> I agree with Mike that being obsessive about putting data
types on all variables and function return values (and templates when the
templates should return atomic types or specific element types) will help a
lot.
> >>>>
> >>>> If your code is working without types but failing with them it
means your code is "working" but probably not for the reasons you think.
> >>>>
> >>>> Working carefully through the stages of the expressions by
setting each intermediate result into a variable will help a lot.
> >>>>
> >>>> I also depend heavily on using messages to test my assumptions.
> >>>>
> >>>> For example, I might do something like:
> >>>>
> >>>> <xsl:message>+ [DEBUG] jpeg_few={$jpeg_few => string-join(', ')}</xsl:message>
> >>>> <xsl:message>+ [DEBUG] jpeg_many={$jpeg_many => string-join(', ')}</xsl:message>
> >>>>
> >>>> Or if those lists are very long, use count() or get the first
n items or whatever to make it clear that you're working with the values you
think you are.
> >>>>
> >>>> Also, remember that <xsl:value-of> ({} in string result
contexts) is different from <xsl:sequence>, which returns the actual value,
not a string representation.
> >>>>
> >>>> For example, given a variable that is an attribute node,
value-of will return the string value of the attribute but xsl:sequence will
return the attribute node and Saxon will serialize it as <attribute
name="foo" value="bar"> (or something similar to that).
> >>>>
> >>>> It's easy to accidentally create a sequence of attributes when
what you wanted was a sequence of strings (or vice versa) and using
xsl:value-of can obscure that mistake.
> >>>>
> >>>> I've also started using the explicit casting of values that XQuery
requires, even though XSLT usually lets you get away with implicit
casting, because it makes it clearer to me what my intent was (and makes it
easier to copy XPath expressions into XQuery, if that's something you need to
do).
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Eliot
> >>>> --
> >>>> Eliot Kimber
> >>>> http://contrext.com
> >>>>
> >>>>
> >>>> On 10/14/18, 3:53 PM, "Dave Lang
emaildavelang(_at_)gmail(_dot_)com"
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
> >>>>
> >>>>> That error can only come from an expression that calls
tokenize(). It's therefore clearly not your declaration of
jpgs_in_xml_not_directories that's at fault.
> >>>>
> >>>> Fair enough - but when I run the transformation without that
declaration
> >>>> everything works fine. Is there something I can do to the
variables that
> >>>> are included in it to make the declaration work?
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Wendell Piez | http://www.wendellpiez.com
> >>> XML | XSLT | electronic publishing
> >>> Eat Your Vegetables
> >>> _____oo_________o_o___ooooo____ooooooo_^
> >>>
> >>
> >>
> >>
> >> --
> >> Dave Pawson
> >> XSLT XSL-FO FAQ.
> >> Docbook FAQ.
> >>
> >
> >
> >
> >
>
>
>
>
>
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--