Hi Ian,
Diffing normalized text output is a good approach in my experience.
However, if the 4.1 structures differ significantly from 1.7 as you say,
it might be a good idea to transform the 4.1 output back to 1.7 prior to
the diff. Or maybe not "transform it back to match the input exactly",
but only to such a degree that the text files will be the same if no
content was lost or duplicated.
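
To illustrate the idea, here is a rough sketch of such a comparison in Python, not a finished tool. It assumes that both sides are already available as well-formed XML (the SGML source would first have to be parsed or converted by some other means), and the function names normalized_words and text_diff are made up for this example. Any "transform back" step for content that legitimately moves or changes between 1.7 and 4.1 would have to happen before this comparison.

```python
import difflib
import xml.etree.ElementTree as ET


def normalized_words(xml_text):
    """Return the text content of an XML document as a list of words."""
    root = ET.fromstring(xml_text)
    # itertext() yields all text nodes in document order, ignoring markup.
    # Joining with a space treats every element boundary as a word break,
    # so markup in the middle of a word would split that word.
    return " ".join(root.itertext()).split()


def text_diff(source_xml, result_xml):
    """Return unified-diff lines for the two word lists (empty if equal)."""
    return list(difflib.unified_diff(
        normalized_words(source_xml),
        normalized_words(result_xml),
        fromfile="source",
        tofile="result",
        lineterm="",
        n=2,  # two words of context around each difference
    ))
```

An empty result means that the word sequences are identical. Lines starting with "-" are words found only in the source (possibly lost content), and lines starting with "+" are words found only in the result (possibly duplicated or generated content).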
Gerrit
On 22.01.2021 12:28, ian(_dot_)proudfoot(_at_)itp-x(_dot_)co(_dot_)uk wrote:
Hi everyone,
I am working on a project to convert several thousand SGML files (S1000D
1.7) into a more recent XML version (S1000D 4.1). My finished XSLT
stylesheet does the job expected of it. However, during development I
ran into a problem where an error in the stylesheet allowed the output
to pass schema validation while omitting some content! For me
that’s very bad news and I was lucky to notice it. Ultimately the final
output will be verified by the subject matter experts, but I really
don’t want to give them any reason to doubt the reliability of the
conversion.
This got me thinking about ways to verify the output text content
against the input despite significantly different structure. Is there an
established way to do that? If so, what is it called and how well does it
work?
Perhaps it’s something that I should build into the XSLT as it is
written? Or perhaps it could be run as a post-processing batch comparison
operation?
My initial thought is to output normalized text from input and output
and compare the resulting text files…
I’ve searched the archives, but I probably don’t know the correct
terminology to get any useful results…
Thanks in advance for all responses.
Ian
Ian Proudfoot
Bembridge
Isle of Wight
United Kingdom