Re: [xsl] Diffing XML

2012-10-25 11:01:30
Then I can possibly deep-equal my way through the nested structures to be a 
bit more precise about where the changes are.


As I said in my previous message, the tool I provided you with finds
all the (detailed) differences between the two XML documents. It can
also be run with nothing more than an XSLT 1.0 processor.


Cheers,
Dimitre



On Thu, Oct 25, 2012 at 8:40 AM, Emma Burrows 
<Emma(_dot_)Burrows(_at_)rpharms(_dot_)com> wrote:
Yes, that's how I thought it would be useful. They don't update every chapter 
every month, so this would just narrow things down. Then I can possibly 
deep-equal my way through the nested structures to be a bit more precise 
about where the changes are. When I'm feeling strong, I'll think about 
finding out what is different too!

The only issue I can see looming is that it'll be easy to work out what's new 
since I'll be processing the "new" document (no "old" topic with the same id 
= new record). I'll have to think about how to find out if something has been 
deleted too; probably easy enough for topics by comparing the list of ids for 
the two versions, but a bit more tricky for id-less elements.
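
The id comparison described above can be sketched as a pair of set differences. This is only an illustration in Python rather than XSLT, and the element name `topic`, the attribute name `id`, and the sample documents are assumptions made for the example, not taken from the real data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; the real files would be loaded from disk.
OLD = """<doc>
  <topic id="t1"><title>First</title></topic>
  <topic id="t2"><title>Second</title></topic>
</doc>"""

NEW = """<doc>
  <topic id="t2"><title>Second</title></topic>
  <topic id="t3"><title>Third</title></topic>
</doc>"""


def topic_ids(xml_text):
    """Collect the id attribute of every <topic> element that has one."""
    root = ET.fromstring(xml_text)
    return {t.get("id") for t in root.iter("topic") if t.get("id") is not None}


def added_and_deleted(old_xml, new_xml):
    """Return (ids only in the new version, ids only in the old version)."""
    old_ids, new_ids = topic_ids(old_xml), topic_ids(new_xml)
    return sorted(new_ids - old_ids), sorted(old_ids - new_ids)


added, deleted = added_and_deleted(OLD, NEW)
print("added:", added)
print("deleted:", deleted)
```

This covers only elements that carry an id; id-less elements would still need some positional or content-based matching.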

I seem to have created myself a nice big job to do. ;)

Emma


-----Original Message-----
From: Michael Kay [mailto:mike(_at_)saxonica(_dot_)com]
Sent: 24 October 2012 16:28
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Diffing XML

One of the drawbacks of deep-equal is that it doesn't tell you where the 
differences are; another is that it doesn't give you any control over how to 
do the comparison (Saxon's saxon:deep-equal() variant tries to remedy both 
problems). But it can certainly be used as part of the solution, by quickly 
eliminating subtrees that don't need to be examined any further.
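
A minimal sketch of that pruning idea, written in Python for illustration rather than XSLT. The `deep_equal` function here is a rough stand-in and not the XPath `deep-equal()` function itself: it assumes that surrounding whitespace is insignificant and that children can be paired up by position, and all function names and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_equal(a, b):
    """Compare two elements without looking inside their children."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "").strip() == (b.text or "").strip()
        and (a.tail or "").strip() == (b.tail or "").strip()
        and len(a) == len(b)
    )


def deep_equal(a, b):
    """True when the two subtrees match element by element, in order."""
    return local_equal(a, b) and all(deep_equal(x, y) for x, y in zip(a, b))


def changed_paths(old, new, path):
    """Return paths of the deepest paired elements where a difference appears."""
    if deep_equal(old, new):
        return []  # identical subtree: nothing below needs examining
    if not local_equal(old, new):
        return [path]  # the difference is at this element itself
    hits = []
    seen = {}  # per-name counter, so steps read like XPath positions
    for o, n in zip(old, new):
        seen[n.tag] = seen.get(n.tag, 0) + 1
        hits.extend(changed_paths(o, n, f"{path}/{n.tag}[{seen[n.tag]}]"))
    return hits


old_topic = ET.fromstring(
    "<topic id='t1'><title>Topic Title</title><p>One</p><p>Two</p><p>Three</p></topic>"
)
new_topic = ET.fromstring(
    "<topic id='t1'><title>Topic Title</title><p>One</p><p>Two</p><p>Three, revised</p></topic>"
)
result = changed_paths(old_topic, new_topic, "/topic")
print(result)
```

Pairing children by position breaks down as soon as an element is inserted or removed, because everything after it appears changed; the sketch stops at the parent in that case and leaves alignment as a separate problem.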

Michael Kay
Saxonica

On 24/10/2012 15:27, Emma Burrows wrote:
Thinking about it further - I'm wondering whether something like deep-equal 
might suffice. What the users apparently really want right now is to know 
which parts of the document have changed so they can concentrate on those 
when checking the output on a website. In which case, starting with 
top-level elements and iterating my way down through the children, I could 
in theory at the very least output "Something has changed in <p> number 3 in 
the topic entitled 'Topic Title'".

I realise there are many pitfalls ahead and of course the minute they
see it, they will say "oh, but can't you make it do X?", but if I can
convince them they don't need to know exactly what has changed (I'm an
optimist), that might help. Or is there an even better way? (Assuming
one were daft enough to take on such a project :)

Emma


-----Original Message-----
From: Emma Burrows [mailto:Emma(_dot_)Burrows(_at_)rpharms(_dot_)com]
Sent: 24 October 2012 14:48
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: RE: [xsl] Diffing XML

Thanks Michael,

Thanks for the response. Yes, I'm thinking doing it entirely myself might be a 
bit too ambitious. The data is relatively stable at this point and gets 
updated once a month which should theoretically reduce the number of things 
to check for each time. But even so, I can tell diffing is an art.

DeltaXML does seem to offer some interesting options and it could probably 
be integrated into our CMS (given a chisel and a mallet - the CMS is getting 
a bit old), but I don't think we have any budget to buy another tool and it 
sounds as if the users have some very specific requirements (like exporting 
the list of user-friendly differences to an Excel spreadsheet!). So I was 
looking for ideas about how to tackle the problem just in case I do indeed 
need to implement it!

Emma


-----Original Message-----
From: Michael Kay [mailto:mike(_at_)saxonica(_dot_)com]
Sent: 24 October 2012 11:48
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Diffing XML

In general differencing well is quite a challenge, e.g. handling an 
arbitrary number of inserted elements in either document, addition or 
removal of "div" layers, combining/splitting of paragraphs, reformatted 
indentation, etc. Doing it better than a general-purpose product such as 
DeltaXML could turn out to be a project that will keep you busy for a while.

Michael Kay
Saxonica

On 24/10/2012 11:36, Emma Burrows wrote:
I have a requirement to produce an end-user-readable "checklist" of all the 
places where an XML file has changed since the last version, with custom 
explanations of what each difference is. I'm able to run diffs which are 
fine for my own purposes, but the end users need the differences spelled 
out more precisely in plain language (eg: "there is an extra paragraph 
here", "the text 'xyz' has changed", "the attribute 'audience' has been 
changed to 'book'" etc).

Being an XSLT developer, I'm thinking of using an XSLT stylesheet to work 
on the "new" version of the file, document() in the "old" version, and then 
compare the nodes in the "new" version to those in the "old" version, 
generating appropriate messages into an HTML output as I go along.
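
To make the message-generation step concrete, here is a small sketch in Python (not XSLT) of turning one pair of already-matched elements into plain-language messages and an HTML list. How elements get matched between the two versions is left out entirely. The element names `section` and `p`, the function names, and the sample values other than 'audience' and 'book' are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape


def describe(old, new):
    """Return plain-language messages for one pair of matched elements."""
    messages = []
    # Attribute changes, reported in alphabetical order of attribute name.
    for name in sorted(set(old.attrib) | set(new.attrib)):
        before, after = old.get(name), new.get(name)
        if before is None:
            messages.append(f"the attribute '{name}' has been added")
        elif after is None:
            messages.append(f"the attribute '{name}' has been removed")
        elif before != after:
            messages.append(f"the attribute '{name}' has been changed to '{after}'")
    # Leading text of the element, ignoring surrounding whitespace.
    old_text, new_text = (old.text or "").strip(), (new.text or "").strip()
    if old_text != new_text:
        messages.append(f"the text '{old_text}' has changed")
    # Count of direct <p> children (hypothetical paragraph element).
    extra = len(new.findall("p")) - len(old.findall("p"))
    if extra == 1:
        messages.append("there is an extra paragraph here")
    elif extra > 1:
        messages.append(f"there are {extra} extra paragraphs here")
    return messages


def to_html(messages):
    """Render the messages as an HTML unordered list, escaping the text."""
    items = "".join(f"<li>{escape(m)}</li>" for m in messages)
    return f"<ul>{items}</ul>"


old = ET.fromstring('<section audience="web">xyz<p>First</p></section>')
new = ET.fromstring('<section audience="book">abc<p>First</p><p>Second</p></section>')
messages = describe(old, new)
print(to_html(messages))
```

Keeping the messages as a plain list before rendering means the same list could feed a different output format later.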

Does that sound like a reasonable approach? Are there existing tools
or examples that might do what I'm after? Any recommendations on the
best way of comparing individual nodes? I am planning to do this in
Oxygen 14 so the world is my oyster as far as XSLT is concerned. :)

Just looking for general suggestions to point me in the right direction. 
Thanks!


______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit http://www.symanteccloud.com
______________________________________________________________________

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



-- 
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
