xsl-list

Re: [xsl] is there a way to hash an element?

2016-06-13 05:12:18

> The matching rules are that A and B are considered to be the same if and
> only if they have the same descendant elements in the same document
> order with each element in A having associated with it the same
> attributes and attribute values as the corresponding element in B.
> (There aren't any descendant text nodes, comments, or processing
> instructions.)


There are a few gaps in that spec, e.g. it doesn't mention element names, but I 
think you could do something like this:

<!-- Position-weighted sum of the codepoints of a string -->
<xsl:function name="f:hash-of-string" as="xs:integer">
  <xsl:param name="in" as="xs:string"/>
  <xsl:sequence select="sum(for $i in 1 to string-length($in)
      return $i * string-to-codepoints(substring($in, $i, 1)))"/>
</xsl:function>

<!-- Combines the element's name, its attributes, and its children -->
<xsl:function name="f:hash-of-element" as="xs:integer">
  <xsl:param name="in" as="element()"/>
  <xsl:sequence select="f:hash-of-attributes($in/@*) + f:hash-of-children($in/*)
      + f:hash-of-string(local-name($in))"/>
</xsl:function>

<!-- Attribute order is not significant, so the terms are simply summed -->
<xsl:function name="f:hash-of-attributes" as="xs:integer">
  <xsl:param name="in" as="attribute()*"/>
  <xsl:sequence select="sum(for $a in $in return
      f:hash-of-string(local-name($a)) * f:hash-of-string(string($a)))"/>
</xsl:function>

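<!-- Child order is significant, so each child's hash is weighted by position -->
<xsl:function name="f:hash-of-children" as="xs:integer">
  <xsl:param name="in" as="element()*"/>
  <xsl:sequence select="sum(for $i in 1 to count($in) return
      f:hash-of-element($in[$i]) * $i)"/>
</xsl:function>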

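<!-- Hashes only the descendants; the top-level element's own name and
     attributes are not part of the matching rule -->
<xsl:function name="f:top-level-hash" as="xs:integer">
  <xsl:param name="in" as="element()"/>
  <xsl:sequence select="f:hash-of-children($in/*)"/>
</xsl:function>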

Of course, it's not guaranteed that if two elements have the same hash value,
then they are "the same". So after grouping by hash value, you'll need to
compare the members of each group pairwise (an n^2 operation per group) using
deep-equal() (or a custom replacement) to eliminate false friends.
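
The group-then-verify step could be sketched roughly as follows. This is only an illustration under stated assumptions: the candidate elements are assumed to be the children of the document element, and the f: namespace URI, the f:same-descendants function, and the "distinct" wrapper element are hypothetical names that do not come from the original question. The hash functions are repeated so the stylesheet runs standalone with an XSLT 2.0 processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:hash"
    exclude-result-prefixes="xs f">

  <!-- Assumption: whitespace-only text nodes are just indentation, so drop
       them; otherwise deep-equal() would take them into account. -->
  <xsl:strip-space elements="*"/>

  <xsl:function name="f:hash-of-string" as="xs:integer">
    <xsl:param name="in" as="xs:string"/>
    <xsl:sequence select="sum(for $i in 1 to string-length($in)
        return $i * string-to-codepoints(substring($in, $i, 1)))"/>
  </xsl:function>

  <xsl:function name="f:hash-of-element" as="xs:integer">
    <xsl:param name="in" as="element()"/>
    <xsl:sequence select="f:hash-of-attributes($in/@*) + f:hash-of-children($in/*)
        + f:hash-of-string(local-name($in))"/>
  </xsl:function>

  <xsl:function name="f:hash-of-attributes" as="xs:integer">
    <xsl:param name="in" as="attribute()*"/>
    <xsl:sequence select="sum(for $a in $in return
        f:hash-of-string(local-name($a)) * f:hash-of-string(string($a)))"/>
  </xsl:function>

  <xsl:function name="f:hash-of-children" as="xs:integer">
    <xsl:param name="in" as="element()*"/>
    <xsl:sequence select="sum(for $i in 1 to count($in) return
        f:hash-of-element($in[$i]) * $i)"/>
  </xsl:function>

  <xsl:function name="f:top-level-hash" as="xs:integer">
    <xsl:param name="in" as="element()"/>
    <xsl:sequence select="f:hash-of-children($in/*)"/>
  </xsl:function>

  <!-- Hypothetical exact test: same child subtrees, in the same order.
       Note that deep-equal() compares full QNames, not just local names. -->
  <xsl:function name="f:same-descendants" as="xs:boolean">
    <xsl:param name="a" as="element()"/>
    <xsl:param name="b" as="element()"/>
    <xsl:sequence select="deep-equal($a/*, $b/*)"/>
  </xsl:function>

  <xsl:template match="/*">
    <distinct>
      <xsl:for-each-group select="*" group-by="f:top-level-hash(.)">
        <xsl:variable name="bucket" select="current-group()"/>
        <!-- Within one hash bucket, keep an element only if no earlier
             member of the bucket passes the exact test against it. -->
        <xsl:for-each select="$bucket">
          <xsl:variable name="this" select="."/>
          <xsl:variable name="earlier"
              select="subsequence($bucket, 1, position() - 1)"/>
          <xsl:if test="not(some $e in $earlier
                            satisfies f:same-descendants($e, $this))">
            <xsl:copy-of select="."/>
          </xsl:if>
        </xsl:for-each>
      </xsl:for-each-group>
    </distinct>
  </xsl:template>

</xsl:stylesheet>
```

The exact comparison matters because collisions are easy to produce with a hash this simple: for instance, f:hash-of-string('ab') is 1*97 + 2*98 = 293 and f:hash-of-string('ca') is 1*99 + 2*97 = 293. The pairwise check only runs inside each bucket, so the quadratic cost applies to the bucket size rather than to the whole input.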



Michael Kay
Saxonica
  