xsl-list

Re: [xsl] String hashing code

2007-12-14 01:08:39
I don't have an XSLT solution. With Saxon, I handle a similar problem with an extension function that returns the MD5 hash of the serialized content. The source code is below. I call it from XSLT like this:

           <xsl:variable name="serialized_content">
               <xsl:value-of select="saxon:serialize(current-group()[1],'')"/>
           </xsl:variable>
           <xsl:variable name="hash">
               <xsl:value-of select="md5:md5($serialized_content)"/>
           </xsl:variable>

--- file Md5.java ---
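import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/* Saxon extension for generating hash values of serialized content. */

public class Md5 {
    /* Format each byte as two uppercase hex digits. */
    public static String hex(byte[] array) {
        StringBuilder sb = new StringBuilder(array.length * 2);
        for (int i = 0; i < array.length; ++i) {
            // OR-ing with 0x100 forces three hex digits; dropping the first keeps the leading zero.
            sb.append(Integer.toHexString((array[i] & 0xFF) | 0x100).toUpperCase().substring(1, 3));
        }
        return sb.toString();
    }

    /* Return the MD5 digest of the message as a 32-character hex string. */
    public static String md5(String message) throws NoSuchAlgorithmException, UnsupportedEncodingException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // Note: characters outside CP1252 cannot be encoded faithfully, so inputs that
        // differ only in such characters may hash identically; "UTF-8" would avoid that.
        return hex(md.digest(message.getBytes("CP1252")));
    }
}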
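
To check the hashing logic outside Saxon, something like the following stand-alone sketch may help. The class name Md5Demo and the method md5Utf8 are hypothetical and not part of the extension above. It also assumes UTF-8 rather than CP1252, so its output matches the extension only for input that both encodings represent with the same bytes, such as plain ASCII. The two strings in main are the standard MD5 test inputs from RFC 1321.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-alone check; not part of the Saxon extension itself.
final class Md5Demo {

    // Two uppercase hex digits per byte, the same output format as Md5.hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Assumption: UTF-8 is used here so that every character contributes to the hash.
    static String md5Utf8(String message) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        return hex(md.digest(message.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // RFC 1321 test inputs.
        System.out.println(md5Utf8(""));    // D41D8CD98F00B204E9800998ECF8427E
        System.out.println(md5Utf8("abc")); // 900150983CD24FB0D6963F7D28E17F72
    }
}
```

If the two printed values match the comments, the digest and hex formatting behave as expected, and the same methods can be applied to a serialized node or a document URI.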


Deborah Pickett wrote:
A challenge to the XSLT demigods...

I am processing a number of separate XML documents using an Ant <xslt>
task, pulling out the MathML that is embedded inside them into their own
XML files using xsl:result-document (where I render them using Batik).
I want to make sure that the result document names don't clash, but
because they are across several source files, generate-id() isn't going
to suffice.  There are thousands of source files, all with
English-sounding names spread across many directories.

I was thinking of hashing document-uri(/) to produce a probably-unique
string that I can then append generate-id(.) to.  I rejected
encode-for-uri() as producing strings that are too long, and for not
anonymizing the document uri enough.  All the hashing algorithms I know
(MD5, for instance) happen to be heavy on bitwise operations, and I feel
dirty doing bitwise operations with arithmetic.

I prefer not to escape to non-XSLT, because I am providing this as part
of a library that needs to run on almost any XSLT 2.0 platform.

Any clever ideas?

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
