RE: [xsl] RE: String conversion problem when string is large

2012-03-22 09:35:14
-----Original Message-----
From: Scott Trenda [mailto:Scott(_dot_)Trenda(_at_)oati(_dot_)net]
Sent: Wednesday, March 21, 2012 6:48 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] RE: String conversion problem when string is large

Kevin,

The problem with your chunked template is that it isn't
really bisection. You don't get true O(log n) performance -
you only get O(n / x) performance, x being dependent upon
your chunk size. Your template would need to be recursive to
get the real benefit of bisection. Here it is:

<xsl:template name="HexToDec">
  <xsl:param name="HexData" />
  <xsl:param name="Hex" select="'0123456789ABCDEF'" />
  <xsl:choose>
    <xsl:when test="contains($HexData, ',')">
      <xsl:variable name="midpoint"
          select="floor(string-length($HexData) div 2) - 1" />
      <xsl:variable name="half1" select="substring($HexData, 1, $midpoint)" />
      <xsl:variable name="half2" select="substring($HexData, $midpoint + 1)" />
      <xsl:call-template name="HexToDec">
        <xsl:with-param name="HexData"
            select="concat($half1, substring-before($half2, ','))" />
        <xsl:with-param name="Hex" select="$Hex" />
      </xsl:call-template>
      <xsl:text>,</xsl:text>
      <xsl:call-template name="HexToDec">
        <xsl:with-param name="HexData" select="substring-after($half2, ',')" />
        <xsl:with-param name="Hex" select="$Hex" />
      </xsl:call-template>
    </xsl:when>
    <xsl:when test="starts-with($HexData, '0x')">
      <xsl:value-of
          select="string-length(substring-before($Hex, substring($HexData, 3, 1))) * 16
                  + string-length(substring-before($Hex, substring($HexData, 4, 1)))" />
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$HexData" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Try it out, please - I'm actually very curious to see how the
performance and memory usage stand up against the other
templates in the various processors. You should be able to
use it with any processor now, too.

~ Scott

I liked your idea and did look at your code.  I confess that when I looked at
your binary splitter, I decided to take the low road and get a solution faster
rather than a more elegant one.  When I started to rewrite it and model the
results on paper with a small data set, I had a hunch that the edge cases could
take time.  I'd been held up by the crashes on the Windows systems for too long
and needed to move on (read: management breathing down my neck), and I didn't
have a Java engine on those systems (won't go into why, but maybe that will
change eventually now).  The chunker I wrote in minutes, and it worked out of
the box.  That said, I'm tempted to follow up on this idea since it is tighter,
and since I've just realized that with a very minor change, either your version
or mine could change the "Hex" to "AnyBase" without much more effort (not that
I need such a beastie).  This conversion isn't done frequently.  I appreciate
your follow-up, so...

With saxonica:

Execution time: 1.152s (1152ms)
Memory used: 19613968

I'm not sure what's going on with saxonica's memory reporting.  Repeat runs
vary between 15895712 and 19613968.  Of course, it strikes me as interesting
that I'm using 15-20MB (or multiple GB in the beginning) to convert a 1MB file.

Xsltproc: 2.23s
Sablotron: 7.21s

Of course, it also is a bit broken.

awk 'BEGIN { FS=","; } /AppCompatCache/ { print $9 " vs " NF-9 }' idiffout.saxonhe
53392239 vs 53391
53392239 vs 53391
-----^
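It drops commas.  In the awk output above, field 9 should read 53392, but it
has run together with the first value after it (239), which leaves NF-9 one
short at 53391.  It should have built a file of size 260537, but instead it
built one of size 260512.  It has trouble with some midpoint issues.  For
example, see how it dropped a comma: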

-11,32768,2147483650,0,SYSTEM\ControlSet002\Services\lanmanserver\parameters,0,Guid,3,16,26,35,206,210,171,128,91,74,159,229,42,166,232,8,248,218
+11,32768,2147483650,0,SYSTEM\ControlSet002\Services\lanmanserver\parameters,0,Guid,3,1626,35,206,210,171,128,91,74,159,229,42,166,232,8,248,218

I'll certainly hang on to the idea here, though I probably will not iron out 
the bugs immediately.
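
For whenever I do get back to it, here is one midpoint edge case that can be
read off the template itself.  I have not confirmed that it is the one behind
the dropped commas above.  If the second half contains no comma,
substring-before() and substring-after() both return an empty string, so the
tail of the data is lost.  Tracing '5,0x1A' by hand through the original, I get
'5,,'.  The sketch below is a hypothetical variant (the name HexToDecGuarded
and the wrapper stylesheet are mine, not part of the posted code) that falls
back to the first comma of the whole string in that case:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical variant of HexToDec; only the choice of $left is new. -->
  <xsl:template name="HexToDecGuarded">
    <xsl:param name="HexData"/>
    <xsl:param name="Hex" select="'0123456789ABCDEF'"/>
    <xsl:choose>
      <xsl:when test="contains($HexData, ',')">
        <xsl:variable name="midpoint"
            select="floor(string-length($HexData) div 2) - 1"/>
        <xsl:variable name="half1" select="substring($HexData, 1, $midpoint)"/>
        <xsl:variable name="half2" select="substring($HexData, $midpoint + 1)"/>
        <!-- Everything before the comma we split on. -->
        <xsl:variable name="left">
          <xsl:choose>
            <xsl:when test="contains($half2, ',')">
              <xsl:value-of
                  select="concat($half1, substring-before($half2, ','))"/>
            </xsl:when>
            <!-- No comma in the second half: split at the first comma. -->
            <xsl:otherwise>
              <xsl:value-of select="substring-before($HexData, ',')"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:variable>
        <xsl:call-template name="HexToDecGuarded">
          <xsl:with-param name="HexData" select="string($left)"/>
          <xsl:with-param name="Hex" select="$Hex"/>
        </xsl:call-template>
        <xsl:text>,</xsl:text>
        <!-- Everything after that comma (skip $left plus the comma). -->
        <xsl:call-template name="HexToDecGuarded">
          <xsl:with-param name="HexData"
              select="substring($HexData, string-length($left) + 2)"/>
          <xsl:with-param name="Hex" select="$Hex"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="starts-with($HexData, '0x')">
        <xsl:value-of
            select="string-length(substring-before($Hex, substring($HexData, 3, 1))) * 16
                    + string-length(substring-before($Hex, substring($HexData, 4, 1)))"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$HexData"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Demo driver: run against any XML input document. -->
  <xsl:template match="/">
    <xsl:call-template name="HexToDecGuarded">
      <xsl:with-param name="HexData" select="'5,0x1A'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

With the demo input this should print 5,26.  Both pieces are always shorter
than the string they came from, so the recursion still terminates.  The
fallback branch only peels off one token, so it gives up the bisection benefit
for that one step.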

This just goes to show what I said before... what an adventure.  I love 
learning.
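
For the record, the "AnyBase" change mentioned earlier might look roughly like
the sketch below.  It is untested against the real data, and the template name
TwoDigitsToDec and the parameters Token and Digits are made up for
illustration.  The only real change is that the multiplier 16 becomes the
length of the digit alphabet:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: converts a two-digit token in the base implied
       by the length of $Digits (16 for the default alphabet). -->
  <xsl:template name="TwoDigitsToDec">
    <xsl:param name="Token"/>
    <xsl:param name="Digits" select="'0123456789ABCDEF'"/>
    <xsl:variable name="base" select="string-length($Digits)"/>
    <!-- A digit's value is the number of characters before it in $Digits. -->
    <xsl:value-of
        select="string-length(substring-before($Digits, substring($Token, 1, 1))) * $base
                + string-length(substring-before($Digits, substring($Token, 2, 1)))"/>
  </xsl:template>

  <!-- Demo driver: run against any XML input document. -->
  <xsl:template match="/">
    <xsl:call-template name="TwoDigitsToDec">
      <xsl:with-param name="Token" select="'1A'"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="TwoDigitsToDec">
      <xsl:with-param name="Token" select="'17'"/>
      <xsl:with-param name="Digits" select="'01234567'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

This should print 26,15 (hex 1A, then octal 17).  One caveat that applies to
the hex version too: a digit that is not in the alphabet, such as a lowercase
'a', makes substring-before() return an empty string and so silently counts
as 0.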

Kevin Bulgrien
