xsl-list
[Top] [All Lists]

Re: [xsl] Modeling matrices in an XML environment

2020-07-13 13:25:50
Would the Content MathML spec help here?

On Mon, Jul 13, 2020 at 10:25 AM David Birnbaum djbpitt(_at_)gmail(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Dear XSL-list,

Roger's question invites a tangential one (for which reason I have changed 
the subject line): How might we usefully think about representing matrices in 
an XML environment?

This question is tangential because Roger's original question assumed his 
hierarchical representation (/matrix/row/col), and asked about how to operate 
over it efficiently. But because over the years I have represented matrices 
in XML in three different ways, with no particular insights into how to 
compare them from an engineering perspective, I would be grateful for any 
thoughts other readers of this list might have about that comparison. I am 
not asking about which implementation will be the quickest when performing a 
particular operation because, as we are often reminded on this list, the only 
way to answer that question (aside from the obvious, like avoiding repeating 
an assignment unnecessarily inside a for-each, using keys where appropriate, 
etc.) is to time the results. But a more general engineering question might 
be how the following alternative representations might differ in the 
operations they allow or support or facilitate:

/matrix/row/col This was Roger's representation, and it mirrors the one found 
in HTML tables.
A sequence of <cell row="m" col="n"> elements. This is economical for sparse 
matrices (not relevant for the matrix addition about which Roger inquired, 
but harmless there, and economical in other, sparse contexts), and the values 
can be accessed efficiently with composite keys (where those are supported, 
e.g., Saxon PE or EE), or by simulating composite keys (likely to be less 
efficient, since it requires composing the simulated key within invocations 
of the key() function).
Nested arrays, e.g., [ [1, 2], [2, 4] ]. This mirrors a common representation 
of matrices in other programming languages, e.g., as nested lists in Python.


I used #2 in 
https://archive.xmlprague.cz/2020/files/xmlprague-2020-proceedings.pdf and I 
am using #3 for matrix transposition and dot-product operations in my 
upcoming Balisage presentation (see 
https://www.balisage.net/2020/Program.html). As a general model, #2 may be 
most appropriate to the data because it does not implement a hierarchy where 
one may not be inherent in the meaning of the data, that is, it does not 
regard columns as subordinate to rows or rows as subordinate to columns. 
After all, matrix addition does not depend on the subordination of rows to 
columns or vice versa, and dot product operations associate rows in one 
matrix with columns in the other.

I wrote about table models in 2007 (see 
http://conferences.idealliance.org/extreme/html/2007/Birnbaum01/EML2007Birnbaum01.html),
 but that was before arrays were part of the XML environment, and my focus 
then was primarily on the fit between the model and the reality, and less on 
how different models might encourage or encumber different types of 
operations. The addition of arrays to our arsenal invites a reconsideration 
of both the modeling and the engineering question.

If the only differences are likely to be finding the model that is most 
performative with the XSLT engine at my disposal (with the complication that 
that engine may incorporate different optimizations today than it will at its 
next release), then a decision might come down only to timing the different 
options. But aside from actual timing differences with a specific XSLT 
engine, are there inherent differences in what these models do and do not 
support that might encourage us to favor one over another? For what it's 
worth, I find the imposed hierarchy of #1 and #3 artificial and distracting, 
and it took me longer to work out the logic for operations over #3 than it 
did over #2. But I have no professional expertise in either computer science 
or software engineering, so I would welcome insights from those who might 
notice considerations that I overlook.

Best,

David

On Mon, Jul 13, 2020 at 5:19 AM Martin Honnen 
martin(_dot_)honnen(_at_)gmx(_dot_)de 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Am 12.07.2020 um 23:53 schrieb Martin Honnen 
martin(_dot_)honnen(_at_)gmx(_dot_)de:
On 12.07.2020 22:47, Dr. Roger L Costello costello(_at_)mitre(_dot_)org 
wrote:
Hi Folks,

My application uses lots of matrix operations. I used the SAXON Tool
Profile (-TP) option to generate a web page that shows the performance
of each of my matrix operations. I found that my matrix addition
function is taking an appallingly large amount of time. Below is my
matrix addition function. Do you have suggestions on ways to improve
its performance?  /Roger

And of course in XPath 3/XSLT 3 there is the "for-each-pair" function
made for tasks like this but I have no idea whether that would perform
better. And to write it solely as a function expression you need XQuery
or the XPath 4 extension function introduced in Saxon 10 to create
elements.


Here is an XSLT function uses for-each-pair twice together with
saxon:new-element to construct the row and col elements:


     <xsl:function name="matrix:addition" as="element(Matrix)"
visibility="public">
         <xsl:param name="M" as="element(Matrix)"/>
         <xsl:param name="N" as="element(Matrix)"/>
         <Matrix>
             <xsl:sequence
                 select="
                     for-each-pair(
                       $M/row,
                       $N/row,
                       function ($row1, $row2) {
                         saxon:new-element(
                           'row',
                           for-each-pair(
                             $row1/col,
                             $row2/col,
                             function ($col1, $col2) {
                               saxon:new-element('col', $col1 + $col2)
                             }
                           )
                         )
                       }
                     )"
             />
         </Matrix>
     </xsl:function>



Needs Saxon 10 PE or EE and I haven't tested whether it performs any
better than the previous suggestions.

XSL-List info and archive
EasyUnsubscribe (by email)
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--

<Prev in Thread] Current Thread [Next in Thread>