Would the Content MathML spec help here?
On Mon, Jul 13, 2020 at 10:25 AM David Birnbaum djbpitt(_at_)gmail(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Dear XSL-list,
Roger's question invites a tangential one (for which reason I have
changed the subject line): How might we usefully think about representing
matrices in an XML environment?
This question is tangential because Roger's original question assumed
his hierarchical representation (/matrix/row/col), and asked about how to
operate over it efficiently. But because over the years I have represented
matrices in XML in three different ways, with no particular insights into
how to compare them from an engineering perspective, I would be grateful
for any thoughts other readers of this list might have about that
comparison. I am not asking about which implementation will be the quickest
when performing a particular operation because, as we are often reminded on
this list, the only way to answer that question (aside from the obvious,
like avoiding repeating an assignment unnecessarily inside a for-each,
using keys where appropriate, etc.) is to time the results. But a more
general engineering question might be how the following alternative
representations might differ in the operations they allow or support or
facilitate:
/matrix/row/col This was Roger's representation, and it mirrors the one
found in HTML tables.
A sequence of <cell row="m" col="n"> elements. This is economical for
sparse matrices (not relevant for the matrix addition about which Roger
inquired, but harmless there, and economical in other, sparse contexts),
and the values can be accessed efficiently with composite keys (where those
are supported, e.g., Saxon PE or EE), or by simulating composite keys
(likely to be less efficient, since it requires composing the simulated key
within invocations of the key() function).
Nested arrays, e.g., [ [1, 2], [2, 4] ]. This mirrors a common
representation of matrices in other programming languages, e.g., as nested
lists in Python.
I used #2 in
https://archive.xmlprague.cz/2020/files/xmlprague-2020-proceedings.pdf
and I am using #3 for matrix transposition and dot-product operations in my
upcoming Balisage presentation (see
https://www.balisage.net/2020/Program.html). As a general model, #2 may
be most appropriate to the data because it does not implement a hierarchy
where one may not be inherent in the meaning of the data, that is, it does
not regard columns as subordinate to rows or rows as subordinate to
columns. After all, matrix addition does not depend on the subordination of
rows to columns or vice versa, and dot product operations associate rows in
one matrix with columns in the other.
I wrote about table models in 2007 (see
http://conferences.idealliance.org/extreme/html/2007/Birnbaum01/EML2007Birnbaum01.html),
but that was before arrays were part of the XML environment, and my focus
then was primarily on the fit between the model and the reality, and less
on how different models might encourage or encumber different types of
operations. The addition of arrays to our arsenal invites a reconsideration
of both the modeling and the engineering question.
If the only differences are likely to be finding the model that is most
performative with the XSLT engine at my disposal (with the complication
that that engine may incorporate different optimizations today than it will
at its next release), then a decision might come down only to timing the
different options. But aside from actual timing differences with a specific
XSLT engine, are there inherent differences in what these models do and do
not support that might encourage us to favor one over another? For what
it's worth, I find the imposed hierarchy of #1 and #3 artificial and
distracting, and it took me longer to work out the logic for operations
over #3 than it did over #2. But I have no professional expertise in either
computer science or software engineering, so I would welcome insights from
those who might notice considerations that I overlook.
Best,
David
On Mon, Jul 13, 2020 at 5:19 AM Martin Honnen 
martin(_dot_)honnen(_at_)gmx(_dot_)de <
xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Am 12.07.2020 um 23:53 schrieb Martin Honnen 
martin(_dot_)honnen(_at_)gmx(_dot_)de:
On 12.07.2020 22:47, Dr. Roger L Costello costello(_at_)mitre(_dot_)org 
wrote:
Hi Folks,
My application uses lots of matrix operations. I used the SAXON Tool
Profile (-TP) option to generate a web page that shows the
performance
of each of my matrix operations. I found that my matrix addition
function is taking an appallingly large amount of time. Below is my
matrix addition function. Do you have suggestions on ways to improve
its performance?  /Roger
And of course in XPath 3/XSLT 3 there is the "for-each-pair" function
made for tasks like this but I have no idea whether that would perform
better. And to write it solely as a function expression you need
XQuery
or the XPath 4 extension function introduced in Saxon 10 to create
elements.
Here is an XSLT function uses for-each-pair twice together with
saxon:new-element to construct the row and col elements:
     <xsl:function name="matrix:addition" as="element(Matrix)"
visibility="public">
         <xsl:param name="M" as="element(Matrix)"/>
         <xsl:param name="N" as="element(Matrix)"/>
         <Matrix>
             <xsl:sequence
                 select="
                     for-each-pair(
                       $M/row,
                       $N/row,
                       function ($row1, $row2) {
                         saxon:new-element(
                           'row',
                           for-each-pair(
                             $row1/col,
                             $row2/col,
                             function ($col1, $col2) {
                               saxon:new-element('col', $col1 + $col2)
                             }
                           )
                         )
                       }
                     )"
             />
         </Matrix>
     </xsl:function>
Needs Saxon 10 PE or EE and I haven't tested whether it performs any
better than the previous suggestions.
XSL-List info and archive
EasyUnsubscribe (by email)