First, I question the premise that XML should be designed to enable streamed
transformation. One could equally well argue that you should design it so that
it doesn't need to be transformed at all. Transformation is only necessary
because the data isn't in the form you want; designing it so that it can easily
be transformed into the form you want seems a little odd, unless perhaps you
are thinking of designing the intermediate formats in a processing pipeline.
1. Use lots of attributes. Store in them the data needed for processing the
node.
Certainly for data that can conveniently be represented as attributes, this
will make streamed processing easier. But don't overdo it.
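
To illustrate why attributes help, here is a minimal sketch in Python using the
standard xml.sax module (not XSLT). The element name "item" and the attributes
"id" and "status" are invented for the example. Attributes arrive together with
the start tag, so a decision about the node can be taken before any of its
content has been read, and nothing needs to be buffered.

```python
import xml.sax


class AttributeFilter(xml.sax.ContentHandler):
    """Selects items using only start-tag attributes; nothing is buffered."""

    def __init__(self):
        super().__init__()
        self.selected = []

    def startElement(self, name, attrs):
        # Hypothetical vocabulary: <item id="..." status="..."/>
        if name == "item" and attrs.get("status") == "open":
            self.selected.append(attrs.get("id"))


def open_item_ids(xml_text):
    """Return the ids of all items whose status attribute is "open"."""
    handler = AttributeFilter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.selected
```

Had "status" been a child element instead, the handler would have needed to
remember the id until the status text turned up.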
2. Have one child element only.
No, if there are two things that should naturally be represented as child
elements, then represent them that way. There are plenty of techniques still
available for streamed processing: accumulators, xsl:iterate, fold-left(),
xsl:fork.
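
As a rough analogy only (this is Python, not XSLT, and the element names
"line", "price" and "qty" are made up), the sketch below handles records with
two child elements in a single forward pass. It carries a small running total
forward, much as one would with xsl:iterate or fold-left(), and discards each
record once it has been used.

```python
import io
import xml.etree.ElementTree as ET


def total_value(xml_bytes):
    """Sum price * qty over all <line> elements in one forward pass."""
    total = 0.0
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "line":
            # Both children are complete by the time </line> is reported.
            price = float(elem.findtext("price"))
            qty = int(elem.findtext("qty"))
            total += price * qty
            elem.clear()  # drop the children of the record just processed
    return total
```

Only one record's worth of content is held at a time, so having two children
per record costs very little.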
So, to enable efficient stream processing, design XML like this:
<root a="..." b="..." c="...">
  <node d="..." e="..." f="...">
    <node g="..." h="..." i="...">
      <node j="..." k="..." l="...">
        <node m="..." n="..." o="...">
          <node p="..." q="..." r="...">
            ...
          </node>
        </node>
      </node>
    </node>
  </node>
</root>
This results in a massively deep tree. For gigabyte-sized XML files, the
nesting could be a billion levels deep (or more).
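No, such a design is completely bizarre and defeats the whole purpose of
streaming, which is to reduce memory use. A streaming processor still has to
keep track of every element that is currently open, so memory grows with the
depth of nesting.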
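
A small Python sketch (standard xml.sax, with hypothetical helper names) makes
the point measurable: it records how many elements are open at once, which is
the minimum state any streaming parser has to hold.

```python
import xml.sax


class DepthTracker(xml.sax.ContentHandler):
    """Tracks the largest number of simultaneously open elements."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.max_depth = 0

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        self.max_depth = max(self.max_depth, len(self.open_elements))

    def endElement(self, name):
        self.open_elements.pop()


def max_open_elements(xml_text):
    handler = DepthTracker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.max_depth


def nested(n):
    """n <node> elements, each one inside the previous one."""
    return "<root>" + "<node>" * n + "</node>" * n + "</root>"


def flat(n):
    """n <node> elements as siblings under the root."""
    return "<root>" + "<node/>" * n + "</root>"
```

With 50 nodes, the nested form keeps 51 elements open at its deepest point,
while the flat form never has more than 2 open, however many nodes there are.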
I would add some more important design criteria. Put metadata and reference
information (stuff that's needed for reference throughout document processing)
at the start of the document rather than the end, or in a separate document.
Use hierarchic nesting for relationships rather than id/idref-style pointers
(perhaps even if it means holding the data redundantly).
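
The effect of putting reference information first can be sketched as follows,
again in Python with an invented vocabulary ("rate" and "payment" elements, and
their attributes, are assumptions for the example). When the rates come first,
each payment is resolved as it arrives; when they come last, every payment has
to be held back until the end of the document.

```python
import xml.sax


class PaymentTotaler(xml.sax.ContentHandler):
    """Converts payments using a rate table found somewhere in the document."""

    def __init__(self):
        super().__init__()
        self.rates = {}       # small reference table: currency -> rate
        self.pending = []     # payments seen before their rate was known
        self.max_pending = 0  # largest number of payments buffered at once
        self.total = 0.0

    def startElement(self, name, attrs):
        if name == "rate":
            self.rates[attrs.get("currency")] = float(attrs.get("value"))
        elif name == "payment":
            currency = attrs.get("currency")
            amount = float(attrs.get("amount"))
            if currency in self.rates:
                self.total += amount * self.rates[currency]
            else:
                # Rate not seen yet: the payment has to be buffered.
                self.pending.append((currency, amount))
                self.max_pending = max(self.max_pending, len(self.pending))

    def endDocument(self):
        # Resolve whatever had to wait for the reference data.
        for currency, amount in self.pending:
            self.total += amount * self.rates[currency]
        self.pending = []


def total_payments(xml_text):
    """Return (converted total, peak number of buffered payments)."""
    handler = PaymentTotaler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.total, handler.max_pending
```

Both orderings give the same total, but only the rates-first document is
processed without buffering any payments.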
Michael Kay
Saxonica