Handling thousands of topics shouldn't be a problem; if there were
millions I would consider an XML database.
Search time in the document shouldn't be a problem if you can keep it
(and its indexes, whether xsl:key indexes or auto-generated Saxon
indexes) in memory. But repeated loading of the document from disk every
time it's needed could get very slow.
Michael Kay
Saxonica
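
A minimal sketch of the "keep it in memory" point above, written in Python purely for illustration (the thread itself is about Saxon and XSLT). The function name load_document and the idea of one file per path are assumptions, not something from the original post. The parsed tree is cached, so the file is read from disk only on the first request.

```python
import xml.etree.ElementTree as ET
from functools import lru_cache


@lru_cache(maxsize=None)
def load_document(path: str) -> ET.Element:
    """Parse the XML file at `path` and return its root element.

    The parse happens on the first call only; later calls with the same
    path return the tree that is already held in memory.
    """
    return ET.parse(path).getroot()
```

Because the topics are edited on the fly, a cache like this would have to be refreshed after each write, for example with load_document.cache_clear(); otherwise readers keep seeing the old tree.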
On 14/10/2010 12:01 PM, Jacobus Reyneke wrote:
Good day,
I am trying to write a system on a pure XML data store. There are various
reasons for doing this, but the most important are that I am always transforming
the results and that the system's data structure is dynamic and
hierarchical, so XML is a lovely fit.
One part of my data will be large vocabularies of data, like dictionaries, and
I would like to know from the experts if I'm going to run into trouble in the
long term and should rather move to a relational database solution with proper
indexing etc. I intend to use Saxon, simply because it's written in Java, it
supports XSLT 2.0 and Michael has a good history of standing behind his product.
Other options may be using XML databases, but the visibility provided by free-standing
XML files compared to an administrator console to a database is nice.
The data will look something like this:
<topic>
  <name>hamburger</name>
  <related-topics>
    <topic-ref>food</topic-ref>
    <topic-ref>dead-cows</topic-ref>
    <topic-ref>health</topic-ref>
  </related-topics>
  <keywords>burger, ketchup, mustard, hungry</keywords>
  <description>Hamburgers are nice, but are not always good for your health. They are
  especially bad for the health of the cow, but this is o.k. if you don't know the
  cow</description>
</topic>
These topics will be built on the fly during chatroom conversations, so the
related-topics and keywords will not be known beforehand. Yet it's the
related-topics and keywords that will be used on the fly to find matching
topics and format them into diagrams, charts, etc.
In a couple of months' time there will be thousands of topics, so I am looking for a way to
do this that will scale. Another problem is that some topics may be different in structure,
e.g. a topic on cars may have a <max-speed> element, while one on houses may have
a <price>, again another reason why a dynamic hierarchical data store makes more sense
than a traditional relational database.
If someone can give me some advice, or suggest an efficient search on something
like the keywords, I will be very grateful.
Kind regards,
Jacobus
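
One way the keyword search asked about above could be sketched: split each comma-separated <keywords> value once and build an in-memory map from keyword to topics, which is roughly the job an xsl:key index does inside a stylesheet. This is an illustrative Python sketch, not the approach used in the thread; the <topics> wrapper element, the second "hot-dog" topic, and the names build_keyword_index and topic_names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample data: the <topics> wrapper and the "hot-dog" topic are hypothetical.
SAMPLE = """<topics>
  <topic>
    <name>hamburger</name>
    <related-topics>
      <topic-ref>food</topic-ref>
      <topic-ref>health</topic-ref>
    </related-topics>
    <keywords>burger, ketchup, mustard, hungry</keywords>
  </topic>
  <topic>
    <name>hot-dog</name>
    <related-topics><topic-ref>food</topic-ref></related-topics>
    <keywords>sausage, mustard, ketchup</keywords>
  </topic>
</topics>"""


def build_keyword_index(root):
    """Map each lower-cased keyword to the <topic> elements that list it."""
    index = defaultdict(list)
    for topic in root.iter("topic"):
        raw = topic.findtext("keywords", default="")
        for word in raw.split(","):
            word = word.strip().lower()
            if word:  # skip empty entries such as a trailing comma
                index[word].append(topic)
    return index


def topic_names(index, keyword):
    """Return the names of all topics tagged with `keyword`, in document order."""
    return [t.findtext("name") for t in index.get(keyword.lower(), [])]


root = ET.fromstring(SAMPLE)
index = build_keyword_index(root)
print(topic_names(index, "mustard"))  # ['hamburger', 'hot-dog']
```

The index is built in one pass over the document, after which each lookup is a dictionary access rather than a scan of every topic. The same pattern would apply to the <topic-ref> values if related topics need to be looked up in the reverse direction.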
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--