xsl-list

Re: [xsl] Searching for values in XML using XSL using Saxon

2010-10-14 07:54:56
Handling thousands of topics shouldn't be a problem; if there were millions I would consider an XML database.

Search time in the document shouldn't be a problem if you can keep it (and its indexes, whether xsl:key indexes or auto-generated Saxon indexes) in memory. But repeated loading of the document from disk every time it's needed could get very slow.
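
A minimal sketch of the xsl:key approach, assuming XSLT 2.0 (as in Saxon), that all topics live in a single source document, and that keywords are comma-separated as in the sample quoted below. The key name "topics-by-keyword" and the "keyword" parameter are hypothetical names chosen for illustration, not taken from the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: the keyword to search for. -->
  <xsl:param name="keyword" select="'ketchup'"/>

  <!-- Index every <topic> under each of its comma-separated keywords,
       trimmed and lower-cased so lookups ignore spacing and case. -->
  <xsl:key name="topics-by-keyword" match="topic"
           use="for $k in tokenize(keywords, ',')
                return lower-case(normalize-space($k))"/>

  <!-- Look up matching topics through the key instead of scanning
       every <keywords> element on each search. -->
  <xsl:template match="/">
    <matches keyword="{$keyword}">
      <xsl:for-each select="key('topics-by-keyword', lower-case($keyword))">
        <match><xsl:value-of select="name"/></match>
      </xsl:for-each>
    </matches>
  </xsl:template>

</xsl:stylesheet>
```

The index is built the first time key() is called on a document, so the benefit only shows if the parsed document is kept in memory and reused across searches rather than reloaded from disk each time.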

Michael Kay
Saxonica

On 14/10/2010 12:01 PM, Jacobus Reyneke wrote:
Good day,

I am trying to write a system on a pure XML data store. There are various 
reasons for doing this, but the most important are that I am always transforming 
the results and that the system's data structure is dynamic and 
hierarchical, so XML is a lovely fit.

One part of my data will be large vocabularies of data, like dictionaries, and 
I would like to know from the experts if I'm going to run into trouble in the 
long term and should rather move to a relational database solution with proper 
indexing etc. I intend to use Saxon, simply because it's written in Java, it 
supports XSLT 2.0 and Michael has a good history of standing behind his product.

Other options may be using XML databases, but the visibility provided by 
free-standing XML files compared to an administrator console to a database is nice.

The data will look something like this:

<topic>
<name>hamburger</name>
<related-topics><topic-ref>food</topic-ref><topic-ref>dead-cows</topic-ref><topic-ref>health</topic-ref></related-topics>
<keywords>burger, ketchup, mustard, hungry</keywords>
<description>Hamburgers are nice, but are not always good for your health. They are 
especially bad for the health of the cow, but this is o.k. if you don't know the 
cow</description>
</topic>

These topics will be built on the fly during chatroom conversations, so the 
related-topics and keywords will not be known beforehand. Yet it's the 
related-topics and keywords that will be used on the fly to find matching 
topics and format them into diagrams, charts, etc.

In a couple of months' time there will be thousands of topics, so I am looking for a way to 
do this that will scale. Another problem is that some topics may be different in structure, 
e.g. a topic on cars may have a <max-speed> element, while one on houses may have 
a <price>, which is another reason why a dynamic hierarchical data store makes more sense 
than a traditional relational database.

If someone can give me some advice, or suggest an efficient search on something 
like the keywords, I will be very grateful.

Kind regards,
Jacobus
--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



