xsl-list

Re: [xsl] Searching for values in XML using XSL using Saxon

2010-10-14 07:58:29
2010/10/14 Michael Kay <mike(_at_)saxonica(_dot_)com>:
 Handling thousands of topics shouldn't be a problem; if there were millions
I would consider an XML database.

http://exist-db.org is a very good example.


Search time in the document shouldn't be a problem if you can keep it (and
its indexes, whether xsl:key indexes or auto-generated Saxon indexes) in
memory. But repeated loading of the document from disk every time it's
needed could get very slow.
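
As a rough illustration of the "load once, keep the index in memory" idea, here is a minimal Python sketch using the topic structure from the quoted message. It is not Saxon's API; in XSLT the equivalent would be an xsl:key declaration queried with key(). The <topics> wrapper element, the second sample topic, and the function names build_keyword_index and topics_for_keyword are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data in the shape of the topic from the quoted message.
# The <topics> wrapper and the "hot-dog" topic are assumptions for illustration.
TOPICS_XML = """
<topics>
  <topic>
    <name>hamburger</name>
    <related-topics>
      <topic-ref>food</topic-ref>
      <topic-ref>health</topic-ref>
    </related-topics>
    <keywords>burger, ketchup, mustard, hungry</keywords>
  </topic>
  <topic>
    <name>hot-dog</name>
    <related-topics>
      <topic-ref>food</topic-ref>
    </related-topics>
    <keywords>sausage, ketchup, mustard</keywords>
  </topic>
</topics>
"""


def build_keyword_index(root):
    """Map each lower-cased keyword to the names of the topics that list it."""
    index = defaultdict(list)
    for topic in root.iter("topic"):
        name = topic.findtext("name", default="").strip()
        keywords = topic.findtext("keywords", default="")
        # Keywords are stored as one comma-separated string per topic.
        for word in keywords.split(","):
            word = word.strip().lower()
            if word:
                index[word].append(name)
    return index


# Parse once and keep both the tree and the index in memory for later lookups.
ROOT = ET.fromstring(TOPICS_XML)
KEYWORD_INDEX = build_keyword_index(ROOT)


def topics_for_keyword(keyword):
    """Answer a keyword query with a dictionary lookup instead of a re-scan."""
    return KEYWORD_INDEX.get(keyword.strip().lower(), [])
```

The expensive steps (parsing and indexing) happen once at start-up, and each search afterwards is a single dictionary lookup. When topics are added during a conversation, the index would need to be updated or rebuilt, which this sketch does not attempt.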

Michael Kay
Saxonica

On 14/10/2010 12:01 PM, Jacobus Reyneke wrote:

Good day,

I am trying to build a system on a pure XML data store. There are various
reasons for doing this, but the most important are that I am always
transforming the results and that the system's data structure is dynamic
and hierarchical, so XML is a lovely fit.

One part of my data will be large vocabularies of data, like dictionaries,
and I would like to know from the experts if I'm going to run into trouble
in the long term and should rather move to a relational database solution
with proper indexing etc. I intend to use Saxon, simply because it's written
in Java, it supports XSLT 2.0, and Michael has a good history of standing
behind his product.

Another option would be an XML database, but the visibility provided by
free-standing XML files, compared to going through a database administration
console, is nice.

The data will look something like this:

<topic>
  <name>hamburger</name>
  <related-topics>
    <topic-ref>food</topic-ref>
    <topic-ref>dead-cows</topic-ref>
    <topic-ref>health</topic-ref>
  </related-topics>
  <keywords>burger, ketchup, mustard, hungry</keywords>
  <description>Hamburgers are nice, but are not always good for your health.
They are especially bad for the health of the cow, but this is o.k. if you
don't know the cow</description>
</topic>

These topics will be built on the fly during chatroom conversations, so
the related-topics and keywords will not be known beforehand. Yet it's the
related-topics and keywords that will be used on the fly to find matching
topics and format them into diagrams, charts, etc.

In a couple of months' time there will be thousands of topics, so I am
looking for a way to do this that will scale. Another problem is that some
topics may differ in structure, e.g. a topic on cars may have a
<max-speed> element, while one on houses may have a <price>. This is
another reason why a dynamic hierarchical data store makes more sense than a
traditional relational database.

If someone can give me some advice, or suggest an efficient search on
something like the keywords, I will be very grateful.

Kind regards,
Jacobus
--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



