[xquery-talk] Finding a XML-Database to fit our needs

Torsten Grust grust at in.tum.de
Sat Dec 15 18:17:08 PST 2007


Hi Johan,

your shopping list appears to perfectly fit the capabilities of DB2 V9
and its built-in pureXML XQuery processor.  This particularly relates
to your requirement to host extensive collections of moderately sized
XML documents.  DB2 V9 is definitely worth a look, I'd say.

	http://www.ibm.com/developerworks/wikis/display/db2xml

You ask for a ``native XML database'', and while I do not believe that
the native vs. alien(?) categorization makes much sense(*), DB2 V9's
internals (query processor and storage engine) have been enhanced to
include XML- and XQuery-specific query operators as well as storage and
index structures.  Does this make the originally purely relational DB2
database kernel a native XML database...?  You decide.

Cheers,
    --Torsten

(*) There's no ``native'' representation of XML inside computers,
besides the serialized XML text, maybe.  I don't see how a DOM or other
pointer-based representation of an XML instance is in any way ``more
native'' than a tabular encoding, for example.  Uh, I start to sound
like Dana Florescu... ;-)
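
To make the footnote concrete, here is a minimal sketch of what a
tabular encoding of an XML instance could look like.  The pre/post/level
numbering scheme and the names encode() and descendants() are
assumptions chosen for illustration only; this is not a description of
how DB2 V9 stores XML internally.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return one row (pre, post, level, tag, text) per element.

    pre  = rank in document (preorder) traversal
    post = rank in postorder traversal
    Illustrative scheme only; not tied to any particular product.
    """
    root = ET.fromstring(xml_text)
    rows = []
    post_counter = 0

    def visit(node, level):
        nonlocal post_counter
        pre = len(rows)
        rows.append(None)  # reserve this element's slot in document order
        for child in node:
            visit(child, level + 1)
        text = (node.text or "").strip()
        rows[pre] = (pre, post_counter, level, node.tag, text)
        post_counter += 1

    visit(root, 0)
    return rows


def descendants(rows, pre):
    """Descendant axis expressed as a plain range predicate on the table."""
    post = rows[pre][1]
    return [r for r in rows if r[0] > pre and r[1] < post]


doc = "<order><id>42</id><item><sku>A1</sku><qty>3</qty></item></order>"
table = encode(doc)
for row in table:
    print(row)
```

With such a table, a step like "all descendants of the item element" is
an ordinary range selection, descendants(table, 2), with no pointer
chasing involved.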


On Dec 15, 2007, at 16:30, Johan Mörén wrote:

> Hi all,
>
> I'm new to the list and work for a company in Stockholm, Sweden.
>
> We are currently evaluating a move from storing our data in an RDBMS
> (Oracle 10g) to storing it as native XML. The reason for doing this  
> is that all communication to the persistence layer is done via SOAP  
> and we believe we can save a lot of effort and time if we persist  
> our data in the same format as we communicate it to the outside world.
>
> We are looking for a solution that can handle approximately
> 16 000 000 documents ranging from 50 to 200 KB in size. About 5k to
> 20k documents will be updated daily. The documents are all derived
> from the same base type and are described by a common schema. There
> are 5 sub-types that could be split into different collections where
> the largest, in terms of number of documents, would be about 8-9
> million in size.
>
> Practically all documents have relationships described to documents  
> belonging both to their own type but also to the other types so  
> navigation of these relationships must be possible for querying  
> purposes.
>
> The documents are very data centric, containing very little free  
> text. But some fields will need to be backed by a free-text-index  
> for querying. Since operators will work online with the data, query
> times will need to be reasonably fast for queries that are not too
> complex.
>
> Apart from the above, the database should support:
>
> * Concurrent inserts and updates.
> * XQuery 1.0 support.
> * Any fragmentation of the documents (to handle the size) should be  
> transparently handled by the database.
> * Both commercial and open source alternatives are of interest.
>
> Any input, experiences and pointers on where to look would be very  
> much appreciated.
>
> Cheers!
>
> /Johan
>
> --
> "You can't always write a chord ugly enough to say what you want to  
> say,
> so sometimes you have to rely on a giraffe filled with whipped  
> cream." - Frank Zappa
> _______________________________________________
> talk at x-query.com
> http://x-query.com/mailman/listinfo/talk

-- 
   | Prof. Dr. Torsten Grust                         grust at in.tum.de |
   |                                 http://www-db.in.tum.de/~grust/ |
   |     Database Systems - Technische Universität München (Germany) |





