[xquery-talk] Finding a XML-Database to fit our needs
Torsten Grust
grust at in.tum.de
Sat Dec 15 18:17:08 PST 2007
Hi Johan,
your shopping list appears to perfectly fit the capabilities of DB2 V9
and its built-in pureXML XQuery processor. This particularly relates
to your requirement to host extensive collections of moderately sized
XML documents. DB2 V9 is definitely worth a look, I'd say.
http://www.ibm.com/developerworks/wikis/display/db2xml
You ask for a ``native XML database'', and while I do not believe that
the native vs. alien(?) categorization makes much sense(*), DB2 V9's
internals (query processor and storage engine) have been enhanced to
include XML- and XQuery-specific query operators as well as storage and
index structures.
Does this make the originally purely relational DB2 database kernel a
native XML database...? You decide.
Cheers,
--Torsten
(*) There's no ``native'' representation of XML inside computers,
besides the serialized XML text, maybe. I don't see how a DOM or other
pointer-based representation of an XML instance is in any way ``more
native'' than a tabular encoding, for example. Uh, I start to sound
like Dana Florescu... ;-)
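
To make the footnote concrete, here is a minimal sketch of one possible
tabular encoding. It is an illustration only, not DB2's actual storage
format; the function names and the (pre, size, level, tag) column layout
are assumptions chosen for the example. Each element becomes one row, and
the descendant axis turns into a plain range scan over that table.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Flatten an XML document into (pre, size, level, tag) rows.

    pre   = position of the element in document order
    size  = number of descendants of the element
    level = depth of the element (root is 0)
    """
    rows = []

    def visit(elem, level):
        pre = len(rows)
        rows.append(None)  # reserve the slot; size is known after the subtree
        for child in elem:
            visit(child, level + 1)
        size = len(rows) - pre - 1
        rows[pre] = (pre, size, level, elem.tag)

    visit(ET.fromstring(xml_text), 0)
    return rows


def descendants(rows, pre):
    """Descendant axis as a range scan: rows pre+1 .. pre+size."""
    size = rows[pre][1]
    return [tag for (_, _, _, tag) in rows[pre + 1 : pre + 1 + size]]


# Hypothetical sample document, used only for illustration.
table = encode("<a><b><c/><d/></b><e/></a>")
for row in table:
    print(row)
print(descendants(table, 1))
```

No pointers are followed at query time: the subtree below a node is a
contiguous run of rows, which is the sense in which a table can represent
an XML instance just as well as a DOM can.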
On Dec 15, 2007, at 16:30, Johan Mörén wrote:
> Hi all,
>
> I'm new to the list and work for a company in Stockholm, Sweden.
>
> We are currently evaluating a move from storing our data in an RDBMS
> (Oracle 10g) to storing it as native XML. The reason for doing this
> is that all communication to the persistence layer is done via SOAP
> and we believe we can save a lot of effort and time if we persist
> our data in the same format as we communicate it to the outside world.
>
> We are looking for a solution that can handle approximately
> 16 000 000 documents ranging from 50 to 200 KB in size. About 5k to 20k
> documents will be updated daily. The documents are all derived from
> the same base type and are described by a common schema. There are 5
> sub-types that could be split into different collections where the
> largest, in terms of number of documents, would be about 8-9 million
> in size.
>
> Practically all documents have relationships described to documents
> belonging both to their own type and to the other types, so
> navigation of these relationships must be possible for querying
> purposes.
>
> The documents are very data centric, containing very little free
> text. But some fields will need to be backed by a free-text-index
> for querying. Since operators will work online with the data, query
> times will need to be reasonably fast for queries that are not too
> complex.
>
> Apart from the above, the database should support:
>
> * Concurrent inserts and updates.
> * XQuery 1.0 support.
> * Any fragmentation of the documents (to handle the size) should be
> transparently handled by the database.
> * Both commercial and open source alternatives are of interest.
>
> Any input, experiences and pointers on where to look would be very
> much appreciated.
>
> Cheers!
>
> /Johan
>
> --
> "You can't always write a chord ugly enough to say what you want to
> say,
> so sometimes you have to rely on a giraffe filled with whipped
> cream." - Frank Zappa
> _______________________________________________
> talk at x-query.com
> http://x-query.com/mailman/listinfo/talk
--
| Prof. Dr. Torsten Grust grust at in.tum.de |
| http://www-db.in.tum.de/~grust/ |
| Database Systems - Technische Universität München (Germany) |