[xquery-talk] Questions about SQL Server 2005 and XML

Frank Cohen fcohen at rainingdata.com
Sun Jan 22 10:22:02 PST 2006


I recently ran across Microsoft's white paper on XML Support In SQL  
Server 2005. The paper is found at:

http://msdn.microsoft.com/xml/default.aspx?pull=/library/en-us/ 
dnsql90/html/sql2k5xml.asp&print=true

It is a good paper with lots of detail. Below are some thoughts I had  
and questions from reading the paper. I'm hoping this spurs some  
discussion about the persistence layer below XQuery.

1) The MS approach to XML is to store XML in a relational db with  
extensions to more natively understand XML constructs. For instance,  
the data is stored in a relation table column of field type XML with  
a lossy technique. SQL Server stores element tags in a binary form,  
it does not retain the order of the attributes, and it limits the  
depth of the hierarchy of an element.

2) To get good performance, one creates a "primary XML index" and  
then one may create secondary index types on the data in that field:  
1) PATH for evaluating where clauses, 2) PROPERTY for property value  
lookups, and 3) VALUE for doing wild card queries. You create each of  
these manually. XML indexes are B+Trees.

3) I wonder if Microsoft has tested SQL Server 2005 with STAR BOD  
schemas? These are typical of UBL-based schemas and use Schema  
includes extensively.  Many XML parsers I have tested will not  
support the complexity in the BODs.

4) SQL Server 2005 supports SOAP over HTTP endpoints to query,  
insert, delete data. The paper does not say which SOAP encoding  
style: Doc-lit, RPC encoded, RPC-lit.

5) The paper says "Data Exchange and Workflow" are key drivers for MS  
to support XML, including "vertical domains such as financial and geo- 
spatial data." GIS defines geospatial field types for XML documents.  
Does SQL Server support GIS field types in XML documents?

One of the CHECK examples in the paper recommends using a constraint  
for a "flower shop that must be within 50 miles of its business  
location." May I express a GIS function for this constraint?

6) SQL Server 2005 takes string, file, and FOR XML as inputs. What  
about taking a stream?

7) The section on Storage Representation makes it seem like if I am  
not expert at creating indexes and using typed XML that SQL Server  
2005 will give me slow performance. Am I reading that correctly?

8) Stored XML is "limited to 2 GB per instance." Is that 2 GB per XML  
document stored in one XML field? Is that for all the XML stored in  
one field? That seems really small to me and makes me wonder why  
there is a limit at all?

Same question on the "depth of XML heirarchy is limited to 128  
levels". Is that 128 levels within each document in an XML field?

9) SQL Server 2005 does not store attribute order in an XML field. In  
SOA and workflow patterns the document that moves from one service or  
operator to the next usually depends on the canonicalization of the  
data where the fidelity to the original XML document is maintained.

10) I am pretty tired of waiting for the XQuery 1.0 spec to be final.  
Come on already!#@^% Most of the XQuery databases already implement  
full text search (Raining Data TigerLogic included.) SQL Server 2005  
supports full text search of element contents. Is the fn:contains()  
schema the way MS would like to see full text in a post XQuery 1.0 spec?

11) The paper covers FOR XML EXPLICIT and includes "FOR XML  
EXPLICIT ... may not perform as well for deep heirarchies." Why is that?

-Frank

---
Frank Cohen, Raining Data, http://www.RainingData.com, phone: 408 236  
7604
http://www.xquerynow.com for free XML, XQuery and native XML database  
tips,
techniques and solutions.








More information about the talk mailing list