[xquery-talk] Top N Most Common Mistakes

Michael Kay mike at saxonica.com
Sun Aug 5 20:23:08 PDT 2007

> 3) Use of ".../text()" when the user really means 
> ".../data()" or ".../string()" (or should rely on implicit 
> atomization). The trouble is that the answer (after implicit 
> atomization and casting) is often the same, but the meaning 
> is entirely different. This can cause schema-less XML 
> databases like Berkeley DB XML to be unable to use their 
> indexes in these cases.

Using a schema doesn't actually help, because

a/text() = 'fred'

will match the element

<a>fred<!--that was the short name-->erica</a>

and the schema tells you nothing about comments.
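
To make the difference concrete, here is a small sketch that can be run in
any XQuery 1.0 processor. The variable name $a is just for illustration; the
element is the one shown above.

```xquery
(: The comment splits the content into two text nodes, "fred" and "erica". :)
let $a := <a>fred<!--that was the short name-->erica</a>
return (
  count($a/text()),           (: 2: one text node on each side of the comment :)
  $a/text() = 'fred',         (: true: the general comparison succeeds if ANY text node matches :)
  $a = 'fred',                (: false: atomizing the element gives "frederica" :)
  string($a) = 'frederica'    (: true: the string value concatenates all descendant text nodes :)
)
```

So the two forms only agree when the element happens to have a single text
node child and nothing else.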

Saxon-SA's on-demand indexing handles this, because it's based on the
predicates actually used in the query: if the query is foo[a/text()='fred']
then the index will be on a/text(), so the indexed values will be "fred" and
"erica". But if there's a persistent index then of course it's likely to be
on the string value or the typed value of the element, which certainly
doesn't help this query.
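
As a rough sketch of why the two predicates are not interchangeable for an
index, the query below runs both against the same data. The wrapper element
<root> and the variable $doc are made up for the example, and whether a
given product actually uses an index for either form is product-specific.

```xquery
(: Two foo elements: one with a plain value, one with a comment inside a. :)
let $doc :=
  <root><foo><a>fred</a></foo><foo><a>fred<!--that was the short name-->erica</a></foo></root>
return (
  (: Matches both foo elements, because each a has a text node equal to "fred". :)
  count($doc/foo[a/text() = 'fred']),
  (: Matches only the first, because the second a has the value "frederica". :)
  count($doc/foo[a = 'fred'])
)
```

An index built on the string value or typed value of a can answer the
second predicate directly, but it has no entry that says the second element
contains a text node "fred".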

Unfortunately the /text() usage appeared in very early versions of the
XQuery use cases, and it has been perpetrated in tutorials, textbooks,
academic papers, and benchmarks ever since. It's very rarely seen in XSLT
code, even among beginners, which shows how influential a few published
examples can be.

Michael Kay
