[xquery-talk] where clause

Michael Kay mike at saxonica.com
Thu Mar 28 23:54:06 PST 2013


It's not just the WHERE clause, of course: ORDER BY and GROUP BY also affect the strategy.

In XQuery 1.0 nearly every FLWOR expression could be rewritten as an expression on sequences of items (rather than tuples). The only exception was a rather weird and rare case involving ORDER BY where the order of tuples couldn't be reduced to an ordering of items in the input sequences. Saxon reduced everything to sequence operations, handling this special case with a bit of custom code that essentially wrapped the tuple as a map-like item.

In XQuery 3.0 there are many more FLWOR expressions that can't be rewritten in terms of operations on item sequences, and native support for tuple streams in the run-time engine becomes almost unavoidable. Once you have that support, there's not necessarily any benefit in eliminating the WHERE clauses (though Saxon still attempts to do so, mainly to take advantage of the other optimisations available for filter expressions). The important thing is to promote each conjunctive term of the WHERE clause up the pipeline to the earliest place where it can be evaluated; whether it is then turned into a filter expression is relatively unimportant.
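
As a rough illustration of that promotion (a sketch only, not a description of what Saxon does internally; the input sequences and the extra conjunct "$a mod 2 = 0" are invented for the example), consider a WHERE clause with two conjunctive terms, one of which depends only on $a:

```xquery
xquery version "3.0";

(: Hypothetical input sequences, chosen only for illustration. :)
declare variable $expr1 := (1 to 10);
declare variable $expr2 := (1 to 5);
declare variable $expr3 := (1 to 5);

(: As written: both terms are tested only after $a, $b and $c
   are all bound, so every ($a, $b, $c) tuple gets built. :)
declare function local:as-written() as xs:integer* {
  for $a in $expr1, $b in $expr2, $c in $expr3
  where $a mod 2 = 0 and $a = $b + $c
  return $a
};

(: After promotion: the term that needs only $a moves up to the
   point where $a is bound; the other term needs $c, so it stays.
   A WHERE between two FOR clauses requires XQuery 3.0. :)
declare function local:promoted() as xs:integer* {
  for $a in $expr1
  where $a mod 2 = 0
  for $b in $expr2, $c in $expr3
  where $a = $b + $c
  return $a
};

(: Optionally, each promoted term becomes a filter expression. :)
declare function local:filtered() as xs:integer* {
  for $a in $expr1[. mod 2 = 0],
      $b in $expr2,
      $c in $expr3[$a = $b + .]
  return $a
};

(: All three forms should produce the same sequence. :)
deep-equal(local:as-written(), local:promoted())
  and deep-equal(local:promoted(), local:filtered())
```

The step from the first form to the second is the one that matters, since it stops the odd values of $a from ever being paired with $b and $c; the step from the second to the third changes the shape of the expression but not the amount of work in any obvious way.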

Michael Kay
Saxonica


On 29 Mar 2013, at 07:35, Liam R E Quin wrote:

> 
> On Wed, 2013-03-27 at 13:26 +0000, deBakker, Bas wrote:
>> Wouldn't that be equivalent to
>> 
>>    for $a in expr1, $b in expr2, $c in expr3[$a = $b + .]
>>    return $a
> 
> I notice that BaseX does exactly that rewrite.
> 
> In SQL with a WHERE clause some of the values may be NULL, but that
> can't occur in XQuery today; if it could, there might be tuples without
> a value for $c, in which case the rewrite wouldn't work.
> 
> A smart optimizer with knowledge of the input could rewrite at runtime
> to put the [predicate] on the expression likely to have the fewest
> nodes, or could rewrite to say, for $a in expr1[. ge min($b) + min($c)];
> this sort of rewrite can turn a theoretically O(n^3) operation into O(n)
> in practice.
> 
> I think the answer is, use "where" when it makes the query more
> readable, or if there are positional or grouping clauses. Some
> implementations do more optimizations than others, though, so sometimes
> readability ends up second to speed.
> 
> Liam
> 
> -- 
> Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
> Pictures from old books: http://fromoldbooks.org/
> Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
> 



