[xquery-talk] how to optimally denormalize 1-to-many relationships with XQuery

Robby Pelssers Robby.Pelssers at nxp.com
Mon Aug 13 01:42:44 PDT 2012


In fact, that query returns results in under 1 second over a collection of 20k documents. The documents can be 50k - 100k in size, but the data I need to extract is not deeply nested in the document hierarchy.

So performance is not (yet) a real concern; it was more of a 'Can I improve on this or learn a new trick?' question ;-)

But as a side note: do you have any insight into how expensive the child, parent, ancestor and descendant axes are? Should one avoid them where possible?
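
To make the side question concrete, here is a minimal sketch of how the ancestor lookup could be avoided by using child steps only: the enclosing loop variable already holds the parent, so its id can be read directly. The inline sample data is an assumption copied from the example snippet in my original mail; in the real query $basicTypes would come from the collection. Whether this is actually cheaper than ancestor:: depends on the processor, so treat it as something to measure rather than a rule.

```xquery
(: Sketch only: sample data shaped like the snippet in the original question. :)
let $basicTypes :=
  <BasicType id="PH3330L">
    <Status>End of life</Status>
    <ProductTypes>
      <ProductType id="xxx">
        <Status>Deprecated</Status>
        <SalesItems>
          <SalesItem id="yyy"><Owner>abcde</Owner></SalesItem>
        </SalesItems>
      </ProductType>
    </ProductTypes>
  </BasicType>
return
<ProductTypes>
{
  (: Child steps only: $basicType is still in scope, so no ancestor:: step is needed. :)
  for $basicType in $basicTypes
  for $productType in $basicType/ProductTypes/ProductType
  return
    <ProductType>
    {
      (: Attributes must come before child content in the constructor. :)
      $productType/@id,
      (: Copy every child element except the nested SalesItems. :)
      $productType/(* except SalesItems),
      <BasicType ref-id="{$basicType/@id}"/>
    }
    </ProductType>
}
</ProductTypes>
```

The same pattern, with one more nested for clause, would carry the ProductType id down to each SalesItem.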

Robby

-----Original Message-----
From: David Lee [mailto:dlee at calldei.com] 
Sent: Sunday, August 12, 2012 9:23 PM
To: Adam Retter
Cc: Robby Pelssers; talk at x-query.com
Subject: Re: [xquery-talk] how to optimally denormalize 1-to-many relationships with XQuery

I wouldn't necessarily worry about iteration efficiency until you have tried it. Often iteration is cheap compared to element creation. It all depends on the processor, the data, and other factors.
Remember that "iteration" doesn't necessarily imply temporal iteration, unlike in procedural languages.


Sent from my iPad (excuse the terseness) 
David A Lee
dlee at calldei.com


On Aug 12, 2012, at 7:02 AM, "Adam Retter" <adam.retter at googlemail.com> wrote:

> I don't think you can do this in one pass. However, depending on the
> implementation, it's impossible to know how many passes the processor
> will actually make over the source to fulfil the query.
> 
> However, if we were to assume each FLWOR expression is a pass over the
> source data, then I think the following implementation could be more
> efficient. It really depends on how the implementation handles
> descendant-or-self and ancestor selection.
> 
> 
> let $basicTypes := /BasicType
> return
> <Result>
>    <BasicTypes>
>    {
>        $basicTypes
>    }
>    </BasicTypes>
>    <ProductTypes>
>    {
>        for $productType in $basicTypes//ProductType return
>            <ProductType>
>            {
>                $productType/@id,
>                $productType/(* except SalesItems),
>                <BasicType ref-id="{$productType/ancestor::BasicType/@id}"/>
>            }
>            </ProductType>
>    }
>    </ProductTypes>
>    <SalesItems>
>    {
>        for $salesItem in $basicTypes//SalesItem return
>            <SalesItem>
>            {
>                $salesItem/@id,
>                $salesItem/*,
>                <ProductType ref-id="{$salesItem/ancestor::ProductType/@id}"/>
>            }
>            </SalesItem>
>    }
>    </SalesItems>
> </Result>
> 
> 
> 
> On 10 August 2012 15:32, Robby Pelssers <Robby.Pelssers at nxp.com> wrote:
>> Hi all,
>> 
>> Suppose I have a collection of XML documents looking like this:
>> 
>> A BasicType has 1 to many ProductTypes.
>> A ProductType has 1 to many SalesItems.
>> 
>> Example snippet:
>> ---------------------------------
>> <BasicType id="PH3330L">
>>  <Status>End of life</Status>
>>  ...
>>  <ProductTypes>
>>    <ProductType id="xxx">
>>       <Status>Deprecated</Status>
>>       ...
>>       <SalesItems>
>>         <SalesItem id="yyy">
>>           <Owner>abcde</Owner>
>>         </SalesItem>
>>       </SalesItems>
>>    </ProductType>
>>  </ProductTypes>
>> </BasicType>
>> 
>> Now I want to generate some data looking like this:
>> 
>> <Result>
>>  <BasicTypes>
>>   <BasicType id="PH3330L">
>>     <Status>End of life</Status>
>>    </BasicType>
>>    ...
>>  </BasicTypes>
>>  <ProductTypes>
>>    <ProductType id="xxx">
>>       <Status>Deprecated</Status>
>>       <BasicType ref-id="PH3330L"/>
>>    </ProductType>
>>    ...
>>  </ProductTypes>
>>  <SalesItems>
>>    <SalesItem id="yyy">
>>      <Owner>abcde</Owner>
>>      <ProductType ref-id="xxx"/>
>>    </SalesItem>
>>    ...
>>  </SalesItems>
>> </Result>
>> 
>> -------------
>> I have written a query which returns just this but it iterates
>> - 3 times over the basictypes
>> - 2 times over the producttypes
>> - 1 time over the salesitems
>> 
>> Is there a better way to get this accomplished in 1 iteration?
>> 
>> 
>> Pseudo-code:
>> 
>> let $basictypes := collection("basictypes")
>> return
>> <Result>
>>  <BasicTypes>
>>   {
>>     for $basictype in $basictypes
>>     ...do some stuff
>> 
>>   }
>>  </BasicTypes>
>>  <ProductTypes>
>>    {
>>       for $basictype in $basictypes
>>       for $producttype in $basictype/ProductTypes/ProductType
>>       ...do some stuff
>>    }
>>  </ProductTypes>
>>  <SalesItems>
>>    {
>>       for $basictype in $basictypes
>>       for $producttype in $basictype/ProductTypes/ProductType
>>       for $salesitem in $producttype/SalesItems/SalesItem
>>       ...do some stuff
>>    }
>>  </SalesItems>
>> </Result>
>> 
>> Robby Pelssers
>> 
>> 
>> _______________________________________________
>> talk at x-query.com
>> http://x-query.com/mailman/listinfo/talk
> 
> 
> 
> -- 
> Adam Retter
> 
> skype: adam.retter
> tweet: adamretter
> http://www.adamretter.org.uk
> 



