[xquery-talk] how to optimaly denormalize 1-to-many relationships with XQuery

David Lee dlee at calldei.com
Mon Aug 13 06:04:51 PDT 2012


> But as a side note:  Do you have any insight in how expensive child, parent,
> ancestor and descendant operations are?  Should one avoid those if possible?

What you ask is really entirely processor and system configuration dependent.
I suggest asking on a mailing list specific to that processor/vendor.

(Who knows, I might see you on that list too!)
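
As a rough illustration only (a sketch, not a measurement): if the ancestor
axis did turn out to be slow on your processor, one variant binds the parent
in an outer "for" and reads its id from the variable instead. The collection
name and element paths are taken from the pseudo-code and sample quoted below;
the "except SalesItems" filter and the copied id attribute are my assumptions
about the desired output. Whether this is actually faster than
ancestor::BasicType is exactly the processor-dependent part, so time both.

```xquery
(: Hypothetical sketch: builds only the ProductTypes section.
   Assumes collection("basictypes") yields documents whose root
   element is BasicType, as in the quoted sample. :)
<ProductTypes>
{
    for $basicType in collection("basictypes")/BasicType
    for $productType in $basicType/ProductTypes/ProductType
    return
        <ProductType id="{$productType/@id}">
        {
            (: copy the child elements, minus the nested SalesItems :)
            $productType/(* except SalesItems),
            (: parent id comes from the bound variable, no ancestor step :)
            <BasicType ref-id="{$basicType/@id}"/>
        }
        </ProductType>
}
</ProductTypes>
```

The same pattern would extend to SalesItems with a third nested "for", at the
cost of the extra iteration the original question was trying to avoid.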



----------------------------------------
David A. Lee
dlee at calldei.com
http://www.xmlsh.org

> -----Original Message-----
> From: Robby Pelssers [mailto:Robby.Pelssers at nxp.com]
> Sent: Monday, August 13, 2012 4:43 AM
> To: David Lee; Adam Retter
> Cc: talk at x-query.com
> Subject: RE: [xquery-talk] how to optimaly denormalize 1-to-many
> relationships with XQuery
> 
> In fact that query returns results under 1 second over a collection of 20k
> documents.  The documents can be 50k - 100k in size but the data I need to
> extract is not deeply nested in the document hierarchy.
> 
> So performance is not (yet) a real focus point but it was more of a 'Can I
> improve on this or learn a new trick' question ;-)
> 
> But as a side note:  Do you have any insight in how expensive child, parent,
> ancestor and descendant operations are?  Should one avoid those if possible?
> 
> Robby
> 
> -----Original Message-----
> From: David Lee [mailto:dlee at calldei.com]
> Sent: Sunday, August 12, 2012 9:23 PM
> To: Adam Retter
> Cc: Robby Pelssers; talk at x-query.com
> Subject: Re: [xquery-talk] how to optimaly denormalize 1-to-many
> relationships with XQuery
> 
> I wouldn't necessarily be worried about iteration efficiency until you try it.
> Often iteration is cheap compared to element creation.   It all depends on the
> processor, data, and other issues.
> Remember "iteration" doesn't necessarily imply temporal iteration, unlike in
> procedural languages.
> 
> 
> Sent from my iPad (excuse the terseness)
> David A Lee
> dlee at calldei.com
> 
> 
> On Aug 12, 2012, at 7:02 AM, "Adam Retter" <adam.retter at googlemail.com>
> wrote:
> 
> > I don't think you can do this in one pass. However, depending on the
> > implementation, it's impossible to know how many passes the processor
> > will actually make over the source to fulfil the query.
> >
> > However, if we were to assume each FLWOR expression is a pass over the
> > source data, then I think the following implementation could be more
> > efficient. It really depends on how the implementation handles
> > descendant-or-self and ancestor selection.
> >
> >
> > let $basicTypes := /BasicType
> > return
> > <Result>
> >    <BasicTypes>
> >    {
> >        $basicTypes
> >    }
> >    </BasicTypes>
> >    <ProductTypes>
> >    {
> >        for $productType in $basicTypes//ProductType return
> >            <ProductType>
> >            {
> >                $productType/*,
> >                <BasicType ref-id="{$productType/ancestor::BasicType/@id}"/>
> >            }
> >            </ProductType>
> >    }
> >    </ProductTypes>
> >    <SalesItems>
> >    {
> >        for $salesItem in $basicTypes//SalesItem return
> >            <SalesItem>
> >            {
> >                $salesItem/*,
> >                <ProductType ref-id="{$salesItem/ancestor::ProductType/@id}"/>
> >            }
> >            </SalesItem>
> >    }
> >    </SalesItems>
> > </Result>
> >
> >
> >
> > On 10 August 2012 15:32, Robby Pelssers <Robby.Pelssers at nxp.com> wrote:
> >> Hi all,
> >>
> >> Suppose I have a collection of XML documents looking like this:
> >>
> >> Basictype has 1 to many Product types.
> >> Producttype has 1 to many Sales items.
> >>
> >> Example snippet:
> >> ---------------------------------
> >> <BasicType id="PH3330L">
> >>  <Status>End of life</Status>
> >>  ...
> >>  <ProductTypes>
> >>    <ProductType id="xxx">
> >>       <Status>Deprecated</Status>
> >>       ...
> >>       <SalesItems>
> >>         <SalesItem id="yyy">
> >>           <Owner>abcde</Owner>
> >>         </SalesItem>
> >>       </SalesItems>
> >>    </ProductType>
> >>  </ProductTypes>
> >> </BasicType>
> >>
> >> Now I want to generate some data looking like this:
> >>
> >> <Result>
> >>  <BasicTypes>
> >>   <BasicType id="PH3330L">
> >>     <Status>End of life</Status>
> >>    </BasicType>
> >>    ...
> >>  </BasicTypes>
> >>  <ProductTypes>
> >>    <ProductType id="xxx">
> >>       <Status>Deprecated</Status>
> >>       <BasicType ref-id="PH3330L"/>
> >>    </ProductType>
> >>    ...
> >>  </ProductTypes>
> >>  <SalesItems>
> >>    <SalesItem id="yyy">
> >>      <Owner>abcde</Owner>
> >>      <ProductType ref-id="xxx"/>
> >>    </SalesItem>
> >>    ...
> >>  </SalesItems>
> >> </Result>
> >>
> >> -------------
> >> I have written a query which returns just this but it iterates
> >> - three times over the basictypes
> >> - 2 times over the producttypes
> >> - 1 time over the salesitems
> >>
> >> Is there a better way to get this accomplished in 1 iteration?
> >>
> >>
> >> Pseudo-code:
> >>
> >> let $basictypes := collection("basictypes")
> >> return
> >> <Result>
> >>  <BasicTypes>
> >>   {
> >>     for $basictype in $basictypes
> >>     ...do some stuff
> >>
> >>   }
> >>  </BasicTypes>
> >>  <ProductTypes>
> >>    {
> >>       for $basictype in $basictypes
> >>       for $producttype in $basictype/ProductTypes/ProductType
> >>       ...do some stuff
> >>    }
> >>  </ProductTypes>
> >>  <SalesItems>
> >>    {
> >>       for $basictype in $basictypes
> >>       for $producttype in $basictype/ProductTypes/ProductType
> >>       for $salesitem in $producttype/SalesItems/SalesItem
> >>       ...do some stuff
> >>    }
> >>  </SalesItems>
> >> </Result>
> >>
> >> Robby Pelssers
> >>
> >>
> >> _______________________________________________
> >> talk at x-query.com
> >> http://x-query.com/mailman/listinfo/talk
> >
> >
> >
> > --
> > Adam Retter
> >
> > skype: adam.retter
> > tweet: adamretter
> > http://www.adamretter.org.uk
> >
> 




