[xquery-talk] how to optimaly denormalize 1-to-many relationships with XQuery
David Lee
dlee at calldei.com
Mon Aug 13 06:04:51 PDT 2012
> But as a side note: Do you have any insight in how expensive child, parent,
> ancestor and descendant operations are? Should one avoid those if possible?
The answer depends entirely on the processor and the system configuration.
I suggest asking on a mailing list specific to that processor or vendor.
(Who knows, I might see you on that list too!)
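
One rough way to find out on a given processor is to run the two navigation
styles from this thread side by side and time them there. The following is
only a minimal sketch under assumptions: $doc is a made-up inline sample
shaped like the snippet quoted below, and the "via" attribute is a
hypothetical marker added just to tell the two results apart. Neither comes
from the original queries, and which variant is cheaper is exactly the part
that varies by processor.

```xquery
(: Sketch only: $doc is an illustrative inline sample, not real data. :)
let $doc :=
  <BasicType id="PH3330L">
    <Status>End of life</Status>
    <ProductTypes>
      <ProductType id="xxx">
        <Status>Deprecated</Status>
        <SalesItems>
          <SalesItem id="yyy"><Owner>abcde</Owner></SalesItem>
        </SalesItems>
      </ProductType>
    </ProductTypes>
  </BasicType>
return
<SalesItems>
{
  (: Variant A: select the items with the descendant axis, then walk
     back up with the ancestor axis to find the owning ProductType. :)
  for $salesItem in $doc//SalesItem
  return
    <SalesItem via="ancestor">
    {
      $salesItem/@id,
      $salesItem/*,
      <ProductType ref-id="{$salesItem/ancestor::ProductType/@id}"/>
    }
    </SalesItem>,

  (: Variant B: nested for clauses keep the parent bound to a variable,
     so only child steps are needed and no upward navigation occurs. :)
  for $productType in $doc/ProductTypes/ProductType
  for $salesItem in $productType/SalesItems/SalesItem
  return
    <SalesItem via="nested-for">
    {
      $salesItem/@id,
      $salesItem/*,
      <ProductType ref-id="{$productType/@id}"/>
    }
    </SalesItem>
}
</SalesItems>
```

Both variants should produce the same SalesItem content apart from the marker
attribute, so any timing difference on a large collection reflects how that
particular processor implements the axes.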
----------------------------------------
David A. Lee
dlee at calldei.com
http://www.xmlsh.org
> -----Original Message-----
> From: Robby Pelssers [mailto:Robby.Pelssers at nxp.com]
> Sent: Monday, August 13, 2012 4:43 AM
> To: David Lee; Adam Retter
> Cc: talk at x-query.com
> Subject: RE: [xquery-talk] how to optimaly denormalize 1-to-many
> relationships with XQuery
>
> In fact that query returns results under 1 second over a collection of 20k
> documents. The documents can be 50k - 100k in size but the data I need to
> extract is not deeply nested in the document hierarchy.
>
> So performance is not (yet) a real focus point but it was more of a 'Can I
> improve on this or learn a new trick' question ;-)
>
> But as a side note: Do you have any insight in how expensive child, parent,
> ancestor and descendant operations are? Should one avoid those if possible?
>
> Robby
>
> -----Original Message-----
> From: David Lee [mailto:dlee at calldei.com]
> Sent: Sunday, August 12, 2012 9:23 PM
> To: Adam Retter
> Cc: Robby Pelssers; talk at x-query.com
> Subject: Re: [xquery-talk] how to optimaly denormalize 1-to-many
> relationships with XQuery
>
> I wouldn't necessarily worry about iteration efficiency until you try it.
> Often iteration is cheap compared to element creation. It all depends on the
> processor, data, and other issues.
> Remember "iteration" doesn't necessarily imply temporal iteration, unlike in
> procedural languages.
>
>
> Sent from my iPad (excuse the terseness)
> David A Lee
> dlee at calldei.com
>
>
> On Aug 12, 2012, at 7:02 AM, "Adam Retter" <adam.retter at googlemail.com>
> wrote:
>
> > I don't think you can do this in one pass. However, because it depends on
> > the implementation, it's impossible to know how many passes the processor
> > will actually make over the source to fulfil the query.
> >
> > However, if we were to assume each FLWOR expression is a pass over the
> > source data, then I think the following implementation could be more
> > efficient. It really depends on how the implementation handles
> > descendant-or-self and ancestor selection.
> >
> >
> > let $basicTypes := /BasicType
> > return
> > <Result>
> >   <BasicTypes>
> >   {
> >     $basicTypes
> >   }
> >   </BasicTypes>
> >   <ProductTypes>
> >   {
> >     for $productType in $basicTypes//ProductType return
> >       <ProductType>
> >       {
> >         $productType/*,
> >         <BasicType ref-id="{$productType/ancestor::BasicType/@id}"/>
> >       }
> >       </ProductType>
> >   }
> >   </ProductTypes>
> >   <SalesItems>
> >   {
> >     for $salesItem in $basicTypes//SalesItem return
> >       <SalesItem>
> >       {
> >         $salesItem/*,
> >         <ProductType ref-id="{$salesItem/ancestor::ProductType/@id}"/>
> >       }
> >       </SalesItem>
> >   }
> >   </SalesItems>
> > </Result>
> >
> >
> >
> > On 10 August 2012 15:32, Robby Pelssers <Robby.Pelssers at nxp.com> wrote:
> >> Hi all,
> >>
> >> Suppose I have a collection of XML documents looking like this:
> >>
> >> Basictype has 1 to many Product types.
> >> Producttype has 1 to many Sales items.
> >>
> >> Example snippet:
> >> ---------------------------------
> >> <BasicType id="PH3330L">
> >>   <Status>End of life</Status>
> >>   ...
> >>   <ProductTypes>
> >>     <ProductType id="xxx">
> >>       <Status>Deprecated</Status>
> >>       ...
> >>       <SalesItems>
> >>         <SalesItem id="yyy">
> >>           <Owner>abcde</Owner>
> >>         </SalesItem>
> >>       </SalesItems>
> >>     </ProductType>
> >>   </ProductTypes>
> >> </BasicType>
> >>
> >> Now I want to generate some data looking like this:
> >>
> >> <Result>
> >>   <BasicTypes>
> >>     <BasicType id="PH3330L">
> >>       <Status>End of life</Status>
> >>     </BasicType>
> >>     ...
> >>   </BasicTypes>
> >>   <ProductTypes>
> >>     <ProductType id="xxx">
> >>       <Status>Deprecated</Status>
> >>       <BasicType ref-id="PH3330L"/>
> >>     </ProductType>
> >>     ...
> >>   </ProductTypes>
> >>   <SalesItems>
> >>     <SalesItem id="yyy">
> >>       <Owner>abcde</Owner>
> >>       <ProductType ref-id="xxx"/>
> >>     </SalesItem>
> >>     ...
> >>   </SalesItems>
> >> </Result>
> >>
> >> -------------
> >> I have written a query which returns just this but it iterates
> >> - three times over the basictypes
> >> - 2 times over the producttypes
> >> - 1 time over the salesitems
> >>
> >> Is there a better way to get this accomplished in 1 iteration?
> >>
> >>
> >> Pseudo-code:
> >>
> >> let $basictypes := collection("basictypes")
> >> return
> >> <Result>
> >>   <BasicTypes>
> >>   {
> >>     for $basictype in $basictypes
> >>     ...do some stuff
> >>
> >>   }
> >>   </BasicTypes>
> >>   <ProductTypes>
> >>   {
> >>     for $basictype in $basictypes
> >>     for $producttype in $basictype/ProductTypes/ProductType
> >>     ...do some stuff
> >>   }
> >>   </ProductTypes>
> >>   <SalesItems>
> >>   {
> >>     for $basictype in $basictypes
> >>     for $producttype in $basictype/ProductTypes/ProductType
> >>     for $salesitem in $producttype/SalesItems/SalesItem
> >>     ...do some stuff
> >>   }
> >>   </SalesItems>
> >> </Result>
> >>
> >> Robby Pelssers
> >>
> >>
> >> _______________________________________________
> >> talk at x-query.com
> >> http://x-query.com/mailman/listinfo/talk
> >
> >
> >
> > --
> > Adam Retter
> >
> > skype: adam.retter
> > tweet: adamretter
> > http://www.adamretter.org.uk
> >
>