[xquery-talk] Partitioning data with XQuery

Michael Kay mhk at mhk.me.uk
Tue Apr 26 18:35:09 PDT 2005


Fair enough. It just takes a bit of getting used to writing
$x/../node()[.>>$x][1] when you are used to writing
$x/following-sibling::node()[1]. Makes life tough for the optimizer, too...
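
For anyone wanting to compare the two forms side by side, here is a minimal sketch. The inline $doc fragment and its element names are made up for illustration, and the first expression assumes a processor that supports the optional following-sibling axis:

```xquery
(: $doc is a hypothetical fragment, not data from this thread :)
let $doc := <record><a/>some text<b/><c/></record>
let $x := $doc/a
return (
  (: using the following-sibling axis, where the processor offers it :)
  $x/following-sibling::node()[1],
  (: using only the parent step and the node-order comparison >> :)
  $x/../node()[. >> $x][1]
)
```

Both expressions should select the same node, the text node that immediately follows <a/>.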

Michael Kay
http://www.saxonica.com/

 

> -----Original Message-----
> From: Michael Rys [mailto:mrys at microsoft.com] 
> Sent: 26 April 2005 16:59
> To: Michael Kay; howardk at fatdog.com; xquery-talk
> Subject: RE: [xquery-talk] Partitioning data with XQuery
> 
> Hi Mike, even a general expression for 1<=n<=x groups is not as
> difficult in XQuery without using the following sibling axis:
> 
> for $r in /record/subRecordStart
> return
>   <record>{
>     /record/node()[(. << (/record/subRecordStart[. >> $r])[1] and . >> $r)
>         or ($r is (/record/subRecordStart)[last()] and . >> $r)]
>   }</record>
> 
> Best regards
> Michael
> 
> > -----Original Message-----
> > From: talk-bounces at xquery.com [mailto:talk-bounces at xquery.com] On Behalf Of Michael Kay
> > Sent: Tuesday, April 26, 2005 1:19 AM
> > To: howardk at fatdog.com; 'xquery-talk'
> > Subject: RE: [xquery-talk] Partitioning data with XQuery
> > 
> > In the XSLT world this problem is known as "positional grouping", and it
> > may be worth googling for that to see whether any of the established
> > techniques used with XSLT 1.0 are applicable with XQuery.
> > 
> > In XSLT 2.0 the problem becomes easy:
> > 
> > <xsl:template match="record">
> > <xsl:for-each-group select="*"
> >   group-starting-with="subRecordStart">
> >   <record>
> >     <xsl:copy-of select="remove(current-group(), 1)"/>
> >   </record>
> > </xsl:for-each-group>
> > </xsl:template>
> > 
> > In XSLT 1.0 there were two techniques generally used for the general
> > problem. The first was to turn it into a value-based grouping problem,
> > but this relies on the generate-id() function, which returns a
> > representation of the node identity as a string and isn't available in
> > XQuery. The other technique is to recurse over the sibling axis,
> > typically using xsl:apply-templates. Assuming that your XQuery
> > implementation offers the following-sibling axis (life gets really hard
> > if it doesn't...) you can still do this recursive scan, though the
> > coding is a bit different.
> > 
> > However, in your case the problem is a lot easier than the general
> > positional grouping problem because you know that there are exactly two
> > sub-records. This means you can do something like:
> > 
> > let $first := subRecordStart[1],
> >     $second := subRecordStart[2]
> > return (
> >     <record>{*[. >> $first and . << $second]}</record>,
> >     <record>{*[. >> $second]}</record>
> > )
> > 
> > Michael Kay
> > 
> > > -----Original Message-----
> > > From: talk-bounces at xquery.com
> > > [mailto:talk-bounces at xquery.com] On Behalf Of Howard Katz
> > > Sent: 26 April 2005 00:30
> > > To: xquery-talk
> > > Subject: [xquery-talk] Partitioning data with XQuery
> > >
> > > I need to repartition some XML data using XQuery, and I can't see how
> > > to do it. The basic data looks something like this:
> > >
> > > <record>
> > >      <subRecordStart/>    (: marks start of new sub-record :)
> > >      some pcdata_1
> > >      <someElement_1/>
> > >      some more pcdata_1
> > >      <anotherElement_1/>
> > >      ... etc
> > >
> > >      <subRecordStart/>    (: marks start of new sub-record :)
> > >      some more pc data_2
> > >      <yetAnotherElement_2/>
> > >      yet some more pc data_2
> > >      <andYetAnotherElement_2/>
> > >      ... etc
> > > </record>
> > > ...
> > >
> > > The contents of each <record> consist of exactly two <subRecordStart/>
> > > elements, plus some undetermined mixture of elements and text nodes.
> > > Each <record> needs to be replaced by two new <record> elements formed
> > > by partitioning its contents into two parts. The place where each new
> > > record is to begin is indicated by a <subRecordStart/> marker, with the
> > > first <subRecordStart/> marker being the first element child of
> > > <record>. Other than that and the fact there are exactly two markers
> > > per record, the rest of the contents are not known in advance.
> > >
> > > On application of the appropriate XQuery, the single record above would
> > > be replaced by the following two:
> > >
> > > <record>
> > >      some pcdata_1
> > >      <someElement_1/>
> > >      some more pcdata_1
> > >      <anotherElement_1/>
> > > </record>
> > > <record>
> > >      some more pc data_2
> > >      <yetAnotherElement_2/>
> > >      yet some more pc data_2
> > >      <andYetAnotherElement_2/>
> > > </record>
> > >
> > > This doesn't look difficult, but a solution eludes me. Can somebody
> > > suggest an XQuery that would be able to do this?
> > > TIA,
> > > Howard
> > >
> > >
> > > _______________________________________________
> > > talk at xquery.com
> > > http://xquery.com/mailman/listinfo/talk
> > >
> > 
> > 
> 
> 



