[xquery-talk] Partitioning data with XQuery
Michael Kay
mhk at mhk.me.uk
Tue Apr 26 18:35:09 PDT 2005
Fair enough. It just takes a bit of getting used to writing
$x/../node()[.>>$x][1] when you are used to writing
$x/following-sibling::node()[1]. Makes life tough for the optimizer, too...
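
For anyone following along, here is a rough sketch of that rewrite. The
helper name local:next-sibling and the sample <record> element are made up
for illustration. It assumes $x is a child node rather than an attribute,
and that the processor supports the following-sibling axis so the two forms
can be compared:

```xquery
(: Hypothetical helper: the next sibling of $x, found without the
   following-sibling axis. It steps up to the parent, keeps the children
   that come after $x in document order, and takes the first of them.
   Assumption: $x is a child node, not an attribute. :)
declare function local:next-sibling($x as node()) as node()? {
  ($x/../node()[. >> $x])[1]
};

(: Made-up sample data, loosely modelled on the record in this thread. :)
let $doc := <record><subRecordStart/>some pcdata_1<someElement_1/></record>
let $x := $doc/subRecordStart[1]
return
  (: Compare node identity of the two results. :)
  local:next-sibling($x) is $x/following-sibling::node()[1]
```

With the sample above both expressions should select the same text node, so
the comparison should come out true. The parent-and-filter form looks at
every child of the parent, which is part of why it is harder on the
optimizer.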
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: Michael Rys [mailto:mrys at microsoft.com]
> Sent: 26 April 2005 16:59
> To: Michael Kay; howardk at fatdog.com; xquery-talk
> Subject: RE: [xquery-talk] Partitioning data with XQuery
>
> Hi Mike, even a general expression for 1<=n<=x groups is not as
> difficult in XQuery without using the following sibling axis:
>
> for $r in /record/subRecordStart
> return
>   <record>{
>     /record/node()[(. << (/record/subRecordStart[. >> $r])[1] and . >> $r)
>       or ($r is (/record/subRecordStart)[last()] and . >> $r)]
>   }</record>
>
> Best regards
> Michael
>
> > -----Original Message-----
> > From: talk-bounces at xquery.com [mailto:talk-bounces at xquery.com] On Behalf Of Michael Kay
> > Sent: Tuesday, April 26, 2005 1:19 AM
> > To: howardk at fatdog.com; 'xquery-talk'
> > Subject: RE: [xquery-talk] Partitioning data with XQuery
> >
> > In the XSLT world this problem is known as "positional grouping", and it
> > may be worth googling for that to see whether any of the established
> > techniques used with XSLT 1.0 are applicable with XQuery.
> >
> > In XSLT 2.0 the problem becomes easy:
> >
> > <xsl:template match="record">
> >   <xsl:for-each-group select="*"
> >                       group-starting-with="subRecordStart">
> >     <record>
> >       <xsl:copy-of select="remove(current-group(), 1)"/>
> >     </record>
> >   </xsl:for-each-group>
> > </xsl:template>
> >
> > In XSLT 1.0 there were two techniques generally used for the general
> > problem. The first was to turn it into a value-based grouping problem,
> > but this relies on the generate-id() function, which returns a
> > representation of the node identity as a string and isn't available in
> > XQuery. The other technique is to recurse over the sibling axis,
> > typically using xsl:apply-templates. Assuming that your XQuery
> > implementation offers the following-sibling axis (life gets really hard
> > if it doesn't...), you can still do this recursive scan, though the
> > coding is a bit different.
> >
> > However, in your case the problem is a lot easier than the general
> > positional grouping problem because you know that there are exactly two
> > sub-records. This means you can do something like:
> >
> > let $first := subRecordStart[1],
> >     $second := subRecordStart[2]
> > return (
> >   <record>{*[. >> $first and . << $second]}</record>,
> >   <record>{*[. >> $second]}</record>
> > )
> >
> > Michael Kay
> >
> > > -----Original Message-----
> > > From: talk-bounces at xquery.com
> > > [mailto:talk-bounces at xquery.com] On Behalf Of Howard Katz
> > > Sent: 26 April 2005 00:30
> > > To: xquery-talk
> > > Subject: [xquery-talk] Partitioning data with XQuery
> > >
> > > I need to repartition some XML data using XQuery, and I can't see how
> > > to do it. The basic data looks something like this:
> > >
> > > <record>
> > > <subRecordStart/> (: marks start of new sub-record :)
> > > some pcdata_1
> > > <someElement_1/>
> > > some more pcdata_1
> > > <anotherElement_1/>
> > > ... etc
> > >
> > > <subRecordStart/> (: marks start of new sub-record :)
> > > some more pc data_2
> > > <yetAnotherElement_2/>
> > > yet some more pc data_2
> > > <andYetAnotherElement_2/>
> > > ... etc
> > > </record>
> > > ...
> > >
> > > The contents of each <record> consist of exactly two <subRecordStart/>
> > > elements, plus some undetermined mixture of elements and text nodes.
> > > Each <record> needs to be replaced by two new <record> elements formed
> > > by partitioning its contents into two parts. The place where each new
> > > record is to begin is indicated by a <subRecordStart/> marker, with the
> > > first <subRecordStart/> marker being the first element child of
> > > <record>. Other than that and the fact there are exactly two markers
> > > per record, the rest of the contents are not known in advance.
> > >
> > > On application of the appropriate XQuery, the single record above
> > > would be replaced by the following two:
> > >
> > > <record>
> > > some pcdata_1
> > > <someElement_1/>
> > > some more pcdata_1
> > > <anotherElement_1/>
> > > </record>
> > > <record>
> > > some more pc data_2
> > > <yetAnotherElement_2/>
> > > yet some more pc data_2
> > > <andYetAnotherElement_2/>
> > > </record>
> > >
> > > This doesn't look difficult, but a solution eludes me. Can somebody
> > > suggest an XQuery that would be able to do this?
> > > TIA,
> > > Howard
> > >
> > >
> > > _______________________________________________
> > > talk at xquery.com
> > > http://xquery.com/mailman/listinfo/talk
> > >
> >
> >
>
>