[xquery-talk] Partitioning data with XQuery

Michael Kay mhk at mhk.me.uk
Tue Apr 26 10:19:10 PDT 2005


In the XSLT world this problem is known as "positional grouping" and it may
be worth googling for that to see whether any of the established techniques
used with XSLT 1.0 are applicable with XQuery. 

In XSLT 2.0 the problem becomes easy:

<xsl:template match="record">
<xsl:for-each-group select="*" 
  group-starting-with="subRecordStart">
  <record>
    <xsl:copy-of select="remove(current-group(), 1)"/>
  </record>
</xsl:for-each-group>
</xsl:template>

In XSLT 1.0 two techniques were commonly used for the general problem. The
first was to turn it into a value-based grouping problem, but this relies on
the generate-id() function, which returns a string representation of node
identity and isn't available in XQuery. The other technique is to recurse
over the sibling axis, typically using xsl:apply-templates. Assuming that
your XQuery implementation offers the following-sibling axis (life gets
really hard if it doesn't...), you can still do this recursive scan, though
the coding is a bit different.
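
A minimal sketch of that recursive sibling scan in XQuery, assuming the
following-sibling axis is supported and that the <record> elements can be
reached as //record from the context document. The function name
local:group-members is made up for illustration, and node() is used rather
than * so that text nodes are carried along with the elements:

```xquery
(: Hypothetical helper: returns $n and its following siblings,
   stopping just before the next subRecordStart marker. :)
declare function local:group-members($n as node()?) as node()* {
  if (empty($n) or $n/self::subRecordStart)
  then ()
  else ($n, local:group-members($n/following-sibling::node()[1]))
};

(: One new <record> per marker; the marker itself is not copied. :)
for $start in //record/subRecordStart
return
  <record>{ local:group-members($start/following-sibling::node()[1]) }</record>
```

This does not depend on the number of markers per record, but it recurses
once per sibling node, so very long sibling lists may be a concern on
implementations that do not optimise tail calls.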

However, in your case the problem is a lot easier than the general
positional grouping problem because you know that there are exactly two
sub-records. This means you can do something like:

let $first := subRecordStart[1],
    $second := subRecordStart[2]
return (
    <record>{*[. >> $first and . << $second]}</record>,
    <record>{*[. >> $second]}</record>
)
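
For reference, here is a sketch of how the fragment above might look as a
complete query. It assumes the records are reachable as //record from the
context document, and the variable name $rec is only illustrative. Because *
selects element children only, this variant uses node() so that the text
nodes in the mixed content are kept as well:

```xquery
for $rec in //record
let $first  := $rec/subRecordStart[1],
    $second := $rec/subRecordStart[2]
return (
    (: nodes after the first marker and before the second :)
    <record>{ $rec/node()[. >> $first and . << $second] }</record>,
    (: nodes after the second marker :)
    <record>{ $rec/node()[. >> $second] }</record>
)
```

The << and >> operators compare nodes by document order, so neither marker
is included in either output record.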

Michael Kay

> -----Original Message-----
> From: talk-bounces at xquery.com 
> [mailto:talk-bounces at xquery.com] On Behalf Of Howard Katz
> Sent: 26 April 2005 00:30
> To: xquery-talk
> Subject: [xquery-talk] Partitioning data with XQuery
> 
> I need to repartition some XML data using XQuery, and I can't see how to
> do it. The basic data looks something like this:
> 
> <record>
>      <subRecordStart/>		(: marks start of new sub-record :)
>      some pcdata_1
>      <someElement_1/>
>      some more pcdata_1
>      <anotherElement_1/>
>      ... etc
> 
>      <subRecordStart/>		(: marks start of new sub-record :)
>      some more pc data_2
>      <yetAnotherElement_2/>
>      yet some more pc data_2
>      <andYetAnotherElement_2/>
>      ... etc
> </record>
> ...
> 
> The contents of each <record> consist of exactly two <subRecordStart/>
> elements, plus some undetermined mixture of elements and text nodes. Each
> <record> needs to be replaced by two new <record> elements formed by
> partitioning its contents into two parts. The place where each new record
> is to begin is indicated by a <subRecordStart/> marker, with the first
> <subRecordStart/> marker being the first element child of <record>. Other
> than that and the fact there are exactly two markers per record, the rest
> of the contents are not known in advance.
> 
> On application of the appropriate XQuery, the single record above would
> be replaced by the following two:
> 
> <record>
>      some pcdata_1
>      <someElement_1/>
>      some more pcdata_1
>      <anotherElement_1/>
> </record>
> <record>
>      some more pc data_2
>      <yetAnotherElement_2/>
>      yet some more pc data_2
>      <andYetAnotherElement_2/>
> </record>
> 
> This doesn't look difficult, but a solution eludes me. Can somebody
> suggest an XQuery that would be able to do this?
> TIA,
> Howard