[xquery-talk] XQuery simulation of XSLT 2.0 grouping

David Sewell dsewell at virginia.edu
Sat Oct 22 13:36:07 PDT 2005

Early in his XSLT 2.0 Programmer's Reference, Michael Kay presents an
example of the power of XSLT 2.0 by giving the brief code required to
produce a word frequency list, sorted in descending order of frequency,
for all the words in a document (using Shakespeare's "Othello" in XML as
an example). This is the template that does the work:

  <xsl:template match="/">
      <xsl:for-each-group group-by="." select="
            for $w in tokenize(string(.), '\W+') return lower-case($w)">
        <xsl:sort select="count(current-group())" order="descending"/>
        <word word="{current-grouping-key()}" frequency="{count(current-group())}"/>
      </xsl:for-each-group>
  </xsl:template>

The following XQuery produces the identical output:

  declare variable $corpus :=
      for $w in tokenize(doc("othello.xml"), '\W+') return lower-case($w);
  declare variable $wordList := distinct-values($corpus);
  <wordcount> {
       for $w in $wordList
       let $freq := count($corpus[. eq $w])
       order by $freq descending
       return <word word="{$w}" frequency="{$freq}"/>
  } </wordcount>

However, on my system the XSLT version takes 1.93 seconds to execute
using Saxon 8.5.1, while the XQuery takes 210 seconds. I realize that
XQuery 1.0 does not contain the grouping facilities of XSLT 2.0, but
I still have a couple of questions:

1. Am I overlooking a more efficient way of writing the query?

2. If not, is it assumed that one must rely on implementation-dependent
   optimization for this type of XQuery code, or possibly on
   extension functions?
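
As a possible point of comparison for question 1: evaluated naively,
count($corpus[. eq $w]) rescans the whole token sequence once for every
distinct word, which would account for much of the difference. The sketch
below assumes a processor that supports the "group by" clause introduced
in XQuery 3.0. That clause is not part of XQuery 1.0, so this is an
assumption about the environment rather than a portable fix, and it has
not been timed against the figures above. The variable name $key is only
illustrative.

```xquery
xquery version "3.0";

<wordcount> {
    (: Same tokenization and lower-casing as in the query above. :)
    for $w in tokenize(string(doc("othello.xml")), '\W+')
    let $key := lower-case($w)
    (: After grouping, $w is bound to every token that shares $key. :)
    group by $key
    let $freq := count($w)
    order by $freq descending
    return <word word="{$key}" frequency="{$freq}"/>
} </wordcount>
```

Written this way, the query no longer asks for a separate scan of the
corpus per distinct word, though how the grouping is actually carried out
remains up to the processor.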


David Sewell, Editorial and Technical Manager
Electronic Imprint, The University of Virginia Press
PO Box 400318, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: dsewell at virginia.edu   Tel: +1 434 924 9973
Web: http://www.ei.virginia.edu/
