[xquery-talk] XQuery function to do elementX/not-elementX chunking?

David Sewell dsewell at virginia.edu
Tue Sep 13 17:01:23 PDT 2005


Given an XML element like this:

  <ref>This text has <i>italics</i> and <ref>an embedded ref</ref> and
    more text including <b>boldface</b> and <ref>another ref</ref>
    and a bit more text.</ref>

I want to break this into a sequence of sequences of its node children
like so (with text nodes represented as strings, ignoring linebreaks):

(
  ( 'This text has ', <i>italics</i>, ' and ' ) ,
  <ref>an embedded ref</ref>,
  ( ' and more text including ', <b>boldface</b>, ' and ' ),
  <ref>another ref</ref>,
  ' and a bit more text.'
)

In other words, pull out alternating sequences of (1) <ref> elements and
(2) other nodes that are not <ref> elements. (The practical application is
so that the <ref>s can be transformed into HTML <span>s without
permitting embedded <span>s -- nested <span>s are legal HTML but cause
certain problems.)
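
To pin down the grouping described above, here is a minimal sketch in
Python rather than XQuery. It is only an illustration of the intended
chunking, not a proposed XQuery solution. The names SAMPLE, is_ref,
chunk_children and show are made up for this example, and the sample
string is the <ref> element above with its linebreaks collapsed to
single spaces.

```python
from itertools import groupby
from xml.dom import minidom

# The example element from this post, with linebreaks collapsed (assumption).
SAMPLE = (
    "<ref>This text has <i>italics</i> and <ref>an embedded ref</ref> and "
    "more text including <b>boldface</b> and <ref>another ref</ref> "
    "and a bit more text.</ref>"
)


def is_ref(node):
    """True for a <ref> element; False for text nodes and other elements."""
    return node.nodeType == node.ELEMENT_NODE and node.tagName == "ref"


def chunk_children(parent):
    """Split parent's child nodes into alternating chunks.

    Each <ref> child becomes its own one-item chunk; each maximal run of
    non-<ref> siblings becomes a single chunk.
    """
    chunks = []
    for ref_run, nodes in groupby(parent.childNodes, key=is_ref):
        nodes = list(nodes)
        if ref_run:
            # Adjacent <ref>s are kept as separate chunks.
            chunks.extend([n] for n in nodes)
        else:
            chunks.append(nodes)
    return chunks


def show(node):
    """Render text nodes as plain strings and elements as markup."""
    return node.data if node.nodeType == node.TEXT_NODE else node.toxml()


root = minidom.parseString(SAMPLE).documentElement
result = [[show(n) for n in chunk] for chunk in chunk_children(root)]
for chunk in result:
    print(chunk)
```

The sketch only looks at direct children of the outer <ref>, which is
what the example calls for; a <ref> nested inside an <i> or <b> would
stay inside its non-<ref> chunk.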

I was able to do this by writing a 10-line function that relies on a
fairly clunky process of selecting all the <ref> children and then
chunking the other nodes that precede and/or follow them; it makes
some fairly ugly use of the preceding-sibling:: and following-sibling::
axes, name(), and the '>>' operator. It's so ugly that I don't want to
inflict it on this list (unless someone insists).

Does anyone have a simple, elegant way to do this?

-- 
David Sewell, Editorial and Technical Manager
Electronic Imprint, The University of Virginia Press
PO Box 400318, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: dsewell at virginia.edu   Tel: +1 434 924 9973
Web: http://www.ei.virginia.edu/

