[xquery-talk] Count a specific word in a document
Michael Strasser
M.Strasser at gpo.com
Wed Jun 13 15:11:11 PDT 2007
I am learning XQuery and have set myself a little task that I can't
currently manage.
I have an XHTML document with the complete text of Mendelssohn's
oratorio "Elijah" and want to use XQuery to count the number of times
the character of Elijah sings the word "Lord". I was inspired by
Jonathan Robie's blog post last year about word counts of DocBook
documents. (I copied his tokenize() example without fully understanding
it yet.)
I have isolated Elijah's speeches and converted the words to a sequence
of string tokens:
    for $elijah in doc("/db/mjs/ElijahLibretto.xhtml")/html
    let $elijah-para := $elijah//td/p[i/text() = 'Elijah']
    let $txt := string-join($elijah-para/text(), " ")
    let $words := tokenize($txt, "(\s|[,.!:;]|[n][b][s][p][;])+")
I can't figure out how to count the number of string tokens that are
'Lord'. I can get them with:
    for $word in $words
    return $word[$word = 'Lord']
but I can't seem to get the count of them.
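For reference, one possible way to get the count is to filter the token sequence with a predicate and wrap the result in count(). This is only a minimal sketch: the sample sequence is a hypothetical stand-in for the real $words tokens, and the comparison is assumed to be case-sensitive.

```xquery
(: Hypothetical sample data standing in for the tokens in $words :)
let $words := ("The", "Lord", "is", "God", "Lord", "lord")
(: The predicate keeps only the items equal to 'Lord';
   count() then returns how many items remain :)
return count($words[. = 'Lord'])
```

With this sample data the query returns 2, because "lord" in lower case does not match. In the real query the same return clause would follow the let clause that binds $words.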
Thanks in advance for any help.
Michael Strasser
Brisbane Australia