[xquery-talk] search hit count

Martin Probst martin at x-hive.com
Wed Jul 19 00:37:13 PDT 2006


Hi,

>   I'm using Berkeley DB XML, coupled with cocoon, using xquery to
>   search a set of XML documents. I need to keep track of how many
>   times the search phrase is found. The count() function gives me the
>   number of documents in which the search phrase is found, but not
>   the individual hit count.
>
>   Does anyone know a simple way of getting the individual hit count?

That depends on the kind of query. If it's a path-like query, e.g.
doc('...')[ //p[contains(., 'foo')] ]
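it should be easy to refactor the query so that it counts the matching
elements rather than the documents. In case of a simple text query,
you might get something close to the real number using tokenize(),
e.g.
for $f in //foo[contains(., 'foo')]
return count(tokenize($f, 'foo')) - 1
(tokenize() returns one more piece than there are matches, hence the
"- 1". Note that its second argument is a regular expression, not a
plain string.)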
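
For the path-like case, a minimal sketch of such a refactoring could look like the following. The inline $docs sequence is only a stand-in so that the example is self-contained; in a real setup it would be your own doc() or collection() call. The idea is to select the matching elements themselves, so that both the document count and the element count can be derived from the same sequence.

```xquery
(: Stand-in data, not from the original question. :)
let $docs := (
  document { <doc><p>foo bar</p><p>foo</p></doc> },
  document { <doc><p>baz</p></doc> },
  document { <doc><p>a foo</p></doc> }
)
(: Select the matching elements instead of the documents containing them. :)
let $hits := $docs//p[contains(., 'foo')]
return
  (: $hits/root() yields each owning document once, so this gives
     2 documents and 3 matching elements for the data above. :)
  <result documents="{count($hits/root())}" hits="{count($hits)}"/>
```

This still counts matching elements, not occurrences of the phrase inside an element.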

This is however very crude. Everything else will need more  
sophisticated text handling (read: more work and less speed).
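
As one possible form of that extra text handling, here is a sketch that counts non-overlapping occurrences of a literal phrase with substring-after(), so that regular expression metacharacters in the phrase do not matter. The function name local:occurrences and the sample data are made up for illustration. It recurses once per hit, so it may be slow or hit recursion limits on very large text nodes, depending on the processor.

```xquery
(: Hypothetical helper: counts non-overlapping occurrences of $phrase in $text. :)
declare function local:occurrences($text as xs:string, $phrase as xs:string)
  as xs:integer
{
  (: An empty phrase would match everywhere; treat it as no hit. :)
  if ($phrase = '' or not(contains($text, $phrase))) then 0
  else 1 + local:occurrences(substring-after($text, $phrase), $phrase)
};

(: Stand-in data, not from the original question. :)
let $sample := <doc><p>foo bar foo</p><p>no match</p><p>foofoo</p></doc>
return
  (: 2 + 0 + 2 = 4 for the data above. :)
  sum(for $p in $sample//p return local:occurrences(string($p), 'foo'))
```

Like contains(), this is case-sensitive and knows nothing about word boundaries, so "foofoo" counts as two hits.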

I guess what you actually want is some sort of search ranking. There
is a proposed extension to XQuery, called XQuery Full Text, which
includes operators for this. No idea if Berkeley DB XML implements
them, though.
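
To give an idea of what a ranked query looks like, here is a sketch in the syntax of the later W3C XQuery and XPath Full Text 1.0 Recommendation (earlier drafts spelled the operator "ftcontains"). It assumes a processor that implements the extension, which is not a given, and the sample data is a stand-in. Note that the score is an implementation-defined relevance value, not a hit count.

```xquery
(: Requires a processor with XQuery and XPath Full Text 1.0 support. :)
(: Stand-in data, not from the original question. :)
let $doc := <doc><p>foo bar foo</p><p>no match</p><p>some foo</p></doc>
(: "score $s" binds a relevance score for each matching element. :)
for $p score $s in $doc//p[. contains text 'foo']
order by $s descending
return <hit score="{$s}">{ string($p) }</hit>
```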

HTH,
Martin Probst
X-Hive

