[xquery-talk] Function and Query Evaluation with No XML Tags Error

Wei, Alice J. ajwei at indiana.edu
Fri Feb 29 08:46:55 PST 2008


Hi, Michael:

  I found that with the method Kevin provided, the output it extracts is still not even ranked, so I guess going back to distinct-values() may be the way to go.

  Is this inside the function itself?

Here is the XQuery:

declare variable $data := doc("1.xml")//ad;

declare variable $s := $data//head[contains(upper-case(.), 'BOOK')];

declare function local:unique-nodes-by-value($seq as element()*) as element()*
{
 $seq[not(string(.)=preceding-sibling::*/string(.))]
};

local:unique-nodes-by-value($s)

Output:

<?xml version="1.0" encoding="UTF-8"?>
   <head type="sub">Comic Book Back issues</head>
   <head type="sub">We Buy and Sell Comic Books</head>
   <head type="sub">FREE BOOK</head>
   <head type="sub"> GIANT NEW COMIC BOOK CATALOG #6</head>
   <head type="main"> Selling - COMIC BOOKS -Buying <lb/> $100,000 Inventory<lb/>
                        Marvel, D.C., Golden Age </head>
   <head type="main">COMIC Books</head>
   <head>COMIC BOOKS Wanted</head>
   <head type="main">OLD Comic Books!</head>
   <head>Old Comic Books</head>

======================================================
Alice Wei
MIS 2008
School of Library and Information Science
Indiana University Bloomington
ajwei at indiana.edu
________________________________________
From: Michael Kay [mike at saxonica.com]
Sent: Friday, February 29, 2008 4:30 AM
To: 'Michael Kay'; 'Kevin Grover'; Wei, Alice J.
Cc: talk at x-query.com
Subject: RE: [xquery-talk] Function and Query Evaluation with No XML Tags Error

> Assuming that the typed value is the same as the string
> value, you can write
>
> $seq[not(. = preceding-sibling::*)]
>
> If you really need the string value, it's
>
> $seq[not(./string() = preceding-sibling::*/string())]
>
> But using distinct-values() is likely to be a lot more efficient.
>

Actually, I failed to spot another error here. The author of the function
has no way of knowing that the nodes in $seq will be siblings of each other.
Therefore, using preceding-sibling to eliminate duplicates is not just
inefficient, it is plain wrong.
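
A minimal sketch of a version that does not depend on the nodes being
siblings, assuming the goal is to keep the first element for each distinct
string value in the order the elements appear in $seq. It compares each item
only against the items before it in the sequence itself. The variable names
$e, $i and $p are illustrative, and this is one possible approach rather than
a definitive one:

```xquery
declare function local:unique-nodes-by-value($seq as element()*) as element()*
{
  (: $i is the position of $e within $seq, not within the document :)
  for $e at $i in $seq
  (: keep $e only if no earlier item in $seq has the same string value :)
  where not(string($e) = (for $p in subsequence($seq, 1, $i - 1)
                          return string($p)))
  return $e
};
```

This is still quadratic in the length of $seq. Where only the values are
needed rather than the elements, distinct-values() applied to the string
values is likely to be cheaper, though the order of its result is
implementation-dependent.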

Michael Kay
http://www.saxonica.com/



