[xquery-talk] Function and Query Evaluation with No XML Tags Error

Kevin Grover kevin at kevingrover.net
Tue Feb 26 18:38:10 PST 2008


On Tue, Feb 26, 2008 at 8:21 AM, Wei, Alice J. <ajwei at indiana.edu> wrote:

> Hi, David:
>
>  I think I am having a different issue from the one I had several
> weeks ago.
>   What I have modified is from:
>
> let $ad :=
> fn:collection("xmldb:exist://db/cbml")//ad/p[contains(upper-case(.),
> 'BOOK')]
> let $sorted_result:=
> for $doc in distinct-values($ad)
> order by $doc
> return $doc
> for $r at $count in $sorted_result
> let $nodes := $ad[. = $r][1]
> return
> <ad>
> {$nodes}
> </ad>
>

>  by trying to put it in a user-defined function, which is currently giving
> me quite a few problems of its own.
>
> declare boundary-space preserve;
> declare function local:distinct(
>     $seq as xs:anyAtomicType*)
>      as xs:anyAtomicType
>  {
>  let $doc := distinct-values($seq)
>  order by $doc
>  return
>  <ad>{$doc} </ad>
>  };
>
>  for $ad in
>
> distinct-values(collection("xmldb:exist://db/cbml")//ad/p[contains(upper-case(.$
>  return
> <ad>{local:distinct($ad)}</ad>
>
> As I mentioned before, I only intend for the output to appear in the
> same way as that of the above XQuery that is not in the function.
> Is this possible?
>
> Thanks for your help.
> ======================================================
> Alice Wei
> MIS 2008
> School of Library and Information Science
> Indiana University Bloomington
> ajwei at indiana.edu
> ________________________________________
> From: David Carlisle [davidc at nag.co.uk]
> Sent: Tuesday, February 26, 2008 10:33 AM
> To: Wei, Alice J.
> Cc: talk at x-query.com
> Subject: Re: [xquery-talk] Function and Query Evaluation with No XML Tags
> Error
>
> Alice,
>
> Alice it seems hard to help as you have been essentially posting the
> same code, with the same error for weeks,
>
> When you go
>
>  for $ad in distinct-values(...anything)
>
> then, on each iteration $ad will be a single string (one of the values)
>
> So on each call to local:distinct($ad) you are passing in a single string,
> so doing operations such as distinct-values() or for or order by are all
> essentially null operations. It's not an error to take all the
> unique values from a sequence of length one and then sort them, but you
> will always just get back the value you started with.
>
>
> See for example:
>
> http://x-query.com/pipermail/talk/2008-January/002436.html
>
> David
>
>
distinct-values returns a sequence of atomic values (strings here), NOT
elements (hence the 'values').  It throws away all of the XML element markup
and keeps only each node's string value (the text of all its descendants,
smashed together).

Example XML file:
<?xml version="1.0"?>
<data>
  <record>
    <name>Fred Flinstone</name>
    <street>Fred's Street</street>
  </record>
  <record>
    <name>Fred Flinstone</name>
    <street>Fred's Street</street>
  </record>
  <record>
    <name>Barney Rubble</name>
    <street>Barney's Street</street>
  </record>
  <record>
    <name>Wilma Flinstone</name>
    <street>Wilma's Street</street>
  </record>
  <record>
    <name>Betty Rubble</name>
    <street>Betty's Street</street>
  </record>
</data>

NOTICE: Fred's entry is duplicated!

Example Query:
for $r in distinct-values(/data/record)
return $r


Results:
<?xml version="1.0" encoding="UTF-8"?>
    Fred Flinstone
    Fred's Street

    Barney Rubble
    Barney's Street

    Wilma Flinstone
    Wilma's Street

    Betty Rubble
    Betty's Street

NOTICE: No XML markup.  It was all stripped.  It's like calling string() on
each record: the <record>, <name> and <street> tags have been removed and the
data flattened into a string.
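
A minimal, self-contained sketch of the same effect, using a small inline
document instead of the file above (the $data variable and its contents are
made up for illustration):

```xquery
(: Inline sample data; boundary whitespace between tags is stripped by default. :)
let $data :=
  <data>
    <record><name>Fred</name><street>Fred's Street</street></record>
    <record><name>Fred</name><street>Fred's Street</street></record>
    <record><name>Barney</name><street>Barney's Street</street></record>
  </data>
return (
  count($data/record),                                         (: 3 element nodes :)
  count(distinct-values($data/record)),                        (: 2 distinct string values :)
  distinct-values($data/record) instance of xs:anyAtomicType*  (: true: atomic values, not elements :)
)
```

The records are atomized before they are compared, so what comes back is
atomic values such as "FredFred's Street", with no element structure left.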

It's like this:

for $r in /data/record
return string($r)


Results:
<?xml version="1.0" encoding="UTF-8"?>
    Fred Flinstone
    Fred's Street

    Fred Flinstone
    Fred's Street

    Barney Rubble
    Barney's Street

    Wilma Flinstone
    Wilma's Street

    Betty Rubble
    Betty's Street

Again: no XML markup.  They're strings (because I explicitly converted them
to strings).  However, this time, I DID get the duplicated entry for Fred.

Will you really have different nodes (with subnodes) that contain completely
duplicated data?  If so, I have no suggestions.  If not and you're really
getting multiple instances of the SAME node, then compare node identity (the
'is' operator in XQuery) to test for uniqueness.
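
To illustrate the difference between "the same node" and "different nodes
with the same content", here is a small sketch.  It assumes an inline
document; $first and $second are illustrative names only.  Node identity is
checked with the 'is' operator, content equality with deep-equal(), and the
union operator (|) drops repeated occurrences of the same node:

```xquery
let $data := <data><record>A</record><record>A</record></data>
let $first := $data/record[1]
let $second := $data/record[2]
return (
  $first is $first,                    (: true: same node :)
  $first is $second,                   (: false: two nodes, equal content :)
  deep-equal($first, $second),         (: true: content matches :)
  count(($first, $first, $second)),    (: 3: a plain sequence keeps repeats :)
  count(($first, $first) | $second)    (: 2: union removes duplicate nodes by identity :)
)
```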

I figured I could use this:
(: ** Generates runtime error ** :)
for $r in /data/record[not(string(.)=string(preceding::*))]
return string($r)

But it generates a complaint (at runtime):
Description: A sequence of more than one item is not allowed as the first
argument of string() (<record/>, <name/>, ...)
** SEE More at the bottom: I figured it out.
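
The complaint arises because string() accepts at most one item, while
preceding::* can select many.  By contrast, a general comparison with '='
accepts a sequence on either side and is true if any pair of items matches.
A minimal sketch with made-up values ($names is only for illustration):

```xquery
let $names := ("Fred", "Barney", "Wilma")
return (
  "Fred" = $names,        (: true: at least one item matches :)
  "Betty" = $names,       (: false: no item matches :)
  not("Betty" = $names)   (: true: usable as a 'not seen before' test :)
)
```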

I tried a few other things to get distinct nodes based on the content, but
none of the others would even parse : I'm still mostly in the dark about the
magic of XQuery and XPath.

Try removing the distinct-values (and iterate over the elements) and you
may get something closer to what you are expecting.

** (corrected from above)
Here's a unique-by-value query:
(: keep a record only if no preceding sibling has the same string value :)
for $r in /data/record[not(string(.) = preceding-sibling::*/string(.))]
return string($r)

Results:
<?xml version="1.0" encoding="UTF-8"?>
    Fred Flinstone
    Fred's Street

    Barney Rubble
    Barney's Street

    Wilma Flinstone
    Wilma's Street

    Betty Rubble
    Betty's Street

Again, it's unique (because I made it unique in the XPath), but it's
iterating over the nodes.  The result is a string because I explicitly convert
each node to a string.  I could have also done:

for $r in /data/record[not(string(.) = preceding-sibling::*/string(.))]
return normalize-space($r)


Results:
<?xml version="1.0" encoding="UTF-8"?>Fred Flinstone Fred's Street Barney
Rubble Barney's Street Wilma Flinstone Wilma's Street Betty Rubble Betty's
Street

NOTICE: it's all a string (because normalize-space converts its argument to a
string), but 'extra' whitespace is stripped.

Or:
for $r in /data/record[not(string(.) = preceding-sibling::*/string(.))]
return <newelement>{normalize-space($r)}</newelement>

Results:
<?xml version="1.0" encoding="UTF-8"?>
<newelement>Fred Flinstone Fred's Street</newelement>
<newelement>Barney Rubble Barney's Street</newelement>
<newelement>Wilma Flinstone Wilma's Street</newelement>
<newelement>Betty Rubble Betty's Street</newelement>

NOTICE: String results wrapped in a new tag.

Or:
for $r in /data/record[not(string(.) = preceding-sibling::*/string(.))]
return <newelement>{$r}</newelement>

Results:
<?xml version="1.0" encoding="UTF-8"?>
<newelement>
   <record>
      <name>Fred Flinstone</name>
      <street>Fred's Street</street>
  </record>
</newelement>
<newelement>
   <record>
      <name>Barney Rubble</name>
      <street>Barney's Street</street>
  </record>
</newelement>
<newelement>
   <record>
      <name>Wilma Flinstone</name>
      <street>Wilma's Street</street>
  </record>
</newelement>
<newelement>
   <record>
      <name>Betty Rubble</name>
      <street>Betty's Street</street>
  </record>
</newelement>

NOTICE: Original XML elements (and sub elements) wrapped in a new tag.

Or, if you plan on using that 'logic' over and over, you can create a
function for it:

XQuery: This is the same as (functionally) the previous example, it just
defines (and uses) a local function

declare function local:unique-nodes-by-value($seq as element()*)
  as element()*
{
  (: keep an element only if no preceding sibling has the same string value :)
  for $r in $seq[not(string(.) = preceding-sibling::*/string(.))]
  return $r
};

for $r in local:unique-nodes-by-value(/data/record)
return <newelement>{$r}</newelement>

Using XPaths, you can pick and choose what element/sub element (values,
content, or sub-content) you get.
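
Another possible approach, sketched here under the assumption of a small
inline document ($data and $v are illustrative names), is to let
distinct-values find the unique string values and then go back to the
original nodes, keeping the first node for each value.  This keeps the
markup and does not require the duplicates to be siblings:

```xquery
let $data :=
  <data>
    <record><name>Fred</name></record>
    <record><name>Fred</name></record>
    <record><name>Barney</name></record>
  </data>
(: one iteration per distinct string value :)
for $v in distinct-values($data/record)
(: the first original <record> element whose string value equals $v :)
return ($data/record[. = $v])[1]
```

This returns one <record> element per distinct value.  The order in which
distinct-values returns its values is implementation-dependent, so add an
'order by' clause if the order matters.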

I hope this helps some.

Keep in mind, I'm relatively new at this so there are probably better ways.

- Kevin