[xquery-talk] Screen-scraping with XQuery

Erik Bruchez ebruchez at orbeon.com
Fri Apr 1 15:58:09 PST 2005


Michael Kay wrote:

 >><a href="{concat("a", &lt;B>thing&lt;/B>)}"/>
 >>
 >>Does this make a difference for XQuery?
 >
 > Yes, it does. That's not a legal query any more. XQuery only allows
 > &lt; in character content or in string literals, not in places where
 > an operator or element constructor might be expected.
 >
 > I guess it's no big deal for XPL if you only support the subset of
 > XQuery that's well-formed XML. It's easy enough for users to avoid
 > such constructs (and probably good practice).

I agree and so far we haven't really had a problem.
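
To make the rule above concrete, here is a minimal sketch of where
&lt; is accepted, written as a standalone query (not embedded in XML).
The sample strings are only illustrations and are not taken from XPL:

```xquery
xquery version "1.0";
(
    "a &lt; b",          (: string literal: yields the string  a < b  :)
    <p>a &lt; b</p>      (: character content of a constructed element :)
)
(: Not legal: concat("a", &lt;B>thing&lt;/B>)
   Here &lt; appears where an element constructor would be expected,
   so the query is rejected as a syntax error. :)
```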

 > More of a practical issue are the operators such as "<", "<=" and
 > "<<" which in XQuery syntax must be unescaped. It's harder to avoid
 > using these, although you can always invert the operands and use
 > ">", ">=", ">>".

In most cases (not all?) you can use "lt" and "le" for the first two.
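
A minimal sketch of both workarounds, assuming $a and $b are bound to
single atomic values (the variable names and values are hypothetical):

```xquery
xquery version "1.0";
let $a := 1, $b := 2
return (
    $b > $a,     (: same result as $a < $b, operands inverted :)
    $b >= $a,    (: same result as $a <= $b, operands inverted :)
    $a lt $b,    (: value comparison, no "<" character needed :)
    $a le $b
)
```

The "not all" case comes from the difference between general and value
comparisons: (1, 2) < 2 is true because one item in the sequence
satisfies it, whereas (1, 2) lt 2 raises a type error because "lt"
requires each operand to be a single value. For node order, "<<" has
no keyword form, so $x << $y would have to be written as $y >> $x.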

 > (I'm actually trying to write a WG proposal on embedding XQuery in
 > XML at the moment. XPL is a good use case for it. I'm not happy with
 > either XQueryX or the "trivial embedding" where all "<" characters
 > are escaped. I would like to see something defined along the lines
 > that XPL uses, but it obviously needs to be specified more precisely
 > for a spec than one often does in a product.)

I agree 100%. We went naturally for this type of XML embedding
(without being aware of the issues you raise above) because it
appeared to make the most sense, e.g. a well-formed XML document can
contain something like:

<xdb:query collection="/db/orbeon/blog-example/blogs" create-collection="true">
     xquery version "1.0";
     <categories>
         {
         for $i in (/blog[username = 'ebruchez' and blog-id = '123'])[1]/categories/category
         return
             <category>
                 <name>{xs:string($i/name)}</name>
                 <id>{count($i/preceding-sibling::category) + 1}</id>
             </category>
         }
     </categories>
</xdb:query>

It would definitely be good if that kind of solution were
standardized. I can't imagine we will be the only ones wanting
something like this.

XQueryX is overkill. I can't imagine anybody writing anything
by hand with that syntax.

-Erik