[xquery-talk] Screen-scraping with XQuery
Erik Bruchez
ebruchez at orbeon.com
Fri Apr 1 15:58:09 PST 2005
Michael Kay wrote:
>><a href="{concat("a", <B>thing</B>)}"/>
>>
>>Does this make a difference for XQuery?
>
> Yes, it does. That's not a legal query any more. XQuery only allows
> < in character content or in string literals, not in places where
> an operator or element constructor might be expected.
>
> I guess it's no big deal for XPL if you only support the subset of
> XQuery that's well-formed XML. It's easy enough for users to avoid
> such constructs (and probably good practice).
I agree and so far we haven't really had a problem.
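As an illustration, here is one way the quoted href example could be kept inside the well-formed subset. This is only a sketch, assuming the element really has to be built inside the attribute value: the direct constructor is swapped for a computed one and the string literals use single quotes, so neither "<" nor a nested double quote appears in the attribute.

```xquery
(: Computed element constructor instead of <B>thing</B>, so the
   attribute value contains no "<" and remains well-formed XML. :)
<a href="{concat('a', element B { 'thing' })}"/>
```

concat() atomizes the constructed element, so the attribute value comes out as "athing".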
> More of a practical issue are the operators such as "<", "<=" and
> "<<" which in XQuery syntax must be unescaped. It's harder to avoid
> using these, although you can always invert the operands and use
> ">", ">=", ">>".
In most cases (not all?) you can use "lt" and "le" for the first two.
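A small sketch of where the keyword forms and the symbol forms diverge, using made-up literal values: "<" is a general comparison and accepts sequences on either side, while "lt" is a value comparison and expects single items.

```xquery
(: General comparison: true if any item on the left is less than 3. :)
(1, 5) < 3,
(: Same test with the operands inverted, so no "<" is needed. :)
3 > (1, 5),
(: Value comparison: fine when both operands are single items. :)
1 lt 3
(: By contrast, (1, 5) lt 3 raises a type error, because "lt"
   does not accept a sequence of more than one item. :)
```

All three expressions evaluate to true. Only the first would need escaping or rewriting when the query is embedded in an XML document.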
> (I'm actually trying to write a WG proposal on embedding XQuery in
> XML at the moment. XPL is a good use case for it. I'm not happy with
> either XQueryX or the "trivial embedding" where all "<" characters
> are escaped. I would like to see something defined along the lines
> that XPL uses, but it obviously needs to be specified more precisely
> for a spec than one often does in a product.)
I agree 100%. We went naturally for this type of XML embedding
(without being aware of the issues you raise above) because it
appeared to make the most sense, e.g. a well-formed XML document can
contain something like:
<xdb:query collection="/db/orbeon/blog-example/blogs"
           create-collection="true">
    xquery version "1.0";
    <categories>
    {
        for $i in (/blog[username = 'ebruchez' and blog-id = '123'])[1]/categories/category
        return
            <category>
                <name>{xs:string($i/name)}</name>
                <id>{count($i/preceding-sibling::category) + 1}</id>
            </category>
    }
    </categories>
</xdb:query>
It would definitely be good if that kind of solution were
standardized. I can't imagine we will be the only ones wanting
something like this.
XQueryX is overkill. I can't imagine anybody writing anything
by hand with that syntax.
-Erik