From michaelmalak at yahoo.com Tue Jul 3 14:50:40 2007
From: michaelmalak at yahoo.com (Michael Malak)
Date: Tue Jul 3 13:51:38 2007
Subject: [xquery-talk] Newbie Q: Saxon-B, xmlns
Message-ID: <697890.86187.qm@web57710.mail.re3.yahoo.com>

I'm attempting to use Saxon-B with C#/.NET.

string xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
    "<head>" +
    "<title>Hello World</title></head></html>";
string query = "data(/html/head/title)";
Processor processor = new Processor();
XmlDocument input = new XmlDocument();
input.LoadXml(xhtml);
XQueryCompiler compiler = processor.NewXQueryCompiler();
XQueryExecutable exp = compiler.Compile(query);
XQueryEvaluator eval = exp.Load();
eval.ContextItem =
    processor.NewDocumentBuilder().Build(new XmlNodeReader(input));
Serializer qout = new Serializer();
StringWriter sw = new StringWriter();
qout.SetOutputProperty(Serializer.METHOD, "xml");
qout.SetOutputWriter(sw);
eval.Run(qout);
Console.WriteLine(sw);

When I execute this as-is, I get back an empty XML document. If I
change the xmlns attribute to be xmlns:html, it works. I presume
that's because that removes the HTML elements out of the default
namespace and Saxon-B is not schema aware so it happily processes the
HTML tags as if they're not HTML.

How can I change my query so that it works with the source XHTML above
as-is?

From pc.subscriptions at gmail.com Wed Jul 4 08:11:56 2007
From: pc.subscriptions at gmail.com (Peter)
Date: Tue Jul 3 22:12:59 2007
Subject: [xquery-talk] Newbie Q: Saxon-B, xmlns
In-Reply-To: <697890.86187.qm@web57710.mail.re3.yahoo.com>
References: <697890.86187.qm@web57710.mail.re3.yahoo.com>
Message-ID: <047201c7bdf9$dbd82370$0a0a0a0a@eclipseinternational.com>

> If I
> change the xmlns attribute to be xmlns:html, it works. I presume
> that's because that removes the HTML elements out of the default
> namespace and Saxon-B is not schema aware so it happily processes the
> HTML tags as if they're not HTML.
The first part of your assumption is more or less correct, but the
second is not: changing the namespace declaration changes your source
document (it moves the elements out of the namespace), and therefore
the XQuery (XPath) expression, which looks for elements in no
namespace, finds them.

It has, however, nothing to do with Saxon-B not being schema aware. A
schema-aware processor would allow you to use "import schema"
(http://www.w3.org/TR/xquery/#id-schema-import-feature), but all XQuery
processors should be able to deal with namespaces.

Changing your query to make it work should be easy - just add a
namespace declaration to it.

string query =
    "declare default element namespace \"http://www.w3.org/1999/xhtml\"; " +
    "data(/html/head/title)";

Hth,
Peter

From rpbourret at rpbourret.com Wed Jul 11 00:38:54 2007
From: rpbourret at rpbourret.com (Ronald Bourret)
Date: Tue Jul 10 23:36:11 2007
Subject: [xquery-talk] Deep-equal between sequences
Message-ID: <46947AFE.80508@rpbourret.com>

Hello,

I have two sequences that I would like to compare. The sequences are
composed of elements and the comparison is true if any member of the
first sequence is deep-equal to any member of the second sequence.
(Essentially, I want to do an = operation, but using deep-equal rather
than atomic equality.)

Can anybody think of a way to do this comparison with the = operator?
This would give the query engine the chance to optimize the comparison,
or at least to end the comparison early if a match was found. The
problem is somewhat simplified by the fact that the definition of the
elements is .

Note that the following is not sufficient:

let $firstSeq := ...
let $secondSeq := ...
return
   if (($firstSeq/A1 = $secondSeq/A1) and
       ($firstSeq/A2 = $secondSeq/A2))
   then fn:true()
   else fn:false()

The problem with this is that it returns true if any pairs of A1 and A2
match, while I require that an A1 and A2 with the same parent in the
first sequence match an A1 and A2 with the same parent in the second
sequence.
Barring use of the = operator, is there a simple solution to this
problem that would allow the engine to stop processing on the first
match, rather than performing n x m deep-equal comparisons and searching
the resulting sequence for an instance of true?

Thanks in advance,

-- Ron

From andrew.j.welch at gmail.com Wed Jul 11 10:33:36 2007
From: andrew.j.welch at gmail.com (Andrew Welch)
Date: Wed Jul 11 01:31:13 2007
Subject: [xquery-talk] Deep-equal between sequences
In-Reply-To: <46947AFE.80508@rpbourret.com>
References: <46947AFE.80508@rpbourret.com>
Message-ID: <74a894af0707110133q46b9637fse9dc5c8fcf6b2cc2@mail.gmail.com>

On 7/11/07, Ronald Bourret wrote:
> Hello,
>
> I have two sequences that I would like to compare. The sequences are
> composed of elements and the comparison is true if any member of the
> first sequence is deep-equal to any member of the second sequence.
> (Essentially, I want to do an = operation, but using deep-equal rather
> than atomic equality.)
>
> Can anybody think of a way to do this comparison with the = operator?
> This would give the query engine the chance to optimize the comparison,
> or at least to end the comparison early if a match was found. The
> problem is somewhat simplified by the fact that the definition of the
> elements is .
>
> Note that the following is not sufficient:
>
> let $firstSeq := ...
> let $secondSeq := ...
> return
>    if (($firstSeq/A1 = $secondSeq/A1) and
>        ($firstSeq/A2 = $secondSeq/A2))
>    then fn:true()
>    else fn:false()
>
> The problem with this is that it returns true if any pairs of A1 and A2
> match, while I require that an A1 and A2 with the same parent in the
> first sequence match an A1 and A2 with the same parent in the second
> sequence.
>
> Barring use of the = operator, is there a simple solution to this
> problem that would allow the engine to stop processing on the first
> match, rather than performing n x m deep-equal comparisons and searching
> the resulting sequence for an instance of true?
Do you just need:

return $firstSeq[for $x in $secondSeq return deep-equal(., $x)]

...add a [1] to get the first.

cheers
andrew
--
http://andrewjwelch.com

From davidc at nag.co.uk Wed Jul 11 12:08:11 2007
From: davidc at nag.co.uk (David Carlisle)
Date: Wed Jul 11 03:05:52 2007
Subject: [xquery-talk] Deep-equal between sequences
In-Reply-To: <46947AFE.80508@rpbourret.com> (message from Ronald Bourret on Tue, 10 Jul 2007 23:38:54 -0700)
References: <46947AFE.80508@rpbourret.com>
Message-ID: <200707111008.l6BA8BKN028658@edinburgh.nag.co.uk>

some $i in $firstSeq satisfies $secondSeq[deep-equal(.,$i)]

?

David

________________________________________________________________________
The Numerical Algorithms Group Ltd is a company registered in England
and Wales with company number 1249803. The registered office is:
Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.

This e-mail has been scanned for all viruses by Star. The service is
powered by MessageLabs.
________________________________________________________________________

From mike at saxonica.com Wed Jul 11 12:41:26 2007
From: mike at saxonica.com (Michael Kay)
Date: Wed Jul 11 03:39:08 2007
Subject: [xquery-talk] Deep-equal between sequences
In-Reply-To: <200707111008.l6BA8BKN028658@edinburgh.nag.co.uk>
References: <46947AFE.80508@rpbourret.com> <200707111008.l6BA8BKN028658@edinburgh.nag.co.uk>
Message-ID: <00c801c7c3a8$08ea66d0$6601a8c0@turtle>

> some $i in $firstSeq satisfies $secondSeq[deep-equal(.,$i)]

Certainly this meets the requirement, and certainly a half-decent
implementation will exit when it finds the first "true" value. But in
the worst case, without a very clever bit of optimization, it will
result in N*M deep-equal comparisons.

I saw another requirement like this recently - essentially grouping
where grouping keys were compared using deep equality.
It suggests to me a need for a function

deep-comparison-key(node(), collation) -> xs:atomicValue

where the result of the function is undefined except to the extent that
deep-comparison-key(N, C) eq deep-comparison-key(M, C) if and only if
deep-equal(N, M, C). (One possible implementation would be to serialize
into canonical XML and then replace all strings with a collation key).
The downside is that the semantics of deep-equal are themselves so
fragile - the function so often doesn't perform exactly the comparison
you would like.

Another solution you could try to implement at application level would
be to define a hash function such that deep-equal(A, B) => hash(A) eq
hash(B), and then confine the deep-equal() comparisons to nodes where
the hash values are equal. A good start might be hash($N) == string($N).
Not very effective where most of the information is held in attributes,
but OK in most other cases.

Michael Kay

From mike at saxonica.com Wed Jul 11 13:49:51 2007
From: mike at saxonica.com (Michael Kay)
Date: Wed Jul 11 04:50:05 2007
Subject: [xquery-talk] Deep-equal between sequences
In-Reply-To: <00c801c7c3a8$08ea66d0$6601a8c0@turtle>
References: <46947AFE.80508@rpbourret.com> <200707111008.l6BA8BKN028658@edinburgh.nag.co.uk> <00c801c7c3a8$08ea66d0$6601a8c0@turtle>
Message-ID: <00e701c7c3b1$97ae9fe0$6601a8c0@turtle>

> > some $i in $firstSeq satisfies $secondSeq[deep-equal(.,$i)]
>
> Certainly this meets the requirement, and certainly a
> half-decent implementation will exit when it finds the first
> "true" value. But in the worst case, without a very clever
> bit of optimization, it will result in N*M deep-equal comparisons.

Come to think of it, it might not be as bad as I thought. The worst
case, of doing n^2 deep-equal comparisons, happens when there are no
deep-equal pairs, that is, when all the comparisons return false.
But when two nodes are not deep-equal, the deep-equal function is likely
in most cases to discover this quite quickly: the worst-case performance
for deep-equal is when the function returns true. So there could be a
situation here that really performs badly, for example when all the
nodes are deep-equal in their first 998 children and differ in the
999th, but the cases likely to arise in practice are probably not too
bad. It's still O(n^2) though, and can easily be made better.

Michael Kay
http://www.saxonica.com/

From rpbourret at rpbourret.com Wed Jul 11 17:12:59 2007
From: rpbourret at rpbourret.com (Ronald Bourret)
Date: Wed Jul 11 16:09:54 2007
Subject: [xquery-talk] Deep-equal between sequences
In-Reply-To: <200707111008.l6BA8BKN028658@edinburgh.nag.co.uk>
References: <46947AFE.80508@rpbourret.com> <200707111008.l6BA8BKN028658@edinburgh.nag.co.uk>
Message-ID: <469563FB.8090904@rpbourret.com>

Thanks. I keep forgetting about some and every.

-- Ron

David Carlisle wrote:
> some $i in $firstSeq satisfies $secondSeq[deep-equal(.,$i)]

From rpbourret at rpbourret.com Wed Jul 11 17:14:13 2007
From: rpbourret at rpbourret.com (Ronald Bourret)
Date: Wed Jul 11 16:11:09 2007
Subject: [xquery-talk] Deep-equal between sequences
In-Reply-To: <00c801c7c3a8$08ea66d0$6601a8c0@turtle>
References: <46947AFE.80508@rpbourret.com> <200707111008.l6BA8BKN028658@edinburgh.nag.co.uk> <00c801c7c3a8$08ea66d0$6601a8c0@turtle>
Message-ID: <46956445.8090607@rpbourret.com>

Michael Kay wrote:
> The downside is that the semantics of deep-equal are themselves so fragile -
> the function so often doesn't perform exactly the comparison you would like.

Which is my case -- I don't actually need deep-equal, but something
close to it. It was just the easiest way to explain the problem.
> Another solution you could try to implement at application level would be to
> define a hash function such that deep-equal(A, B) => hash(A) eq hash(B), and
> then confine the deep-equal() comparisons to nodes where the hash values are
> equal. A good start might be hash($N) == string($N). Not very effective
> where most of the information is held in attributes, but OK in most other
> cases.

Another good idea.

-- Ron

From rpbourret at rpbourret.com Fri Jul 20 22:39:41 2007
From: rpbourret at rpbourret.com (Ronald Bourret)
Date: Fri Jul 20 21:38:37 2007
Subject: [xquery-talk] Modules and duplicate declarations
Message-ID: <46A18E0D.8050905@rpbourret.com>

Hello,

I am having trouble understanding how to import the same module in
multiple places.

The test query requires modules A and B, and module A requires module B.
Therefore, test imports A and B, and A imports B:

alib.xqy
--------
module namespace a = "a";
import module namespace b = "b" at "...";
declare variable $a:a1 := fn:concat("a", $b:b1);

blib.xqy
--------
module namespace b = "b";
declare variable $b:b1 := "b";

test.xqy
--------
import module namespace a = "a" at "...";
import module namespace b = "b" at "...";
($a:a1, $b:b1)

Running test returns XQST0049, stating that $b:b1 was declared twice. If
test doesn't import blib.xqy, it returns XPST0008, stating that prefix b
has not been declared.

I don't understand the first error. The XQuery spec states that module
imports are not transitive. That is, if A imports B and B imports C, C
is not visible to A. Thus, $b:b1 should be visible to test only by
directly importing blib.xqy and not by importing alib.xqy.

Is this a bug in the processor (Saxon 8.6.1) or am I misreading the
spec?
Thanks,

-- Ron

From mike at saxonica.com Sat Jul 21 12:44:47 2007
From: mike at saxonica.com (Michael Kay)
Date: Sat Jul 21 03:44:56 2007
Subject: [xquery-talk] Modules and duplicate declarations
In-Reply-To: <46A18E0D.8050905@rpbourret.com>
References: <46A18E0D.8050905@rpbourret.com>
Message-ID: <008a01c7cb84$288a6ab0$6601a8c0@turtle>

It seems to work fine on Saxon 8.9.0.4.

8.6.1 is pretty old (Nov 2005), and both the product and the spec have
moved on since it was released.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: talk-bounces@x-query.com
> [mailto:talk-bounces@x-query.com] On Behalf Of Ronald Bourret
> Sent: 21 July 2007 05:40
> To: talk@xquery.com
> Subject: [xquery-talk] Modules and duplicate declarations
>
> Hello,
>
> I am having trouble understanding how to import the same
> module in multiple places.
>
> The test query requires modules A and B, and module A
> requires module B.
> Therefore, test imports A and B, and A imports B:
>
> alib.xqy
> --------
> module namespace a = "a";
> import module namespace b = "b" at "...";
> declare variable $a:a1 := fn:concat("a", $b:b1);
>
> blib.xqy
> --------
> module namespace b = "b";
> declare variable $b:b1 := "b";
>
> test.xqy
> --------
> import module namespace a = "a" at "...";
> import module namespace b = "b" at "...";
> ($a:a1, $b:b1)
>
> Running test returns XQST0049, stating that $b:b1 was
> declared twice. If test doesn't import blib.xqy, it returns
> XPST0008, stating that prefix b has not been declared.
>
> I don't understand the first error. The XQuery spec states
> that module imports are not transitive. That is, if A imports
> B and B imports C, C is not visible to A. Thus, $b:b1 should
> be visible to test only by directly importing blib.xqy and
> not by importing alib.xqy.
>
> Is this a bug in the processor (Saxon 8.6.1) or am I
> misreading the spec?
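
For the "Deep-equal between sequences" thread above, the hash suggestion
(hash($N) as string($N), with deep-equal() confined to candidates whose
hash values agree) can be sketched roughly as follows. This is only an
illustration under stated assumptions: the function names local:hash and
local:any-deep-equal are made up here, the element name A and the sample
values are hypothetical (the thread only mentions A1 and A2 children),
and the nodes are assumed to be untyped, so that deep-equal nodes always
have equal string values.

```xquery
(: Hypothetical helper: a cheap key that is equal whenever two untyped
   nodes are deep-equal. Here it is just the string value of the node. :)
declare function local:hash($n as node()) as xs:string {
  string($n)
};

(: True if some item of $first is deep-equal to some item of $second.
   deep-equal() is only called for candidates whose keys already agree. :)
declare function local:any-deep-equal(
  $first as element()*, $second as element()*
) as xs:boolean {
  some $i in $first satisfies
    exists($second[local:hash(.) eq local:hash($i)][deep-equal(., $i)])
};

(: Hypothetical sample data: only the second item of each sequence
   has a deep-equal partner in the other sequence. :)
let $firstSeq  := (<A><A1>1</A1><A2>2</A2></A>, <A><A1>3</A1><A2>4</A2></A>)
let $secondSeq := (<A><A1>1</A1><A2>4</A2></A>, <A><A1>3</A1><A2>4</A2></A>)
return local:any-deep-equal($firstSeq, $secondSeq)
```

With this sample data the query should evaluate to true, because only
the pair with A1 = 3 and A2 = 4 under the same parent matches. Note that
the sketch still computes a key for every pair of items; it only avoids
the more expensive deep-equal() calls, so whether it helps in practice
depends on the processor and the data.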