From adam.retter at googlemail.com Sat Feb 1 06:48:26 2014
From: adam.retter at googlemail.com (Adam Retter)
Date: Sat, 1 Feb 2014 14:48:26 +0000
Subject: [xquery-talk] Sequences
In-Reply-To: <508024A8-6DCB-40F1-A1FF-4A34952133E1@macleanfogg.com>
References: <9aba523dfb7646079d11d55de4463c53@BY2PR08MB014.namprd08.prod.outlook.com> <508024A8-6DCB-40F1-A1FF-4A34952133E1@macleanfogg.com>
Message-ID:

On 1 February 2014 00:13, Misztur, Chris wrote:
> So can I ever get at $seq[3][2]() ?
>
> (1,2,(function(),function()),5,6)
>

For something like that, your best bet would be Arrays, which are coming in
XQuery 3.1 (most likely).

For the time being you could use maps of sequences, or there are
tricks you can do with sequences of functions and even maps of
functions, or if you really want fun, then just functions of functions
of functions of...

However, if you don't want to lose your mind and you can get away with
it, then I would just create an XML representation of an array, and a
module of functions to manage update, insert, delete etc.

--
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk

From gandhi.mukul at gmail.com Sat Feb 1 12:37:12 2014
From: gandhi.mukul at gmail.com (Mukul Gandhi)
Date: Sun, 2 Feb 2014 02:07:12 +0530
Subject: [xquery-talk] Sequences
In-Reply-To:
References:
Message-ID:

You're right. The answer to this question is 9. I've checked with two XPath
2.0 processors (and I'm assuming XQuery 1.0 behaves the same), and this
answer is correct.

On Sat, Feb 1, 2014 at 4:58 PM, David Lee wrote:

> You would be mistaken
> Try it
>
>
> Sent from my iPhone
>
> On Feb 1, 2014, at 1:41 AM, "Mukul Gandhi" wrote:
>
> I think this is a sequence of 7 items. As far as I can recall, the
> internal sequence (4,5,6) won't be expanded to get the size of the outer
> sequence and would contribute one item to this result.
>
> On Sat, Feb 1, 2014 at 5:25 AM, Misztur, Chris wrote:
>
>> Is this a sequence of 7 or 9 items?
>>
>> (
>> 1,
>> 2,
>> 3,
>> (4,5,6),
>> 7,
>> 8,
>> 9
>> )
>>
>

--
Regards,
Mukul Gandhi

From STAMMW at de.ibm.com Sat Feb 1 15:33:21 2014
From: STAMMW at de.ibm.com (Hermann Stamm-Wilbrandt)
Date: Sun, 2 Feb 2014 00:33:21 +0100
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
Message-ID:

Last New Year's Eve (Sylvester) there was a thread on Matrix Multiplication in XQuery:
http://markmail.org/message/7q7qnbbnjo7cljzv?q=list:com.x-query.talk+matrix+multiplication

The biggest problem identified was that XQuery does not allow for an efficient
representation of arrays with two or more dimensions.

JSON does provide 2+dimensional arrays for free.
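
A minimal sketch of why nesting alone does not give a 2-dimensional structure in plain XQuery, assuming only standard XQuery 1.0 semantics: sequences are flat, so rows written as inner parenthesised sequences lose their boundaries. The variable name $nested is illustrative and does not come from the thread.

```xquery
xquery version "1.0";

(: Four "rows" of two items each, written as nested sequences. :)
let $nested := ((1, 2), (3, 4), (5, 6), (7, 8))
return (
  count($nested),  (: 8, not 4: the inner parentheses do not create sub-sequences :)
  $nested[3]       (: 3, the third item of the flattened sequence, not the row (5, 6) :)
)
```

This is the same flattening that makes the sequence in the earlier "Sequences" question count as 9 items rather than 7.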
I did not measure performance yet, but this JSONiq script looks very similar
to what would be done in C:
http://try.zorba.io/queries/xquery/jFd3Q8f82HuZGzcYDzQpdN4SdfY=

declare variable $A := [
  [1,2],
  [3,4],
  [5,6],
  [7,8]
];
declare variable $B := [
  [1,2,3],
  [4,5,6]
];

[
  for $i in 1 to count(jn:members($A)) return
  [
    for $k in 1 to count(jn:members($B(1))) return
      fn:sum(
        for $j in 1 to count(jn:members($B)) return
          $A($i)($j) * $B($j)($k)
      )
  ]
]

And much simpler than in XSLT:
http://rosettacode.org/wiki/Matrix_multiplication#XSLT_1.0:

Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/HermannSW/
http://stamm-wilbrandt.de/GraphvizFiddle/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chair of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Boeblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294

From davidc at nag.co.uk Sat Feb 1 16:44:35 2014
From: davidc at nag.co.uk (David Carlisle)
Date: Sun, 02 Feb 2014 00:44:35 +0000
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References:
Message-ID: <52ED94F3.3020200@nag.co.uk>

On 01/02/2014 23:33, Hermann Stamm-Wilbrandt wrote:
>
> Last Sylvester there was a thread on Matrix Multiplication in
> XQuery:
> http://markmail.org/message/7q7qnbbnjo7cljzv?q=list:com.x-query.talk+matrix+multiplication
>
> The biggest problem identified was that XQuery does not allow for
> efficient representation of 2+dimensional arrays.
>

Well, it does, but as in other languages that only really have 1-D arrays
(or languages such as C or Fortran where 2-D arrays are a thin veneer
over 1-D arrays), you need to store a 2-D array as a 1-D array with an
additional integer giving the stride or leading dimension (in row or
column order). To keep the arrays self-contained I stored this as the
first item in each sequence in the example below.

> JSON does provide 2+dimensional arrays for free.
>
> I did not measure performance yet, but this JSONiq script looks very
> similar to what would be done in C:
> http://try.zorba.io/queries/xquery/jFd3Q8f82HuZGzcYDzQpdN4SdfY=
>
> declare variable $A := [ [1,2], [3,4], [5,6], [7,8] ]; declare
> variable $B := [ [1,2,3], [4,5,6] ];
>
> [ for $i in 1 to count(jn:members($A)) return [ for $k in 1 to
> count(jn:members($B(1))) return fn:sum( for $j in 1 to
> count(jn:members($B)) return $A($i)($j) * $B($j)($k) ) ] ]
>

In XQuery 1 you could do

let $a:=(2, (:2 columns :)
         1,2,
         3,4,
         5,6,
         7,8),

    $b:=(3, (:3 columns :)
         1,2,3,
         4,5,6)

return

(
  $b[1],
  for $i in 1 to xs:int((count($a) -1) div $a[1]),
      $j in 1 to xs:int($b[1])
  return
    sum(
      for $k in 1 to xs:int($a[1]) return
        ($a[($i -1)*$a[1]+$k+1] * $b[($k -1)*$b[1]+$j+1])
    )
)

which produces

3 (:3 columns :)
9 12 15
19 26 33
29 40 51
39 54 69

> And much simpler than in XSLT:
> http://rosettacode.org/wiki/Matrix_multiplication#XSLT_1.0:
>

As the above is in fact only XPath 2, you could use the identical
expression in XSLT 2.

David

From ihe.onwuka at gmail.com Sat Feb 1 19:50:44 2014
From: ihe.onwuka at gmail.com (Ihe Onwuka)
Date: Sun, 2 Feb 2014 03:50:44 +0000
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References:
Message-ID:

On Sat, Feb 1, 2014 at 11:33 PM, Hermann Stamm-Wilbrandt wrote:
>
> Last Sylvester there was a thread on Matrix Multiplication in XQuery:
> http://markmail.org/message/7q7qnbbnjo7cljzv?q=list:com.x-query.talk+matrix+multiplication
>
> The biggest problem identified was that XQuery does not allow for efficient
> representation of 2+dimensional arrays.
>
> JSON does provide 2+dimensional arrays for free.
>

I gave an algorithm for representing n-dimensional arrays with a
1-dimensional array earlier in the thread. See my post of 31 Dec 2013 here:

http://markmail.org/message/dkhw7ryirwyviqm3#query:+page:1+mid:ccipumgobpaljjao+state:results

From mike at saxonica.com Sun Feb 2 13:13:46 2014
From: mike at saxonica.com (Michael Kay)
Date: Sun, 2 Feb 2014 21:13:46 +0000
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References:
Message-ID: <15CE1E6E-95A8-4EEB-835F-631E08F5898A@saxonica.com>

On 2 Feb 2014, at 20:33, jean-marc Mercier wrote:

> N-dimensional representation of arrays are quite straightforward with XML too. Is there any incentive to expect better performances with a JSON matrix representation rather than an XML one ?
>

I think that if you had an XML schema for an XML representation of
N-dimensional arrays, and if the XPath processor recognized that schema and
used a custom tree representation for its instances, then arrays could be
represented using XML just as efficiently as using JSONiq arrays.

But if you use a general tree representation that allows any element names,
namespaces, base URIs, mixed content, attributes, and all the other
paraphernalia of XML, then it is likely to be significantly less efficient.
For example:

* XML is text-oriented, and using XML for numeric values invariably involves
  string-to-number conversion, which is expensive.

* Numeric subscripts when addressing XML (as in para[3]) are likely to have
  O(n) performance rather than constant performance, because the tree
  structure is likely to be optimized for scanning all the children rather
  than locating an individual child by its index.
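
The two points above can be seen in a small sketch. The element names matrix, row and cell and the function local:at are hypothetical, chosen only for illustration; whether the positional steps really cost O(n) depends on the processor's tree implementation.

```xquery
xquery version "1.0";

(: Hypothetical XML encoding of a 2x3 matrix; the element names are illustrative only. :)
declare variable $matrix :=
  <matrix>
    <row><cell>1</cell><cell>2</cell><cell>3</cell></row>
    <row><cell>4</cell><cell>5</cell><cell>6</cell></row>
  </matrix>;

(: Read entry (i, j): two positional child steps, then a string-to-number cast.
   Returns the empty sequence if the indexes are out of range. :)
declare function local:at(
  $m as element(matrix),
  $i as xs:integer,
  $j as xs:integer
) as xs:double? {
  xs:double($m/row[$i]/cell[$j])
};

local:at($matrix, 2, 3)  (: 6 :)
```

Each call pays for both costs named in the list: the cell content is text until xs:double converts it, and row[$i] and cell[$j] are positional selections among siblings.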
Michael Kay
Saxonica

From STAMMW at de.ibm.com Mon Feb 3 02:56:08 2014
From: STAMMW at de.ibm.com (Hermann Stamm-Wilbrandt)
Date: Mon, 3 Feb 2014 11:56:08 +0100
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To: <52ED94F3.3020200@nag.co.uk>
References: <52ED94F3.3020200@nag.co.uk>
Message-ID:

Thanks for your XQuery 1-dimensional sample.

There are 4 implementations of XQuery with JSONiq named on jsoniq.org:
28.io, Zorba, IBM WebSphere DataPower Integration Appliance and Pascal
XQuery engine.

Measuring runtime turned out not to be that easy.
Since http://try.zorba.io allows sharing and running code, I used that.
The method I found was to place Zorba's datetime:current-time() in the result
sequence as the first and last items and to subtract the two.
The matrix multiplication also needs to be executed many times to give
measurable times (I used 10,000,000 iterations).

These are the JSONiq (1) and your XQuery (2) implementations:
http://try.zorba.io/queries/xquery/vq+kL9tWK+jmntDZz0oxDcyrypA=
http://try.zorba.io/queries/xquery/NIlfOIBmkdvt8+2zNmvM8Hf1+bo=

The reported times are quite different although both ran on the same processor:
PT1.713634S (JSONiq) versus PT9.77805S (XQuery)

Yes, that is only the result for one processor. But I would expect even (much)
bigger differences once the matrix dimensions grow beyond the toy sizes
used in these examples.
(1)
import module namespace datetime = "http://www.zorba-xquery.com/modules/datetime";

declare variable $A := [
  [1,2],
  [3,4],
  [5,6],
  [7,8]
];
declare variable $B := [
  [1,2,3],
  [4,5,6]
];
declare variable $N := 10000000;

let $R := ( datetime:current-time(),
  for $h in 1 to $N return
  [
    for $i in 1 to count(jn:members($A)) return
    [
      for $k in 1 to count(jn:members($B(1))) return
        fn:sum(
          for $j in 1 to count(jn:members($B)) return
            $A($i)($j) * $B($j)($k)
        )
    ]
  ]
, datetime:current-time() )

return $R[count($R)] - $R[1]

(2)
import module namespace datetime = "http://www.zorba-xquery.com/modules/datetime";

declare variable $a:=(2, (:2 columns :) 1,2, 3,4, 5,6, 7,8);
declare variable $b:=(3, (:3 columns :) 1,2,3, 4,5,6);
declare variable $N := 10000000;

let $R := ( datetime:current-time(),
  for $h in 1 to $N return
  ( $b[1], for $i in 1 to xs:int((count($a) -1) div $a[1]), $j in 1 to xs:int($b[1]) return
    sum( for $k in 1 to xs:int($a[1]) return ($a[($i -1)*$a[1]+$k+1] * $b[($k -1)*$b[1]+$j+1]) ) )
, datetime:current-time() )

return $R[count($R)] - $R[1]

From davidc at nag.co.uk Mon Feb 3 03:13:52 2014
From: davidc at nag.co.uk (David Carlisle)
Date: Mon, 03 Feb 2014 11:13:52 +0000
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References: <52ED94F3.3020200@nag.co.uk>
Message-ID: <52EF79F0.8000206@nag.co.uk>

On 03/02/2014 10:56, Hermann Stamm-Wilbrandt wrote:
> PT1.713634S (JSONiq) versus PT9.77805S (XQuery)

Ooh, interesting. I wonder where the bottleneck in the XQuery is.
Probably, as Michael commented at some point earlier in the thread, it is the
time to access the ith element of a sequence, $a[$i].

But the language doesn't _need_ to change; if more people did this,
the XQuery compilers would perhaps look out for sequences that are
exclusively accessed via numeric filters and implement them in a way
that gives constant-time access. Having a separate array type does give
them a big hint though :-)

David

________________________________________________________________________
The Numerical Algorithms Group Ltd is a company registered in England
and Wales with company number 1249803. The registered office is:
Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.

This e-mail has been scanned for all viruses by Star. The service is
powered by MessageLabs.
________________________________________________________________________

From g at 28.io Mon Feb 3 04:55:58 2014
From: g at 28.io (Ghislain Fourny)
Date: Mon, 3 Feb 2014 13:55:58 +0100
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References: <52ED94F3.3020200@nag.co.uk> <52EF79F0.8000206@nag.co.uk>
Message-ID: <30AEA396-C2D1-4E50-9B64-52AAD5414DD6@28.io>

Hi,

With a naive parallelism approach (on $i and $j), I think it's possible to
bring it down to O(N) with a constant cost (in dollars), assuming "full cloud
elasticity" (i.e., the number of instances that can be spun up is not the
bottleneck).

The 28.io platform should support this (disclaimer: it's my employer).

Kind regards,
Ghislain

On 03 Feb 2014, at 12:30, jean-marc Mercier wrote:

> Hi all,
>
> I've tried the following JSON query with Zorba, mimicking an NxN matrix multiplication with N=200. Time is 10 sec on http://try.zorba.io/, behaving with cubic N^3 complexity.
> Do you really want to know what are the performances of standard linear algebra library for such matrix multiplications ?
>
> import module namespace datetime = "http://www.zorba-xquery.com/modules/datetime";
>
> declare variable $size := 200;
>
> declare variable $A := [ for $i in 1 to $size return
>     [
>         for $j in 1 to $size return $i*$size+$j
>     ]
> ];
>
> let $R := ( datetime:current-time(),
>   [
>     for $i in 1 to count(jn:members($A)) return
>     [
>       for $k in 1 to count(jn:members($A)) return
>         fn:sum(
>           for $j in 1 to count(jn:members($A)) return
>             $A($i)($j) * $A($j)($k)
>         )
>     ]
>   ]
> , datetime:current-time() )
>
> return $R[count($R)] - $R[1]

From christian.gruen at gmail.com Mon Feb 3 06:35:55 2014
From: christian.gruen at gmail.com (Christian Grün)
Date: Mon, 3 Feb 2014 15:35:55 +0100
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References: <52ED94F3.3020200@nag.co.uk>
Message-ID:

Hi Hermann,

thanks for your interesting comparison. I'd like to point out that the
loop around the calculation (for $h in 1 to $N) might be optimized
away by a query compiler. If this happens, the only thing that would
be measured in the query would be the concatenation of 10 million
results.

But I completely agree that it's difficult to formulate an example
that is easy to compare as long as we have no dynamic input.

Christian

From STAMMW at de.ibm.com Mon Feb 3 07:24:41 2014
From: STAMMW at de.ibm.com (Hermann Stamm-Wilbrandt)
Date: Mon, 3 Feb 2014 16:24:41 +0100
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References: <52ED94F3.3020200@nag.co.uk> <52EF79F0.8000206@nag.co.uk>
Message-ID:

Hi,

like others, I see neither JSONiq nor XQuery being able to compete with,
e.g., a native C implementation of matrix multiplication.
Your 200x200 example is interesting, as it shows that JSONiq (JSON,
PT10.838593S) is now slower than XQuery (XML, PT9.03348S):
http://try.zorba.io/queries/xquery/elMs5bmRLr%2FZ0IHY5mr6YNWqgjI%3D
http://try.zorba.io/queries/xquery/FhH%2BPs3wjNB2xwTw%2BchwwMes2dw%3D

From STAMMW at de.ibm.com Mon Feb 3 09:26:05 2014
From: STAMMW at de.ibm.com (Hermann Stamm-Wilbrandt)
Date: Mon, 3 Feb 2014 18:26:05 +0100
Subject: [xquery-talk] Matrix Multiplication (JSONiq)
In-Reply-To:
References: <52ED94F3.3020200@nag.co.uk>
Message-ID:

Hi Christian,

> I'd like to point out that the loop around the calculation (for $h in 1 to $N) might be optimized away by a query compiler.
>

Agreed, but I did test with different values of $N and the reported times
increased.

The easiest way to keep the optimizer from kicking in would be to compute
the matrix A^{2^N} by N repeated matrix squarings; in that case
nothing can be optimized away.
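
A minimal sketch of the repeated-squaring idea, written against the stride-prefixed sequence layout from David Carlisle's example earlier in the thread rather than JSONiq arrays. The function names local:square and local:square-times are hypothetical; the sketch only illustrates that each step consumes the previous result, and makes no claim about what any particular optimizer does.

```xquery
xquery version "1.0";

(: A square matrix is stored as (n, row-major entries), i.e. the dimension
   followed by the n*n values, as in the stride-prefixed layout above. :)
declare function local:square($m as xs:integer*) as xs:integer* {
  let $n := $m[1]
  return (
    $n,
    for $i in 1 to $n, $j in 1 to $n
    return sum(
      for $k in 1 to $n
      return $m[($i - 1) * $n + $k + 1] * $m[($k - 1) * $n + $j + 1]
    )
  )
};

(: Square the matrix $times times; every squaring needs the previous result. :)
declare function local:square-times($m as xs:integer*, $times as xs:integer) as xs:integer* {
  if ($times le 0) then $m
  else local:square(local:square-times($m, $times - 1))
};

(: [[1,1],[0,1]] squared three times is its 8th power, [[1,8],[0,1]]. :)
local:square-times((2, 1, 1, 0, 1), 3)  (: (2, 1, 8, 0, 1) :)
```

The test matrix is chosen so the entries stay small. For a general integer matrix the entries grow very quickly under repeated squaring, so a real benchmark would probably need xs:double values or some form of modular arithmetic.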
Mit besten Gruessen / Best wishes, Hermann Stamm-Wilbrandt Level 3 support for XML Compiler team and Fixpack team lead WebSphere DataPower SOA Appliances https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/ https://twitter.com/HermannSW/ http://stamm-wilbrandt.de/GraphvizFiddle/ ---------------------------------------------------------------------- IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschaeftsfuehrung: Dirk Wittkopp Sitz der Gesellschaft: Boeblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 From: Christian Gr?n To: Hermann Stamm-Wilbrandt/Germany/IBM at IBMDE, Cc: David Carlisle , "talk at x-query.com" Date: 02/03/2014 03:36 PM Subject: Re: [xquery-talk] Matrix Multiplication (JSONiq) Hi Hermann, thanks for your interesting comparison. I'd like to point out that the loop around the calculation (for $h in 1 to $N) might be optimized away by a query compiler. If this happens, the only thing that would be measured in the query would be the concatenation of 10 million results. But I completely agree that it's difficult to formulate an example that is easy to compare if as long as we have no dynamic input. Christian ______________________________________________ On Mon, Feb 3, 2014 at 11:56 AM, Hermann Stamm-Wilbrandt wrote: > Thanks for your XQuery 1-dimensional sample. > > There are 4 XQuery with JSONiq implemenations named on jsoniq.org: > 28.io, Zorba, IBM Websphere DataPower Integration Appliance and Pascal > XQuery engine > > It seems not to be that easy to measure runtime. > Since http://try.zorba.io allows to share and run code I used that. > The method I found was to place Zorba's datetime:current-time() in result > sequence as first and last elements. > And the matrix multiplications need to be executed often to result in > measurable times (I did use 10.000.000). 
> > These are JSONiq (1) and your XQuery (2) implemenations: > http://try.zorba.io/queries/xquery/vq+kL9tWK+jmntDZz0oxDcyrypA= > http://try.zorba.io/queries/xquery/NIlfOIBmkdvt8+2zNmvM8Hf1+bo= > > The times reported are quite different although run on same processor: > PT1.713634S (JSONiq) versus PT9.77805S (XQuery) > > Yes, that is only result for one processor. But I would assume even (much) > bigger differences in case the matrix dimensions become bigger and not toy > like as in the examples. > > > (1) > import module namespace datetime = > "http://www.zorba-xquery.com/modules/datetime"; > > declare variable $A := [ > [1,2], > [3,4], > [5,6], > [7,8] > ]; > declare variable $B := [ > [1,2,3], > [4,5,6] > ]; > declare variable $N := 10000000; > > let $R := ( datetime:current-time(), > for $h in 1 to $N return > [ > for $i in 1 to count(jn:members($A)) return > [ > for $k in 1 to count(jn:members($B(1))) return > fn:sum( > for $j in 1 to count(jn:members($B)) return > $A($i)($j) * $B($j)($k) > ) > ] > ] > , datetime:current-time() ) > > return $R[count($R)] - $R[1] > > (2) > import module namespace datetime = > "http://www.zorba-xquery.com/modules/datetime"; > > declare variable $a:=(2, (:2 columns :) 1,2, 3,4, 5,6, 7,8); > declare variable $b:=(3, (:3 columns :) 1,2,3, 4,5,6); > declare variable $N := 10000000; > > let $R := ( datetime:current-time(), > for $h in 1 to $N return > ( $b[1], for $i in 1 to xs:int((count($a) -1) div $a[1]), $j in 1 to > xs:int($b[1]) return > sum( for $k in 1 to xs:int($a[1]) return ($a[($i -1)*$a[1]+$k+1] * $b > [($k -1)*$b[1]+$j+1]) ) ) > , datetime:current-time() ) > > return $R[count($R)] - $R[1] > > > Mit besten Gruessen / Best wishes, > > Hermann Stamm-Wilbrandt > Level 3 support for XML Compiler team and Fixpack team lead > WebSphere DataPower SOA Appliances > https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/ > https://twitter.com/HermannSW/ > http://stamm-wilbrandt.de/GraphvizFiddle/ > 
---------------------------------------------------------------------- > IBM Deutschland Research & Development GmbH > Vorsitzende des Aufsichtsrats: Martina Koederitz > Geschaeftsfuehrung: Dirk Wittkopp > Sitz der Gesellschaft: Boeblingen > Registergericht: Amtsgericht Stuttgart, HRB 243294 > > > > From: David Carlisle > > To: Hermann Stamm-Wilbrandt/Germany/IBM at IBMDE, talk at x-query.com, > > Date: 02/02/2014 01:44 AM > > Subject: Re: [xquery-talk] Matrix Multiplication (JSONiq) > > > > > > > On 01/02/2014 23:33, Hermann Stamm-Wilbrandt wrote: >> >> Last Sylvester there was a thread on Matrix Multiplication in >> XQuery >> http://markmail.org/message/7q7qnbbnjo7cljzv?q=list:com.x-query.talk > +matrix+multiplication >> >> > >> >> The biggest problem identified was that XQuery does not allow for >> efficient representation of 2+dimensional arrays. >> > > well it does, but as in other languages that only really have 1-D arrays > (or languages such as C or Fortran where 2D arrays are a thin veneer > over 1D arrays), you need to store a 2 D array as a 1D array with an > additional integer giving the stride or leading dimension (in row or > column order). To keep the arrays self contained I stored this as the > first item in each sequence in the example below. > > > > >> JSON does provide 2+dimensional arrays for free. 
>>
>> I did not measure performance yet, but this JSONiq script looks very
>> similar to what would be done in C:
>> http://try.zorba.io/queries/xquery/jFd3Q8f82HuZGzcYDzQpdN4SdfY=
>>
>> declare variable $A := [ [1,2], [3,4], [5,6], [7,8] ];
>> declare variable $B := [ [1,2,3], [4,5,6] ];
>>
>> [ for $i in 1 to count(jn:members($A)) return
>>   [ for $k in 1 to count(jn:members($B(1))) return
>>     fn:sum( for $j in 1 to count(jn:members($B)) return
>>       $A($i)($j) * $B($j)($k) ) ] ]
>>
>
> In XQuery 1 you could do
>
> let $a:=(2, (:2 columns :)
>          1,2,
>          3,4,
>          5,6,
>          7,8),
>
>     $b:=(3, (:3 columns :)
>          1,2,3,
>          4,5,6)
>
> return
>
> (
>   $b[1],
>   for $i in 1 to xs:int((count($a) -1) div $a[1]),
>       $j in 1 to xs:int($b[1])
>   return
>     sum(
>       for $k in 1 to xs:int($a[1]) return
>         ($a[($i -1)*$a[1]+$k+1] * $b[($k -1)*$b[1]+$j+1])
>     )
> )
>
> which produces
>
> 3 (:3 columns :)
> 9 12 15
> 19 26 33
> 29 40 51
> 39 54 69
>
>> And much simpler than in XSLT:
>> http://rosettacode.org/wiki/Matrix_multiplication#XSLT_1.0:
>>
>
> As the above is in fact only XPath 2, you could do the identical
> expression in XSLT 2
>
> David
>
> _______________________________________________
> talk at x-query.com
> http://x-query.com/mailman/listinfo/talk

From ihe.onwuka at gmail.com Fri Feb 7 02:25:13 2014
From: ihe.onwuka at gmail.com (Ihe Onwuka)
Date: Fri, 7 Feb 2014 10:25:13 +0000
Subject: [xquery-talk] What does [.] do.
In-Reply-To: <8860FF79-9AD0-42D0-A2EE-1938E2DE7E11@saxonica.com>
References: <229FA6E3-A3C2-4CBE-9418-08F6051C5717@28.io> <8860FF79-9AD0-42D0-A2EE-1938E2DE7E11@saxonica.com>
Message-ID:

On Mon, Jan 27, 2014 at 2:45 PM, Michael Kay wrote:
>
> I think the only case I've used in anger is probably count(tokenize($x, ' ')[.]) which eliminates the zero-length tokens that can arise at the start and/or end of the sequence.
> slight variation tokenize($url,'/')[.][last()] that gets me the last bit of a url irrespective of whether it ends with a / From ihe.onwuka at gmail.com Sun Feb 9 23:14:04 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Mon, 10 Feb 2014 07:14:04 +0000 Subject: [xquery-talk] min max and mix Message-ID: max((80,9)) -> 80 Probably works as designed but it does raise the question of the utility of min and max in the presence of mixed content (i.e in the case where you expect 9 because it is > than 8) - or are all bets off if the content is mixed. No - it is not an actual use case - I was actually trying to establish something else but stumbled across this in the process. From mike at saxonica.com Mon Feb 10 00:17:13 2014 From: mike at saxonica.com (Michael Kay) Date: Mon, 10 Feb 2014 08:17:13 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: References: Message-ID: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> That's nothing to with max() and min(), but everything to do with mixed content. The system is designed so that 123.456 has an untyped string value of 123.456 Remember, the M in XML stands for Markup. Markup is an annotation to text that can be removed without changing the meaning. Michael Kay Saxonica On 10 Feb 2014, at 07:14, Ihe Onwuka wrote: > max((80,9)) -> 80 > > Probably works as designed but it does raise the question of the > utility of min and max in the presence of mixed content (i.e in the > case where you expect 9 because it is > than 8) - or are all bets off > if the content is mixed. > > No - it is not an actual use case - I was actually trying to establish > something else but stumbled across this in the process. 
> _______________________________________________ > talk at x-query.com > http://x-query.com/mailman/listinfo/talk From ihe.onwuka at gmail.com Mon Feb 10 00:46:11 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Mon, 10 Feb 2014 08:46:11 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> Message-ID: On Mon, Feb 10, 2014 at 8:17 AM, Michael Kay wrote: > > Remember, the M in XML stands for Markup. Markup is an annotation to text that can be removed without changing the meaning. > Cripes. It's not that far fetched that 8233 might mean element b (total 8) consists of sub elements of d , e and f that contribute 2, 3 and 3 respectively to whoever designed it. From ihe.onwuka at gmail.com Mon Feb 10 01:59:56 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Mon, 10 Feb 2014 09:59:56 +0000 Subject: [xquery-talk] What does [.] do. In-Reply-To: <229FA6E3-A3C2-4CBE-9418-08F6051C5717@28.io> References: <229FA6E3-A3C2-4CBE-9418-08F6051C5717@28.io> Message-ID: On Mon, Jan 27, 2014 at 1:59 PM, Ghislain Fourny wrote: > Hi Ihe, > > You are right that it is a filter expression. > > However, I think [.] is not very common in "real world" code, except maybe for very precise use cases (like filtering out empty strings, etc). Usually you would put either a position or a boolean predicate inside a filter expression -- not just a context item expression. > > What [.] does, if I am not missing anything, is that it only keeps: > 1. Numerics equal to their position in the left-hand-side sequence > and > 2. Non-numerics that have an Effective Boolean Value of true, like non-empty strings, nodes, the true boolean, etc. > > Example: > (1, 2, 4, 3, 5, "", "foo", , true, false)[.] > > returns > > 1 (position matches) > 2 (position matches) > 5 (position matches) > foo (EBV = true) > (EBV = true) > true (EBV = true) > In Zorba it does - but is that right? 
In Saxon 9.3.0.5 it gives Error on line 1 of *module with no systemId*: XPDY0002: The context item for axis step child::true is undefined The context item for axis step child::true is undefined In eXist it gives err:XPDY0002 Undefined context sequence for 'child::{}true' [at line 1, column 34, source: String] From mike at saxonica.com Mon Feb 10 03:14:09 2014 From: mike at saxonica.com (Michael Kay) Date: Mon, 10 Feb 2014 11:14:09 +0000 Subject: [xquery-talk] What does [.] do. In-Reply-To: References: <229FA6E3-A3C2-4CBE-9418-08F6051C5717@28.io> Message-ID: On 10 Feb 2014, at 09:59, Ihe Onwuka wrote: > On Mon, Jan 27, 2014 at 1:59 PM, Ghislain Fourny wrote: >> Hi Ihe, >> >> You are right that it is a filter expression. >> >> However, I think [.] is not very common in "real world" code, except maybe for very precise use cases (like filtering out empty strings, etc). Usually you would put either a position or a boolean predicate inside a filter expression -- not just a context item expression. >> >> What [.] does, if I am not missing anything, is that it only keeps: >> 1. Numerics equal to their position in the left-hand-side sequence >> and >> 2. Non-numerics that have an Effective Boolean Value of true, like non-empty strings, nodes, the true boolean, etc. >> >> Example: >> (1, 2, 4, 3, 5, "", "foo", , true, false)[.] >> >> returns >> >> 1 (position matches) >> 2 (position matches) >> 5 (position matches) >> foo (EBV = true) >> (EBV = true) >> true (EBV = true) >> > > In Zorba it does - but is that right? It's right if there is a context item, which is context-dependent.... For true and false in this example, you probably meant true() and false(). 
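
A minimal sketch of the example with the function calls written out, for anyone who wants to try it on their own processor. The `<a/>` element is an assumption standing in for the node that the archive stripped from the original expression; the comments follow the two rules Ghislain listed (position match for numerics, Effective Boolean Value for everything else).

```xquery
xquery version "1.0";

(: Sketch only. <a/> is a placeholder for the element lost from the
   archived example; true() and false() are the function calls, not
   the path steps child::true and child::false. :)
(1, 2, 4, 3, 5, "", "foo", <a/>, true(), false())[.]

(: Items kept, in order:
     1, 2, 5   numeric and equal to their position
     "foo"     non-empty string, EBV is true
     <a/>      node, EBV is true
     true()    boolean true
   Dropped: 4 and 3 (position mismatch), "" and false() (EBV is false) :)
```

Because nothing in this version needs a context item, it should evaluate without XPDY0002 on processors that reject the bare names `true` and `false`.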
Michael Kay
Saxonica

>
> In Saxon 9.3.0.5 it gives
>
> Error on line 1 of *module with no systemId*:
> XPDY0002: The context item for axis step child::true is undefined
> The context item for axis step child::true is undefined
>
> In eXist it gives
>
> err:XPDY0002 Undefined context sequence for 'child::{}true' [at line
> 1, column 34, source: String]
>
> _______________________________________________
> talk at x-query.com
> http://x-query.com/mailman/listinfo/talk

From ihe.onwuka at gmail.com Mon Feb 10 03:20:24 2014
From: ihe.onwuka at gmail.com (Ihe Onwuka)
Date: Mon, 10 Feb 2014 11:20:24 +0000
Subject: [xquery-talk] What does [.] do.
In-Reply-To:
References: <229FA6E3-A3C2-4CBE-9418-08F6051C5717@28.io>
Message-ID:

On Mon, Feb 10, 2014 at 11:14 AM, Michael Kay wrote:
>
> On 10 Feb 2014, at 09:59, Ihe Onwuka wrote:
>
>> On Mon, Jan 27, 2014 at 1:59 PM, Ghislain Fourny wrote:
>>> Hi Ihe,
>>>
>>> You are right that it is a filter expression.
>>>
>>> However, I think [.] is not very common in "real world" code, except maybe for very precise use cases (like filtering out empty strings, etc). Usually you would put either a position or a boolean predicate inside a filter expression -- not just a context item expression.
>>>
>>> What [.] does, if I am not missing anything, is that it only keeps:
>>> 1. Numerics equal to their position in the left-hand-side sequence
>>> and
>>> 2. Non-numerics that have an Effective Boolean Value of true, like non-empty strings, nodes, the true boolean, etc.
>>>
>>> Example:
>>> (1, 2, 4, 3, 5, "", "foo", , true, false)[.]
>>>
>>> returns
>>>
>>> 1 (position matches)
>>> 2 (position matches)
>>> 5 (position matches)
>>> foo (EBV = true)
>>> (EBV = true)
>>> true (EBV = true)
>>>
>>
>> In Zorba it does - but is that right?
>
> It's right if there is a context item, which is context-dependent....
>
> For true and false in this example, you probably meant true() and false().
>
No. I am aware of that distinction.
I was quoting the results obtained from the example that was quoted.

ihe at ihe-ThinkPad-T410:~/film$ zorba -q '(1, 2, 4, 3, 5, "", "foo", , true, false)[.]'
(no URI):1,41: static warning [zwarn:ZWST0008]: "false": has been deprecated; use "fn:false()" instead
(no URI):1,35: static warning [zwarn:ZWST0008]: "true": has been deprecated; use "fn:true()" instead
1 2 5 footruei

From dlee at calldei.com Mon Feb 10 04:05:59 2014
From: dlee at calldei.com (David Lee)
Date: Mon, 10 Feb 2014 12:05:59 +0000
Subject: [xquery-talk] min max and mix
In-Reply-To:
References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com>
Message-ID: <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com>

====================================================
Cripes. It's not that far fetched that

8233

might mean

element b (total 8) consists of sub elements of d , e and f that contribute 2, 3 and 3 respectively to whoever designed it.
====================================================
It's not far-fetched that it also means
8 *c(2) + d(3) + e(4)

It's not far-fetched, but it's not what it *actually* means in XML languages (in particular, this comes from the rules for how atomization of XML works).

There is the defined default behaviour, and then there is 'whatever you want' behaviour. You actually have both ... you are free
to parse the XML and assign whatever meaning you want.

In general, I submit, one should be careful about presuming what things *might mean* in languages (computer, human, and biological).
It is fun to speculate but one is usually totally wrong.

Now your use case could be *coerced* to mean what you say, but it's not the default defined behavior.

It's not far-fetched that in C

"a" + "b" + "c" == "abc"

but it's not actually true.
_______________________________________________ talk at x-query.com http://x-query.com/mailman/listinfo/talk From ihe.onwuka at gmail.com Mon Feb 10 04:18:37 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Mon, 10 Feb 2014 12:18:37 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: On Mon, Feb 10, 2014 at 12:05 PM, David Lee wrote: > ==================================================== > Cripes. It's not that far fetched that > > 8233 > > might mean > > element b (total 8) consists of sub elements of d , e and f that contribute 2, 3 and 3 respectively to whoever designed it. > ==================================================== > Its not far fetched that it also means > 8 *c(2) + d(3) + e(4) > > Its not far fetched but its not what it *actually* means in XML languages (particularly this comes from the rule of how atomization of XML works). > > > There is defined default behaviour then then is 'whaterver you want behaviour'. You actually have it both ... you are free > to parse the XML and assign whatever meaning you want. > > In general, I submit, one should be careful about presuming what things *might mean* in languages (computer, human, and biological). > It is fun to speculate but one is usually totally wrong. > > Now your use case could be *coerced* to mean what you say but it's not the default defined behavior. > > It's not farfetched that in C > > "a" + "b" + "c" == "abc" > > but its not actually true. > Personally I avoid mixed content models wherever possible. So it is more of an issue for those that don't. I was just messing about with these functions to see whether they were robust with respect to stability (that is stability as in a stable sort). 
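
The atomization behaviour discussed in this thread can be sketched in XQuery as follows. The markup is a hypothetical reconstruction (the archive stripped the original tags): the element names b, d, e and f follow the description in the message above, and the second element `<a>9</a>` is made up for the comparison. The first three results show the default, spec-defined reading; the last two show one application-defined reading you could choose instead.

```xquery
xquery version "1.0";

(: Hypothetical mixed-content element: text "8" followed by three children. :)
let $b := <b>8<d>2</d><e>3</e><f>3</f></b>
return (
  string($b),                             (: "8233": all descendant text, concatenated :)
  data($b) instance of xs:untypedAtomic,  (: true: no schema, so the value is untyped :)
  max(($b, <a>9</a>)),                    (: 8233: untyped values are compared as xs:double :)
  number($b/text()[1]),                   (: 8: only the leading text node :)
  sum($b/*)                               (: 8: the child elements, 2 + 3 + 3 :)
)
```

The point of the last two lines is that the "total plus parts" interpretation is available, but only if the query selects the pieces explicitly rather than relying on atomization of the whole element.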
From dlee at calldei.com Mon Feb 10 04:23:10 2014 From: dlee at calldei.com (David Lee) Date: Mon, 10 Feb 2014 12:23:10 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com> Personally I avoid mixed content models wherever possible. So it is more of an issue for those that don't. I was just messing about with these functions to see whether they were robust with respect to stability (that is stability as in a stable sort). [DAL:] ============== Even neglecting mixed content ( dont use that for data !) ... functions that expect atomics or lists of atomics generally dont do what you want when you give it *nested* XML. sum( 12 ) == 12 Its nice enough to try to simple content elements but not nested ("complex") elements. You *do* have to be careful with this. I wouldnt classify it as "stability" but its a rational argument. Dont be putting stuff in functions that you dont know whats in there ... and expect to get something out that you expect. ---------------------------------------- David A. Lee dlee at calldei.com http://www.xmlsh.org From ihe.onwuka at gmail.com Mon Feb 10 04:36:46 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Mon, 10 Feb 2014 12:36:46 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com> References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: On Mon, Feb 10, 2014 at 12:23 PM, David Lee wrote: > > Personally I avoid mixed content models wherever possible. So it is more of an issue for those that don't. 
> > I was just messing about with these functions to see whether they were robust with respect to stability (that is stability as in a stable sort). > [DAL:] ============== > > Even neglecting mixed content ( dont use that for data !) > ... functions that expect atomics or lists of atomics generally dont do what you want when you give it *nested* XML. > > sum( 12 ) == 12 > hmmmm I see. > Its nice enough to try to simple content elements but not nested ("complex") elements. > > You *do* have to be careful with this. I wouldnt classify it as "stability" but its a rational argument. > Dont be putting stuff in functions that you dont know whats in there ... and expect to get something out that you expect. > The stability thing is not a use case of mine either. Was just interested to see what min/max would do when confronted with multiple nodes that evaluated to the same value and to see if I could trace which one it picked. There you've dragged the wretched truth out of me. From dlee at calldei.com Mon Feb 10 04:43:03 2014 From: dlee at calldei.com (David Lee) Date: Mon, 10 Feb 2014 12:43:03 +0000 Subject: [xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix Message-ID: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> Possibly better discussed on xml mailing lists ... but ... This thread has me thinking ... That XML, while originally a form of Text markup (you start with text and add Markup) is of dual use as Data Serialization. *Even in the same document* ... This can be confusing but its also powerful. My opinion is that the compromises made to allow this "Dual Use", while not perfect and not quite equal in each use case, are really amazing. I cannot think of any other markup or serialization format which does better at accommodating both use cases as equal citizens reasonably well. 
So much so that with XML you can come from a Data background and rarely run into anything awful (sometimes unexpected, like the min/max thing), or you can come from a Text/Document background and never even imagine that your documents could be considered "data" (you're not going to run sum() on a paragraph ... ) AND you can wear both hats at once and intermix and overlay the concepts ... if you're clever enough :) (e.g. you might run count() on the *words* in a text document or add data tables to a text document or add rich text to a data document).

Not trying to start a markup war, just reflecting on the philosophy that is embedded in XML and its tools.

From davidc at nag.co.uk Mon Feb 10 04:45:34 2014
From: davidc at nag.co.uk (David Carlisle)
Date: Mon, 10 Feb 2014 12:45:34 +0000
Subject: [xquery-talk] min max and mix
In-Reply-To:
References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com>
Message-ID: <52F8C9EE.9080202@nag.co.uk>

On 10/02/2014 12:36, Ihe Onwuka wrote:
>
> The stability thing is not a use case of mine either. Was just
> interested to see what min/max would do when confronted with
> multiple nodes that evaluated to the same value and to see if I could
> trace which one it picked.
>
> There you've dragged the wretched truth out of me.
> _______________________________________________

This is documented as implementation dependent, so any experiments are
limited in value to that particular implementation version.

> Selects an item from the input sequence $arg whose value is greater
> than or equal to the value of every other item in the input sequence.
> If there are two or more such items, then the specific item whose
> value is returned is ·implementation dependent·.

http://www.w3.org/TR/xpath-functions/#func-max

David

From ihe.onwuka at gmail.com Mon Feb 10 04:51:44 2014
From: ihe.onwuka at gmail.com (Ihe Onwuka)
Date: Mon, 10 Feb 2014 12:51:44 +0000
Subject: [xquery-talk] min max and mix
In-Reply-To: <52F8C9EE.9080202@nag.co.uk>
References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com> <52F8C9EE.9080202@nag.co.uk>
Message-ID:

On Mon, Feb 10, 2014 at 12:45 PM, David Carlisle wrote:
> On 10/02/2014 12:36, Ihe Onwuka wrote:
>
>>
>> The stability thing is not a use case of mine either. Was just
>> interested to see what min/max would do when confronted with
>> multiple nodes that evaluated to the same value and to see if I could
>> trace which one it picked.
>>
>> There you've dragged the wretched truth out of me.
>> _______________________________________________
>
> This is documented as implementation dependent, so any experiments are
> limited in value to that particular implementation version.
>
>> Selects an item from the input sequence $arg whose value is greater
>> than or equal to the value of every other item in the input sequence.
>> If there are two or more such items, then the specific item whose
>> value is returned is ·implementation dependent·.
>
> http://www.w3.org/TR/xpath-functions/#func-max
>
...and now you have exposed what motivated the experiment.
From dlee at calldei.com Mon Feb 10 04:53:35 2014 From: dlee at calldei.com (David Lee) Date: Mon, 10 Feb 2014 12:53:35 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: <52F8C9EE.9080202@nag.co.uk> References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com> <52F8C9EE.9080202@nag.co.uk> Message-ID: <72a7e461a360463187dad453bc38af2a@BY2PR08MB014.namprd08.prod.outlook.com> ======= This is documented as implementation dependent, so any experiments are limited in value to that particular implementation version. > Selects an item from the input sequence $arg whose value is greater > than or equal to the value of every other item in the input sequence. > If there are two or more such items, then the specific item whose > value is returned is *implementation dependent*. http://www.w3.org/TR/xpath-functions/#func-max ============ [DAL:] Even if it were not system dependent there is no defined way to take an atomic value and figure out which node it came from, so, IMHO, the statement of implementation dependency is only there because its also non-testable ... even given this ... max( ( 1.0 , 1.00 ) ) => 1 a or b ? good luck. This is really a fancy way of saying implementations can apply the atomization *before* the comparison and dont have to keep track of where it all came from. ---------------------------------------- David A. 
Lee dlee at calldei.com http://www.xmlsh.org From ihe.onwuka at gmail.com Mon Feb 10 04:57:13 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Mon, 10 Feb 2014 12:57:13 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: <72a7e461a360463187dad453bc38af2a@BY2PR08MB014.namprd08.prod.outlook.com> References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com> <52F8C9EE.9080202@nag.co.uk> <72a7e461a360463187dad453bc38af2a@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: On Mon, Feb 10, 2014 at 12:53 PM, David Lee wrote: > ======= > > This is documented as implementation dependent, so any experiments are limited in value to that particular implementation version. > >> Selects an item from the input sequence $arg whose value is greater >> than or equal to the value of every other item in the input sequence. >> If there are two or more such items, then the specific item whose >> value is returned is *implementation dependent*. > > > http://www.w3.org/TR/xpath-functions/#func-max > ============ > [DAL:] > Even if it were not system dependent there is no defined way to take an atomic value and figure out which node it came from, > so, IMHO, the statement of implementation dependency is only there because its also non-testable ... > even given this ... > max( ( 1.0 , 1.00 ) ) => 1 > > a or b ? good luck. > ...again exactly why I was curious... (used to be a tester), your test case is better than the one I tried. 
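
A rough sketch of the test case quoted above, assuming the stripped markup was two untyped elements named a and b (the names are inferred from the "a or b ?" remark and are not confirmed by the archive). It illustrates why the question cannot be answered from the result of max() alone, and one way to recover the candidate nodes when you do need them.

```xquery
xquery version "1.0";

(: Hypothetical elements; both atomize to the same numeric value. :)
let $items := (<a>1.0</a>, <b>1.00</b>)
let $m := max($items)
return (
  $m instance of node(),     (: false: max() returns an atomic value, not a node :)
  $m instance of xs:double,  (: true: untyped values are cast to xs:double :)
  (: Select the nodes holding the maximum explicitly instead of asking max(). :)
  for $n in $items[number(.) = $m] return name($n)   (: "a", "b" :)
)
```

The explicit filter returns every node that ties for the maximum, in document order of the input sequence, which sidesteps the implementation-dependent choice altogether.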
From ihe.onwuka at gmail.com Mon Feb 10 04:58:27 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Mon, 10 Feb 2014 12:58:27 +0000 Subject: [xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix In-Reply-To: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> References: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: On Mon, Feb 10, 2014 at 12:43 PM, David Lee wrote: > > Not trying to start a markup war, just reflecting on the philosophy that is embedded in XML and its tools. > Look. By all means start a markup war. Just make sure that the OP don't get blamed for it. From mike at saxonica.com Mon Feb 10 11:04:44 2014 From: mike at saxonica.com (Michael Kay) Date: Mon, 10 Feb 2014 19:04:44 +0000 Subject: [xquery-talk] min max and mix In-Reply-To: <72a7e461a360463187dad453bc38af2a@BY2PR08MB014.namprd08.prod.outlook.com> References: <0BFA443A-A378-4624-9802-806CA9EEFB83@saxonica.com> <3dba2439a5254180a03ce082fc2a5e6d@BY2PR08MB014.namprd08.prod.outlook.com> <00a4d8f0f8554f07a1020e574a358a26@BY2PR08MB014.namprd08.prod.outlook.com> <52F8C9EE.9080202@nag.co.uk> <72a7e461a360463187dad453bc38af2a@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: <7CDB717A-25F9-46DC-8F77-F59CF5250AD7@saxonica.com> > Even if it were not system dependent there is no defined way to take an atomic value and figure out which node it came from, > so, IMHO, the statement of implementation dependency is only there because its also non-testable ... > even given this ... > max( ( 1.0 , 1.00 ) ) => 1 > > a or b ? good luck. > min() and max() return an atomized value so you can't tell which node it came from anyway. The implementation-dependency has more to do with mixed floats, doubles, and decimals: min((1.0, 1.0e0)) = the answer will be equal to one, but it's implementation-dependent whether it's a decimal one or a double one. (IIRC, haven't checked the spec). 
Michael Kay Saxonica From mike at saxonica.com Mon Feb 10 11:11:56 2014 From: mike at saxonica.com (Michael Kay) Date: Mon, 10 Feb 2014 19:11:56 +0000 Subject: [xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix In-Reply-To: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> References: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: <661839D4-EF89-4323-B911-9DC1FA5F1DE7@saxonica.com> > > My opinion is that the compromises made to allow this "Dual Use", while not perfect and not quite equal in each use case, are really amazing. > I cannot think of any other markup or serialization format which does better at accommodating both use cases as equal citizens reasonably well. XML does a good job at this but it leaves some well-known problems. I tried to do better in FtanML (Balisage 2013). For example FtanML: * allows element and attribute values that are typed as integers, booleans, or sequences of anything without resorting to a schema for example married=true, height=1.86, children=["John", "Mary"] * distinguishes whitespace that is present for readability purposes from whitespace that's part of the content * eliminates the artificial distinction between elements and attributes, allowing the same values to be held in both Michael Kay Saxonica From dlee at calldei.com Mon Feb 10 11:14:55 2014 From: dlee at calldei.com (David Lee) Date: Mon, 10 Feb 2014 19:14:55 +0000 Subject: [xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix In-Reply-To: <661839D4-EF89-4323-B911-9DC1FA5F1DE7@saxonica.com> References: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> <661839D4-EF89-4323-B911-9DC1FA5F1DE7@saxonica.com> Message-ID: <677be7bd30dc4980a43ef9318eff4603@BY2PR08MB014.namprd08.prod.outlook.com> I do like FtanML ... In fact look forward to a possible presentation at some upcoming conference with a few key concepts borrowed/stolen ... 
so how far have you gotten to get FtanML/XSLT/XPath/XQuery ? :) ( thats the thing about XML ... even with flaws it has an unsurpassed adopted toolchain ) -----Original Message----- From: Michael Kay [mailto:mike at saxonica.com] Sent: Monday, February 10, 2014 2:12 PM To: David Lee Cc: ihe.onwuka at gmail.com; talk at x-query.com Subject: Re: [xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix > > My opinion is that the compromises made to allow this "Dual Use", while not perfect and not quite equal in each use case, are really amazing. > I cannot think of any other markup or serialization format which does better at accommodating both use cases as equal citizens reasonably well. XML does a good job at this but it leaves some well-known problems. I tried to do better in FtanML (Balisage 2013). For example FtanML: * allows element and attribute values that are typed as integers, booleans, or sequences of anything without resorting to a schema for example married=true, height=1.86, children=["John", "Mary"] * distinguishes whitespace that is present for readability purposes from whitespace that's part of the content * eliminates the artificial distinction between elements and attributes, allowing the same values to be held in both Michael Kay Saxonica From mike at saxonica.com Mon Feb 10 11:15:00 2014 From: mike at saxonica.com (Michael Kay) Date: Mon, 10 Feb 2014 19:15:00 +0000 Subject: [xquery-talk] What does [.] do. In-Reply-To: References: <229FA6E3-A3C2-4CBE-9418-08F6051C5717@28.io> Message-ID: > > No. I am aware of that distinction. I was quoting the results obtained > from the example that was quoted. 
> > ihe at ihe-ThinkPad-T410:~/film$ zorba -q '(1, 2, 4, 3, 5, "", "foo", > , true, false)[.]' > (no URI):1,41: static warning [zwarn:ZWST0008]: "false": has been > deprecated; use "fn:false()" instead > (no URI):1,35: static warning [zwarn:ZWST0008]: "true": has been > deprecated; use "fn:true()" instead Those warnings appear to be telling you that Zorba is applying a non-standard meaning to the names true and false, and is warning you that it is doing so. If so you are right, it's not conformant. Michael Kay Saxonica From mike at saxonica.com Mon Feb 10 11:31:47 2014 From: mike at saxonica.com (Michael Kay) Date: Mon, 10 Feb 2014 19:31:47 +0000 Subject: [xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix In-Reply-To: <677be7bd30dc4980a43ef9318eff4603@BY2PR08MB014.namprd08.prod.outlook.com> References: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> <661839D4-EF89-4323-B911-9DC1FA5F1DE7@saxonica.com> <677be7bd30dc4980a43ef9318eff4603@BY2PR08MB014.namprd08.prod.outlook.com> Message-ID: <17801932-6B1B-4C2D-A23C-96CE8992E91E@saxonica.com> On 10 Feb 2014, at 19:14, David Lee wrote: > I do like FtanML ... In fact look forward to a possible presentation at some upcoming conference with a few key concepts borrowed/stolen ... > FtanML was very deliberately an exercise in answering the question "what would we like markup to look like if there were no compatibility / transition / adoption issues influencing the design?". It will have succeeded if it influences whatever comes next. As with other widely-adopted standards like SQL, Posix, and C, XML will be very hard to displace; unlike those standards it also seems to be very resistant to incremental improvement. 
We're currently in a position where the world has discovered better ways of serializing structured data, but hasn't yet discovered a better way of serializing narrative text or of information that mixes narrative text with structured data (which is the domain that I find most interesting). I've got a very bad track record at predicting the future, so I really shouldn't attempt it. Perhaps some standards group needing a new specification in some area like insurance will decide that it wants something better than XML and better than JSON and invent its own syntax, which will do the job sufficiently well that people in other areas start adopting it too. Who knows. Michael Kay Saxonica From per at bothner.com Mon Feb 10 16:54:32 2014 From: per at bothner.com (Per Bothner) Date: Mon, 10 Feb 2014 16:54:32 -0800 Subject: [xquery-talk] Text Markup vs Data Serialization - Was RE: min max and mix In-Reply-To: <17801932-6B1B-4C2D-A23C-96CE8992E91E@saxonica.com> References: <54ac80df25dd4b2396a32960dcdde1fd@BY2PR08MB014.namprd08.prod.outlook.com> <661839D4-EF89-4323-B911-9DC1FA5F1DE7@saxonica.com> <677be7bd30dc4980a43ef9318eff4603@BY2PR08MB014.namprd08.prod.outlook.com> <17801932-6B1B-4C2D-A23C-96CE8992E91E@saxonica.com> Message-ID: <52F974C8.80406@bothner.com> On 02/10/2014 11:31 AM, Michael Kay wrote: > We're currently in a position where the world has discovered better ways of serializing structured data, but hasn't yet discovered a better way of serializing narrative text or of information that mixes narrative text with structured data (which is the domain that I find most interesting). You might found interesting SRFI-108 http://srfi.schemers.org/srfi-108/srfi-108.html This defines a Scheme language extension for "Named quasi-literal constructors" which I intended to be useful for both structured data and rich test. XQuery's

<p>Hello <em>{$name}</em>!</p>

would be represented as: &p{Hello &em[name]!} The difference is that SRFI-108 defines a *framework*, in that &p is syntactic sugar for a call to a function or macro $construct$:p; the meaning of the latter depends on whatever is in scope according to context. There is a related SRFI-109 http://srfi.schemers.org/srfi-109/srfi-109.html for extended multi-line string literals. Both use '&' for escapes, as in XML. I.e. character and entity references use the XML syntax, while an embedded expression uses &[...]. The following is equivalent to Java's ("Hello "+name+"!"): &{Hello &[name]!} The interesting thing is that a simple string: &{Hello &[name]!} can be easily converted to rich text. For example: &p{Hello &[name]!} or: &p{Hello &em[name]!} assuming you have an HTML "vocabulary" in scope. There is also a related embedded-XML syntax SRFI-107 http://srfi.schemers.org/srfi-107/srfi-107.html This is a superset of XML, but uses the same syntax as SRFI-108/-109 for references and embedded expressions. Kawa 1.14 implements SRFI-107, SRFI-108, and SRFI-109: http://per.bothner.com/blog/2013/Kawa-1.14-released/ SRFI-108 defines a language embedding/extension (and specifically for Lisp-family languages), rather than a serialization/interchange format, but just like JSON one could define a subset or variant as a possible data format. -- --Per Bothner per at bothner.com http://per.bothner.com/ From christian.gruen at gmail.com Wed Feb 12 08:02:12 2014 From: christian.gruen at gmail.com (Christian Grün) Date: Wed, 12 Feb 2014 17:02:12 +0100 Subject: [xquery-talk] [ANN] BaseX 7.8, The XMLPrague Edition Message-ID: Dear all, We are very pleased to announce Version 7.8 of BaseX (a.k.a. the XMLPrague Edition)! These are the features you can expect: * A new project view allows you to open, edit and manage your project files directly in the GUI and search files and contents in real time.
* The integrated editor provides many new shortcuts and code completions for writing XQuery modules (http://docs.basex.org/wiki/Shortcuts). * Delete and insert operations are executed faster than ever before, values are updated in place whenever possible, and a new convenience operator has been added for transform expressions (http://docs.basex.org/wiki/Updates#update). * XQuery functions are now inlined and further optimized. Tail call detection and static typing have been improved, and (sub)sequences are processed much faster. * Various XQuery Modules have been enhanced (JSON, CSV, Unit, Map, XQuery, Full-Text, EXPath File), and the EXPath Binary Module has been added (http://docs.basex.org/wiki/Module_Library). * BaseX is now available in Russian and Spanish. Thank you to Oleksandr Shpak and Carlos Marcos! We are looking forward to your feedback, and we hope to see many of you in Prague, Christian BaseX Team From ihe.onwuka at gmail.com Fri Feb 21 08:18:31 2014 From: ihe.onwuka at gmail.com (Ihe Onwuka) Date: Fri, 21 Feb 2014 16:18:31 +0000 Subject: [xquery-talk] Castable integers Message-ID: 3 castable as xs:integer -> true "3" castable as xs:integer -> true 3.5 castable as xs:integer -> true ..... hmmmmm debatable that one ok...(scratches head)....so if it's roundable to an integer it's castable to an integer fine "3.5" castable as xs:integer -> false Btw which list is the home for XPath stuff. From mike at saxonica.com Fri Feb 21 09:16:18 2014 From: mike at saxonica.com (Michael Kay) Date: Fri, 21 Feb 2014 17:16:18 +0000 Subject: [xquery-talk] [xsl] Castable integers In-Reply-To: References: Message-ID: Yes, castability is not transitive. Well spotted. You get some pretty strange results with boolean() too, e.g. boolean(string(false())) => true(). Michael Kay Saxonica On 21 Feb 2014, at 16:18, Ihe Onwuka wrote: > 3 castable as xs:integer -> true > > "3" castable as xs:integer -> true > > 3.5 castable as xs:integer -> true .....
hmmmmm debatable that one > > ok...(scratches head)....so if it's roundable to an integer it's > castable to an integer > > fine > > "3.5" castable as xs:integer -> false > > Btw which list is the home for XPath stuff.
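
The castability results discussed in this thread can be reproduced with a short query. This is a minimal sketch, not taken from either message: it assumes an XQuery 1.0 (or later) processor with the usual xs prefix predeclared, and the values in the comments are what the casting rules in the Functions and Operators specification give, not output captured from a particular processor. One detail worth noting is that casting a decimal to xs:integer discards the fractional part, so "roundable" is better read as "truncatable".

```xquery
xquery version "1.0";

(: Each item is annotated with the value the casting rules give. :)
(
  3 castable as xs:integer,        (: true :)
  "3" castable as xs:integer,      (: true :)
  3.5 castable as xs:integer,      (: true: decimal to integer is a permitted cast :)
  xs:integer(3.5),                 (: 3: the fractional part is discarded, not rounded :)
  "3.5" castable as xs:integer,    (: false: "3.5" is not in the lexical space of xs:integer :)
  "3.5" castable as xs:decimal,    (: true :)
  xs:integer(xs:decimal("3.5")),   (: 3: the two-step cast succeeds :)
  boolean(string(false())),        (: true: "false" is a non-empty string :)
  xs:boolean(string(false()))      (: false: casting parses the lexical form :)
)
```

The three items involving "3.5" show the non-transitivity Michael Kay mentions: the string casts to xs:decimal, and the decimal casts to xs:integer, but the string does not cast to xs:integer directly. The last two items show the same kind of surprise for booleans, where boolean() computes the effective boolean value of a string while xs:boolean() parses it.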