[xquery-talk] Sparksoniq 0.9.1 Spruce: first alpha release

Ghislain Fourny gfourny at inf.ethz.ch
Wed Jan 24 05:51:56 PST 2018


Dear all,

I am happy to announce the first alpha release of Sparksoniq: 0.9.1 Spruce, under an Apache 2.0 license.

You can try it out with its shell and its documentation on

  http://sparksoniq.org


In a nutshell, this is a JSONiq engine [see below what it has to do with XQuery and XML] that runs seamlessly on top of Spark. This engine was developed at ETH Zurich in collaboration with Stefan Irimescu, who wrote his Master's thesis with Gustavo Alonso and me last year and did an amazing job.

The core idea is that FLWOR expressions (identical to XQuery’s) map very naturally to Spark transformations, which allows a declarative and functional encapsulation that hides Spark, Java and Scala from the user. This is consistent with Edgar Codd’s data independence requirement. It should also be seen in the context of several other ongoing efforts on the XML side, such as Apache VXQuery (on Hyracks), PAXQuery (on Flink), etc.
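
To make the clause-to-transformation correspondence concrete, here is a minimal sketch in plain Python. It is only an illustration and makes several assumptions: Python's built-in map, filter and sorted stand in for Spark transformations, the sample data and the function name run_flwor are hypothetical, and nothing here reflects how Sparksoniq is actually implemented.

```python
# Hypothetical sample data; a real engine would read a large JSON dataset instead.
people = [
    {"name": "Ada", "age": 36},
    {"name": "Bob", "age": 17},
    {"name": "Cy", "age": 52},
]

# The JSONiq query being mimicked (shown as a comment only):
#   for $p in $people
#   where $p.age ge 18
#   order by $p.age descending
#   return $p.name


def run_flwor(collection):
    """Evaluate the query above one clause at a time over a stream of tuples."""
    # for clause: one tuple per item, binding the variable p (a map-like step)
    tuples = map(lambda item: {"p": item}, collection)
    # where clause: keep only tuples satisfying the predicate (a filter-like step)
    tuples = filter(lambda t: t["p"]["age"] >= 18, tuples)
    # order by clause: sort tuples by the key expression (a sort-like step)
    tuples = sorted(tuples, key=lambda t: t["p"]["age"], reverse=True)
    # return clause: turn each remaining tuple into a result item (a map-like step)
    return [t["p"]["name"] for t in tuples]


result = run_flwor(people)
print(result)  # prints ['Cy', 'Ada']
```

Each clause consumes the tuple stream produced by the previous one, which is why a chain of collection transformations is a natural fit; the user only writes the declarative query in the comment.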

Keep in mind that this is an early version that does not yet support the full language. We will do our best to address the bugs that will certainly be found, while keeping a specific focus on querying large-scale JSON datasets. We look forward to constructive comments, bug reports, feature wish lists, reports of gaps in the documentation, etc.

With many thanks and kind regards,
Ghislain



I am adding a small note on JSONiq to give some context:

JSONiq, which some of you already know, is XQuery 3.0’s little brother and a cousin of XQuery 3.1. JSONiq’s DNA is 95% XQuery and it has all the expression machinery of XQuery, but adapted to specifically query JSON documents in a document-store-like setting, so as to be appealing to those in the JSON community who feel a bit uneasy about angle brackets, QNames, URIs and processing instructions.

It was designed by a few working group members as a proposal during the discussions on how to integrate objects and arrays into XQuery. It is still alive and in use, because its use case, querying JSON data natively, is different from and complementary to that of XQuery 3.1, which focuses more on the data structure aspects of objects and arrays. In particular, JSON is a subset of JSONiq, the data model fully mirrors JSON, navigation uses the dots JavaScript users are familiar with, etc. The differences are documented in more detail here [1].



[1] https://stackoverflow.com/questions/44919443/what-are-the-differences-between-jsoniq-and-xquery-3-1



