[xquery-talk] Multiple output via a stream of filters
wcandillon at gmail.com
Tue Jan 14 00:56:03 PST 2014
Here are a couple of resources about streaming XML and JSON documents.
As I read your use case, I have a feeling that the transform.xq
library from John Snelson at https://github.com/jpcs/transform.xq
might implement the pattern you are looking for.
On Tue, Jan 14, 2014 at 9:42 AM, Michael Kay <mike at saxonica.com> wrote:
>> I only read the ginormous XML once... I apply the 7 filters to each
>> node read and it gets allocated to one of the 7 output buckets (how's
>> that for a semantically neutral term).
> This is known within the XSL WG as the "coloured widgets" problem, after a streaming use case put forward by Oliver Becker. (The problem is: given an input document containing widgets of different colours, produce N output documents, one for each colour present in the file. There are two variants of the problem: one where the set of colours is known statically, and one where it is dynamic.) The XSLT 3.0 streaming solution for the static case is:
> <xsl:stream href="widgets.xml">
>   <xsl:fork>
>     <xsl:result-document href="red.xml">
>       <xsl:sequence select="*/widget[@colour='red']"/>
>     </xsl:result-document>
>     <xsl:result-document href="blue.xml">
>       <xsl:sequence select="*/widget[@colour='blue']"/>
>     </xsl:result-document>
>     <xsl:result-document href="green.xml">
>       <xsl:sequence select="*/widget[@colour='green']"/>
>     </xsl:result-document>
>   </xsl:fork>
> </xsl:stream>
> A streaming processor is required to evaluate this in a single pass of the input document; the three "prongs" of the xsl:fork are effectively executed in parallel.
> I mention this purely for academic interest, since there is no implementation available, unless you count the one I wrote last week.
> I don't think XSLT 3.0 currently has an equivalent solution for the dynamic case, where the colours are not known in advance. The normal solution would use "group-by" but this is not streamable.
> Michael Kay
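For reference, the non-streamable "group-by" approach mentioned above for the dynamic case might look roughly like the sketch below. It uses XSLT 2.0 xsl:for-each-group and assumes the same input shape as the static example (widget elements directly under the document element, each with a colour attribute). The <widgets> wrapper element and the choice to name each output file after the colour value are illustrative assumptions, not part of the original use case.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Non-streaming sketch: the whole input is read into memory first. -->
  <xsl:template match="/">
    <!-- One group per distinct colour value found in the input. -->
    <xsl:for-each-group select="*/widget" group-by="@colour">
      <!-- Assumption: output file is named after the colour, e.g. red.xml. -->
      <xsl:result-document href="{current-grouping-key()}.xml">
        <!-- Assumption: a wrapper element keeps each output well-formed. -->
        <widgets>
          <xsl:copy-of select="current-group()"/>
        </widgets>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Because the grouping needs to see every widget before any group is complete, this form trades the single-pass guarantee for not having to know the colours in advance.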