[xquery-talk] File Systems & XQuery

Michael Kay mike at saxonica.com
Wed Feb 7 15:13:47 PST 2007

Kristoffer Rose of IBM Research has demonstrated this kind of idea, but I
can't find a reference at the moment.

Michael Kay


> -----Original Message-----
> From: talk-bounces at x-query.com 
> [mailto:talk-bounces at x-query.com] On Behalf Of Frans Englich
> Sent: 07 February 2007 14:15
> To: talk at x-query.com
> Subject: [xquery-talk] File Systems & XQuery
> Hello all,
> When writing XML applications, one currently needs glue and 
> helper utilities to compensate for missing parts, or tie 
> steps together. This could be piping the result of a schema 
> validation to a transformation step, or determining what 
> files to open in an XQuery query. XProc[1] is one attempt to 
> solve this kind of problem.
> For navigating and inspecting the file system from XQuery, 
> one approach is Saxon's[2], where a small home-brewed 
> query is passed as a URI to fn:collection().
> I see some drawbacks with that approach:
> * It invents a new "language", instead of using XPath, and is 
> therefore limited, comparatively.
> * The result is not expressed with the XPath Data Model and 
> therefore one cannot use it as such; for example, transform 
> it with a stylesheet.
> I'd like to air the idea of another approach to inspecting 
> the file system:
> An absolute URI would be passed to fn:collection(). It would 
> always be the same regardless of which files are being queried, 
> just like a namespace.
> fn:collection() would in turn return a node that represents 
> the root of the file system. In the case of MS Windows 
> platforms (and other platforms), the root node would be 
> virtual, containing drives as children.
> The returned node would mirror the file system, where each 
> node representing a file would have attributes such as 
> mimeType, fileSize, absolutePath, and so forth. Since it's a 
> plain XDM node, the user has strong expressiveness with XPath.
> There are certain design issues with this, such as what the 
> XML format would look like. This, for instance, is very friendly 
> from a query-writing perspective:
> declare variable $fs := fn:collection("http://fs-xquery.fs.net/");
> $fs/home/frans/xmlExamples//*[@mimeType eq 'application/xml']
> However, since this uses dynamic element names, it's tricky to 
> express with a Schema and considered by many to be bad 
> design (which I would agree with, but I do think it makes 
> query writing elegant).
> The alternative is rather messy for query writing:
> $fs/directory[@name = "home"]
>    /directory[@name = "frans"]
>    /directory[@name = "xmlExamples"]
>    //*[@mimeType eq 'application/xml']
> Both alternatives are equally horrible, each in its own way. 
> Is there a third alternative? Can they be combined? Is any 
> alternative acceptable?
> Many parts of this would be implementation-defined (such as 
> MIME type detection and pretty much everything else). One 
> issue is node stability, especially when put in relation to 
> changes to the file system.
> Such a "mini"-spec could have two levels of conformance: one 
> for statically typed implementations and one for untyped ones. 
> For typed implementations, fn:collection() would have a more 
> specific return type, instead of "element()".
> The XML format would be in a namespace. Should that namespace 
> equal the collection URI? That's simple, but maybe there's 
> some issue that I don't see.
> Is this idea overkill? Useful? Doable? If it's possible, I 
> think it would be an elegant use of the XPath Data Model's 
> abstraction to the underlying representation, and an 
> interoperable mechanism to query the filesystem. I believe it 
> would render many small scripts and Makefiles redundant.
> If it's possible, I'd consider writing up a draft for this, but 
> some initial input would be appreciated!
> 		Frans
> 1. http://www.w3.org/TR/xproc/
> 2. http://www.saxonica.com/documentation/sourcedocs/collections.html
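
For readers following the thread, the two candidate formats discussed in the quoted message can be compared side by side with a small self-contained query. This is only a sketch under stated assumptions: no processor is known to return such a tree from fn:collection(), so the tree is built inline here, and the wrapper element <fs>, the <file> element, the sample file names, and the attribute values are all hypothetical. Only the two path expressions come from the original message.

```xquery
xquery version "1.0";

(: Hypothetical stand-ins for the node that fn:collection() might return.
   All element names, file names, and attribute values are illustrative. :)

(: Format A: directory and file names are used as element names. :)
declare variable $fsA :=
  <fs>
    <home>
      <frans>
        <xmlExamples>
          <index.xml mimeType="application/xml" fileSize="120"/>
          <notes.txt mimeType="text/plain" fileSize="48"/>
        </xmlExamples>
      </frans>
    </home>
  </fs>;

(: Format B: fixed element names, with the name carried in an attribute. :)
declare variable $fsB :=
  <fs>
    <directory name="home">
      <directory name="frans">
        <directory name="xmlExamples">
          <file name="index.xml" mimeType="application/xml" fileSize="120"/>
          <file name="notes.txt" mimeType="text/plain" fileSize="48"/>
        </directory>
      </directory>
    </directory>
  </fs>;

(: Run the query from the message against each format. :)
<results>
  <dynamic>{
    $fsA/home/frans/xmlExamples//*[@mimeType eq 'application/xml']
  }</dynamic>
  <generic>{
    $fsB/directory[@name = "home"]
        /directory[@name = "frans"]
        /directory[@name = "xmlExamples"]
        //*[@mimeType eq 'application/xml']
  }</generic>
</results>
```

Both expressions select only the entry for index.xml. Writing the data out also exposes a practical limit of format A beyond the Schema concern: a file name that is not a valid XML name (one starting with a digit or containing a space, for example) cannot be used as an element name, whereas format B can hold any name in its name attribute.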
