[xquery-talk] XQuery static typing algorithms?

Michael Kay mike at saxonica.com
Mon Nov 24 17:01:23 PST 2008


There's a useful paper by Henry Thompson:

http://www.ltg.ed.ac.uk/~ht/XML_Europe_2003.html

This algorithm is implemented very directly in the Saxon schema processor.
The hardest part is where he refers to Aho & Ullmann for determinizing the
FSA; the A&U algorithm requires very careful reading.

He also published a refinement to handle content models with numeric limits
other than 1 and infinity more efficiently:

http://xtech06.usefulinc.com/schedule/paper/118

I found that one tough to implement directly because he hand-waves about the
rules for determinizing the FSA, which seems to be the tough part. I ended
up doing my own thing with counters - which is highly dangerous in this
space, there is a tendency to find a fundamental flaw in your algorithm
after 3 years exposure in the field.

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: talk-bounces at x-query.com 
> [mailto:talk-bounces at x-query.com] On Behalf Of Per Bothner
> Sent: 24 November 2008 15:30
> To: talk at x-query.com
> Subject: Re: [xquery-talk] XQuery static typing algorithms?
> 
> Thanks to both Jens Teubner and Bas de Bakker for your suggestions!
> 
> Jens Teubner wrote:
> > My personal impression is that, if you want to be 
> standards-compliant, 
> > the best way to start is to literally implement all the 
> judgments in 
> > the W3C Formal Semantics.
> 
> In general that is good advice, but when it comes to 
> subtyping, then the Formal Semantics doesn't provide 
> judgements that are implementable, as far as I can tell:
> 
> 8.3.2 Subtyping (<:)
> 
> ...
> 
> Note
> 
> The above definition, although complete and precise, does not 
> give a simple means to compute subtyping. Notably the 
> definition above refers to values, which are not available at 
> static type checking time.
> 
> The structural component of the [XPath/XQuery] type system 
> can be modeled by regular expressions. Regular expressions 
> can be implemented by means of finite state automata. 
> Computing subtyping between two types can then be done by 
> computing if inclusion holds between their corresponding 
> finite state automata.
> 
> Finite state automata and how to compute operations on those 
> automata, such as inclusion, emptiness or intersection, have 
> been extensively studied and documented in the literature. 
> The interested reader can consult the relevant literature on 
> tree grammars, for instance [Languages], or [TATA].
> -- 
> 	--Per Bothner
> per at bothner.com   http://per.bothner.com/
> _______________________________________________
> talk at x-query.com
> http://x-query.com/mailman/listinfo/talk



More information about the talk mailing list