[xquery-talk] What tool is better for constructing an XQuery parser?

Per Bothner per at bothner.com
Sun Jan 16 23:23:36 PST 2005


Zhimao Guo wrote:

> JavaCC, or Java-CUP, or something else?

My general preference has been hand-written recursive-descent parsers.
I'm not alone in this: The GCC project (which knows a thing or two about
compilers) has rewritten various parsers from Bison/Yacc to recursive
descent.  There are some advantages:
* Flexibility: Real-world languages, including XQuery, have tricky
context-dependencies or interactions between the lexer and parser.  So
the nice syntax you see in the language specification will in practice
turn out to be a lot more complicated.  E.g. "for" can be a path
expression (a step naming an element called "for"), or the start of a
FLWOR expression.  Likewise "declare" is either a path expression or the
start of a prolog declaration, depending on what follows it.
* Speed: This is important for GCC, though not necessarily for all parsers.
* Understandability/debuggability: I can look at the parser, and a 
debugger can step through it.  The debugger stack trace is meaningful. 
If there is a bug, I can track it down without going through strange 
translations.  There is no extra mysterious layer of buffering making it 
harder to synchronize the lexer and parser.
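
To make the "for"/"declare" point concrete, here is a minimal sketch of
the kind of lookahead a hand-written parser can do.  It is only an
illustration: the class and method names (KeywordLookahead, tokenize,
classify) are made up for this example, the tokenizer ignores strings,
comments and prefixed QNames, and the list of words accepted after
"declare" is a small assumed subset of what the XQuery prolog allows.

```java
import java.util.ArrayList;
import java.util.List;

public class KeywordLookahead {
    /** Splits input into names and single-character tokens; whitespace is skipped. */
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < src.length()
                        && (Character.isLetterOrDigit(src.charAt(i)) || src.charAt(i) == '-')) {
                    i++;
                }
                out.add(src.substring(start, i));
            } else {
                out.add(String.valueOf(c));
                i++;
            }
        }
        return out;
    }

    /** Decides what the first word means by peeking at the token after it. */
    static String classify(String src) {
        List<String> tokens = tokenize(src);
        if (tokens.isEmpty()) {
            return "empty";
        }
        String first = tokens.get(0);
        String next = tokens.size() > 1 ? tokens.get(1) : "";
        // "for" only starts a FLWOR when a variable ("$...") follows it.
        if (first.equals("for") && next.equals("$")) {
            return "flwor";
        }
        // Assumed subset of the words that can follow "declare" in a prolog.
        if (first.equals("declare")
                && (next.equals("namespace") || next.equals("function")
                    || next.equals("variable"))) {
            return "declaration";
        }
        // Otherwise the word is an ordinary name, i.e. a path expression step.
        return "path";
    }

    public static void main(String[] args) {
        System.out.println(classify("for $x in (1, 2) return $x"));
        System.out.println(classify("for/title"));
        System.out.println(classify("declare namespace x = \"urn:x\";"));
        System.out.println(classify("declare/child"));
    }
}
```

In a hand-written parser this decision is an ordinary "if" at the point
where it matters, which is why no separate lexer-state machinery is
needed for it.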

Modern tools like JavaCC and ANTLR generate top-down parsers, as
opposed to the bottom-up parsers created by Yacc/Bison.  This does
make a big improvement in the understandability/debuggability aspect,
and provides some improvement with the flexibility aspect.
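
The reason top-down parsers are easier to follow is that each grammar
rule becomes one method, so the call stack mirrors the grammar.  The
sketch below shows that shape for a toy arithmetic grammar (not XQuery).
It is hand-written, not the output of any tool, and the class name
TinyDescent and its methods are made up for this example.

```java
public class TinyDescent {
    private final String src;
    private int pos = 0;

    TinyDescent(String src) {
        this.src = src;
    }

    /** Parses and evaluates the whole input; rejects trailing garbage. */
    static int evaluate(String src) {
        TinyDescent p = new TinyDescent(src);
        int value = p.expr();
        p.skipSpaces();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return value;
    }

    // expr := term ('+' term)*
    int expr() {
        int v = term();
        while (peek() == '+') {
            pos++;
            v += term();
        }
        return v;
    }

    // term := factor ('*' factor)*
    int term() {
        int v = factor();
        while (peek() == '*') {
            pos++;
            v *= factor();
        }
        return v;
    }

    // factor := NUMBER | '(' expr ')'
    int factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            int v = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at " + pos);
            }
            pos++;
            return v;
        }
        if (!Character.isDigit(c)) {
            throw new IllegalArgumentException("expected number at " + pos);
        }
        int v = 0;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            v = v * 10 + (src.charAt(pos) - '0');
            pos++;
        }
        return v;
    }

    /** Skips whitespace and returns the next character, or '\0' at end of input. */
    char peek() {
        skipSpaces();
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }
}
```

If factor() throws on bad input, the stack trace reads factor, term,
expr, evaluate, which is exactly the path through the grammar.  A
table-driven bottom-up parser has no such correspondence.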
-- 
	--Per Bothner
per at bothner.com   http://per.bothner.com/

