[xquery-talk] Nux-1.3 released
whoschek at lbl.gov
Wed Aug 3 12:33:51 PDT 2005
The Nux-1.3 release has been uploaded to
Nux is an open-source Java toolkit making efficient and powerful XML
• Upgraded to saxonb-8.5 (saxon-8.4 and 8.3 should continue
to work as well).
• Upgraded to xom-1.1-rc1 (with compatible performance
patches). Plain xom-1.0 should continue to work as well, albeit less
• Numerous bnux Binary XML performance enhancements for
serialization and deserialization (UTF-8 character encoding, buffer
management, symbol table, pack sorting, cache locality, etc.).
Overall, bnux is now about twice as fast, and, perhaps more
importantly, has a much more uniform performance profile, no matter
what kind of document flavour is thrown at it. It routinely delivers
50-100 MB/sec deserialization performance, and 30-70 MB/sec
serialization performance (on a 2004 commodity PC). It is roughly 5-10
times faster than xom-1.1 with xerces-2.7.1 (which, in turn, is
faster than saxonb-8.5, dom4j-1.6.1 and xerces-2.7.1 DOM). Further,
preliminary measurements indicate bnux deserialization and
serialization to be consistently 2-3 times faster than Sun's
FastInfoSet implementation, using XOM. Saxon's PTree could not be
tested as it is only available in the commercial version. The only
remaining area with substantial potential for performance improvement
seems to be complex namespace handling. This might be addressed by
slightly restructuring private XOM internals in a future version.
• BinaryXMLTest now also has command line support for testing
and benchmarking Saxon, DOM and FastInfoSet (besides bnux and XOM).
• Rewrote XQueryCommand. The new nux/bin/fire-xquery is a
more powerful, flexible and reliable command line test tool that runs
a given XQuery against a set of files and prints the result sequence.
In addition, it supports schema validation, XInclude (via XOM), an
XQuery update facility, malformed HTML parsing (via TagSoup) and much
more. It's available for Unix and Windows, and works like any other
decent Unix command line tool.
• Removed ValidationCommand (made obsolete by the fire-xquery
• Added experimental XQuery in-place update functionality.
Comments on the usefulness of the current behaviour are especially
welcome, as are suggestions for potential improvements.
• Added nux.xom.xquery.ResultSequenceSerializer, which
serializes an XQuery/XPath2 result sequence onto a given output
stream, using various configurable serialization options such as
encoding and indentation. Implements the W3C XQuery/XSLT2
Serialization Draft Spec. Also implements an alternative wrapping
algorithm that ensures that any arbitrary result sequence can always
be output as a well-formed XML document.
• Added XQueryFactory.createXQuery(File file, URI baseURI)
and XQueryPool.getXQuery(File file, URI baseURI) to allow for
separation of the location of the query file and input XML files.
• The default XQuery DocumentURIResolver now recognizes the
".bnux" file extension as binary XML, and parses it accordingly. For
example, a query can be 'doc("samples/data/articles.xml.bnux")/
• Added FileUtil.listFiles(). Returns the URIs of all files
whose path matches at least one of the given inclusion wildcard or
regular expressions but none of the given exclusion wildcard or
regular expressions; starting from the given directory, optionally
with recursive directory traversal, insensitive to underlying
operating system conventions.
• XOMUtil.Normalizer now uses the XML definition of whitespace
rather than the Java definition.
• Added XOMUtil.Normalizer.STRIP, which removes Text nodes that
consist only of whitespace (boundary whitespace), retaining other
• Added AnalyzerUtil.getPorterStemmerAnalyzer() for English
language stemming in full-text search.
• Added XOMUtil.toDocument(String xml) convenience method to
parse a string.
• Moved XOMUtil.toByteArray() and XOMUtil.toString() into
class FileUtil. The old methods remain available but have been
• Added "jar-bnux" ant target to optionally build a minimal
jar file (20 KB) for binary XML only.
• Added more test documents to samples/data directory.
• Updated license blurbs to 2005.
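
To illustrate the XOMUtil.Normalizer whitespace change listed above: XML 1.0 treats only space, tab, carriage return and line feed as whitespace, whereas Java's Character.isWhitespace() also accepts characters such as form feed. The following is a minimal standalone sketch of that difference using only the JDK. The class and method names are made up for illustration and are not part of the Nux API, and this is not necessarily how Nux implements the check.

```java
public class XmlWhitespaceDemo {

    // XML 1.0 whitespace (production S): space, tab, carriage return, line feed.
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // True if every character of s is XML whitespace.
    static boolean isXmlWhitespaceOnly(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!isXmlWhitespace(s.charAt(i))) return false;
        }
        return true;
    }

    // True if every character of s is whitespace by Java's broader definition.
    static boolean isJavaWhitespaceOnly(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String plain = " \t\r\n";   // whitespace under both definitions
        String formFeed = "\f";     // whitespace in Java, but not in XML

        System.out.println(isXmlWhitespaceOnly(plain));     // true
        System.out.println(isJavaWhitespaceOnly(plain));    // true
        System.out.println(isXmlWhitespaceOnly(formFeed));  // false
        System.out.println(isJavaWhitespaceOnly(formFeed)); // true
    }
}
```

Under the XML definition, a text node containing only a form feed would therefore not count as whitespace-only, while a Java-based check would have treated it as such.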