[xquery-talk] Doing some Pattern Frequency Distribution
martin at x-hive.com
Thu Jun 8 21:53:03 PDT 2006
> I have a requirement to get Count, Blank Count, Max Length, Frequency
> Distribution and Pattern Frequency Distribution on some of the
> in an XML which can go up to a size of 5GB.
I would recommend going with XQuery and a real XML database for those
sizes - I'm not much of an expert on XSLT processors, but I doubt
you'll get good results with XSLT on datasets of 5 GB. XML databases
typically do not need to hold the full XML tree in memory, which
makes many operations on huge files possible.
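
As an illustration of computing the requested figures without keeping the
whole tree in memory, here is a minimal sketch in Python. It is not the
XQuery solution referred to in this thread and says nothing about how any
particular XML database works internally. The element name `zip`, the
sample data, the helper names `to_pattern` and `profile`, and the pattern
convention (letters become "A", digits become "9") are all assumptions
made for the example.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def to_pattern(value):
    # Assumed convention: letters -> "A", digits -> "9",
    # every other character is kept unchanged.
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch
        for ch in value
    )


def profile(source, tag):
    """Stream `source` and profile the text of every <tag> element."""
    count = blank_count = max_length = 0
    frequency = Counter()
    pattern_frequency = Counter()
    # iterparse reports each element as soon as its end tag has been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            text = elem.text or ""
            count += 1
            max_length = max(max_length, len(text))
            if not text.strip():
                blank_count += 1
            else:
                frequency[text] += 1
                pattern_frequency[to_pattern(text)] += 1
        # Drop the element's text and children once it has been seen.
        # The emptied element itself stays attached to its parent.
        elem.clear()
    return {
        "count": count,
        "blank_count": blank_count,
        "max_length": max_length,
        "frequency": frequency,
        "pattern_frequency": pattern_frequency,
    }


# Hypothetical sample data; a real run would pass a file name instead.
SAMPLE = (
    b"<rows>"
    b"<row><zip>12345</zip></row>"
    b"<row><zip>AB 123</zip></row>"
    b"<row><zip></zip></row>"
    b"<row><zip>12345</zip></row>"
    b"<row><zip>90210</zip></row>"
    b"</rows>"
)
result = profile(io.BytesIO(SAMPLE), "zip")
```

Note that the two counters still grow with the number of distinct values
and patterns, so a column with mostly unique values would need a different
approach for the frequency distribution.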
> With my initial reading on
> XSLT and XQuery I felt XQuery is a best candidate for this. As you
> suggested using XSLT for "Pattern Frequency Distribution (PFD)" I need
> to change the whole solution from XQuery to XSLT.
See my email about solving that in XQuery.