<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">

<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<title>RE: [xquery-talk] Count of Distinct elements performance problem</title>
<style>
<!--
 /* Font Definitions */
 @font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {color:blue;
        text-decoration:underline;}
p
        {mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman";}
pre
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:Arial;
        color:navy;}
@page Section1
        {size:8.5in 11.0in;
        margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
        {page:Section1;}
-->
</style>

</head>

<body lang=EN-US link=blue vlink=blue>

<div class=Section1>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Sorry I forgot to mention the environment.<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>XQuery Engine:&nbsp;&nbsp; Data Direct<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>OS:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;  Linux 32 bit<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>JVM:&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;1.5.0<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Are there any other products in the market
which can handle this load? <o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>I tried with Saxon and it goes out of
memory for even 1GB file on 32 bit machine (1.8GB memory limitation) which does
make sense as you said previously Saxon needs a memory at least 3 times the
size of input XML.<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Thanks,<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Srinivas<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p>&nbsp;</o:p></span></font></p>

<div>

<div class=MsoNormal align=center style='text-align:center'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>

<hr size=2 width="100%" align=center tabindex=-1>

</span></font></div>

<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2
face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'> Michael Kay
[mailto:mhk@mhk.me.uk] <br>
<b><span style='font-weight:bold'>Sent:</span></b> Wednesday, August 23, 2006
3:55 PM<br>
<b><span style='font-weight:bold'>To:</span></b> Kusunam, Srinivas;
talk@xquery.com<br>
<b><span style='font-weight:bold'>Subject:</span></b> RE: [xquery-talk] Count
of Distinct elements performance problem</span></font><o:p></o:p></p>

</div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p>&nbsp;</o:p></span></font></p>

<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:blue'>Did you tell us what product you are
using?</span></font><o:p></o:p></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>&nbsp;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:blue'>As I mentioned, the recursive code depends
heavily on optimization in the processor - as indeed does everything, with the
kind of data volumes you are dealing with.</span></font><o:p></o:p></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>&nbsp;<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:blue'>Michael Kay</span></font><o:p></o:p></p>

<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:blue'>http://www.saxonica.com/</span></font><o:p></o:p></p>

</div>

</body>

</html>

<pre>*****************************************************************
This message has originated from RLPTechnologies,
26955 Northwestern Highway, Southfield, MI 48033.

RLPTechnologies sends various types of email
communications.  If this email message concerns the
potential licensing of an RLPT product or service, and
you do not wish to receive further emails regarding Polk
products, forward this email to Do_Not_Send@rlpt.com
with the word "remove" in the subject line.

The email and any files transmitted with it are confidential
and intended solely for the individual or entity to whom they
are addressed.

If you have received this email in error, please delete this
message and notify the Polk System Administrator at
postmaster@rlpt.com.
*****************************************************************