[xquery-talk] Doing some Pattern Frequency Distribution

andrew welch andrew.j.welch at gmail.com
Wed Jun 7 22:03:01 PDT 2006


On 6/7/06, Kusunam, Srinivas <SKusunam at rlpt.com> wrote:
> I have a requirement to find out the pattern frequency distribution of
> some of the elements say Phone number.
>
> Here is the example
>
> <DOC>
>         <ELEMENT>
>                 <PHONE>123-456-7890 </PHONE>
>         </ELEMENT>
>         <ELEMENT>
>                 <PHONE>123-456-7899 </PHONE>
>         </ELEMENT>
>         <ELEMENT>
>                 <PHONE>123.456.7890 </PHONE>
>         </ELEMENT>
>         <ELEMENT>
>                 <PHONE>(123)456-7890 </PHONE>
>         </ELEMENT>
> </DOC>
>
> Output should be something like this:
>         Pattern: 999-999-9999   count:2
>         Pattern: 999.999.9999   count:1
>         Pattern: (999)999-9999  count:1
>
> Is it possible to achieve this using XQuery? If yes how do we do this?
> Any pointers or suggestions are welcome.

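This can be achieved in XQuery, but the grouping facility of XSLT 2.0
makes it easier. The idea is to turn each phone number into its pattern
by mapping every digit to 9 with translate(), then group on the result: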

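<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit plain text rather than XML -->
  <xsl:output method="text"/>

  <xsl:variable name="zeroToNine" select="'0123456789'"/>

  <xsl:template match="/">
    <!-- Map every digit to 9 to get the pattern, then group on it.
         normalize-space() drops the trailing space in the sample data. -->
    <xsl:for-each-group select="//PHONE"
        group-by="translate(normalize-space(.), $zeroToNine, '9999999999')">
      <xsl:value-of select="concat('Pattern: ', current-grouping-key(),
                                   ' count: ', count(current-group()), '&#xa;')"/>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>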
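
For comparison, the same translate() trick also works in plain XQuery
1.0, with distinct-values() standing in for the grouping. This is only a
minimal sketch: it assumes the sample document is stored in a file called
phones.xml (a made-up name), and it adds normalize-space() to drop the
trailing space in the sample PHONE values.

```xquery
(: Sketch only: "phones.xml" is a hypothetical file name for the sample document. :)
let $patterns :=
  for $p in doc("phones.xml")//PHONE
  (: map each digit to 9; normalize-space() drops stray whitespace :)
  return translate(normalize-space($p), "0123456789", "9999999999")
(: one result string per distinct pattern, with its number of occurrences :)
for $pattern in distinct-values($patterns)
return concat("Pattern: ", $pattern,
              " count: ", count($patterns[. = $pattern]))
```

The order in which distinct-values() returns the patterns is
implementation-dependent, so add an "order by" clause if a fixed order
matters. The query rescans $patterns once per distinct pattern, which is
fine for small inputs but is the part that XSLT grouping handles for you.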

cheers
andrew

