[xquery-talk] sorting problem

G. Ken Holman gkholman at CraneSoftwrights.com
Tue May 24 09:29:02 PDT 2011


At 2011-05-24 15:57 +0200, Torsten Schassan wrote:
>Hi
>...
>considered strings like these:
>
>11 Aug. 2°
>10 Aug. 4°
>3.1.1 Aug. 2°
>A Aug. 2°
>1 Weiss.
>
>
>How would I sort them using XQuery according to this succession
>
>- letters in the middle first (Aug etc)
>- the sizes (2° etc)
>- "front" letters before numbers
>- numbers according to their position before the dots
>
>...
>
>Sorting result for the examples above would then be:
>
>A Aug. 2°
>3.1.1 Aug. 2°
>11 Aug. 2°
>10 Aug. 4°
>1 Weiss.
>
>
>
>Would that be done best with tokenize() or matches()?

Of the two, matches() returns only true/false, so only
tokenize() would give you values to work with, but I
don't think that will help you.
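
To illustrate the difference, here is a small sketch using one of your sample strings. The two patterns are only illustrative assumptions, not something your data requires:

```xquery
(: matches() only answers yes/no: does the string start with a digit? :)
matches( '3.1.1 Aug. 2°', '^[0-9]' ),   (: true :)

(: tokenize() returns the pieces found between the separators :)
tokenize( '3.1.1 Aug. 2°', '\s+' )      (: '3.1.1', 'Aug.', '2°' :)
```

So tokenize() can split a string into its fields, but it does not by itself put those fields into a sortable order.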

The only way I could do it was to rearrange the content
into the desired order with replace(), and then to take
advantage of the Saxon-specific "alphanumeric" collation,
which compares runs of digits as numbers, to handle the
field with the numbers and dots.  I think it would take
quite a bit of code to meet that alphanumeric requirement
without a custom collation.

I hope the running example below helps; its output matches the order you asked for.

. . . . . . . . . . Ken

~/t/ftemp $ sh t.sh
+ cat torsten.xquery
declare variable $s := (
'11 Aug. 2°',
'10 Aug. 4°',
'3.1.1 Aug. 2°',
'A Aug. 2°',
'1 Weiss.' );

"
===Input:
",
for $each in $s
    return ( $each,"
" )
,"===Result:
",
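for $each in $s
   (: sort key: middle word ($4), size ($5), then front letters ($2)
      or front numbers ($3); with "ZZZ" in between, a letter front
      such as A sorts ahead of any numeric front :)
   order by replace( $each,
      "(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
      "1 $4 $5 $2ZZZ$3" )
   collation "http://saxon.sf.net/collation?alphanumeric=yes"
   return ( $each, "
" )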


+ xquery torsten.xquery
<?xml version="1.0" encoding="UTF-8"?>
===Input:
  11 Aug. 2°
  10 Aug. 4°
  3.1.1 Aug. 2°
  A Aug. 2°
  1 Weiss.
  ===Result:
  A Aug. 2°
  3.1.1 Aug. 2°
  11 Aug. 2°
  10 Aug. 4°
  1 Weiss.
+ cat torsten-explain.xquery
declare variable $s := (
'11 Aug. 2°',
'10 Aug. 4°',
'3.1.1 Aug. 2°',
'A Aug. 2°',
'1 Weiss.' );

"
===Input:
",
for $each in $s
    return ( $each,"
" )
,"===Parts:
",
let $srq :=
for $each in $s
     return replace( $each, 
"(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
                             "1 '$4' '$5' '$2'ZZZ'$3'" )
let $sr :=
for $each in $s
     return replace( $each, 
"(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
                             "1 $4 $5 $2ZZZ$3" )
return (
for $each in $srq return ( $each, "
" ),'==(no quotes)
',
for $each in $sr return ( $each, "
" )
,"===Parts ordered:
",
for $each in $sr order by $each return ( $each, "
" )
,"===Parts ordered alphanumeric:
",
for $each in $sr order by $each
     collation "http://saxon.sf.net/collation?alphanumeric=yes"
     return ( $each, "
" )
)
,"===All you need for your result:
",
for $each in $s
   order by replace( $each, 
"(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
                             "1 $4 $5 $2ZZZ$3" )
   collation "http://saxon.sf.net/collation?alphanumeric=yes"
   return ( $each, "
" )


+ xquery torsten-explain.xquery
<?xml version="1.0" encoding="UTF-8"?>
===Input:
  11 Aug. 2°
  10 Aug. 4°
  3.1.1 Aug. 2°
  A Aug. 2°
  1 Weiss.
  ===Parts:
  1 'Aug.' '2°' ''ZZZ'11'
  1 'Aug.' '4°' ''ZZZ'10'
  1 'Aug.' '2°' ''ZZZ'3.1.1'
  1 'Aug.' '2°' 'A'ZZZ''
  1 'Weiss.' '' ''ZZZ'1'
  ==(no quotes)
  1 Aug. 2° ZZZ11
  1 Aug. 4° ZZZ10
  1 Aug. 2° ZZZ3.1.1
  1 Aug. 2° AZZZ
  1 Weiss.  ZZZ1
  ===Parts ordered:
  1 Aug. 2° AZZZ
  1 Aug. 2° ZZZ11
  1 Aug. 2° ZZZ3.1.1
  1 Aug. 4° ZZZ10
  1 Weiss.  ZZZ1
  ===Parts ordered alphanumeric:
  1 Aug. 2° AZZZ
  1 Aug. 2° ZZZ3.1.1
  1 Aug. 2° ZZZ11
  1 Aug. 4° ZZZ10
  1 Weiss.  ZZZ1
  ===All you need for your result:
  A Aug. 2°
  3.1.1 Aug. 2°
  11 Aug. 2°
  10 Aug. 4°
  1 Weiss.
~/t/ftemp $
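
For what it's worth, here is a minimal sketch of how the numeric ordering might be handled without the Saxon collation, using only standard XQuery 1.0 functions. It is not what the scripts above do: local:pad() is a hypothetical helper, the width of six digits is an arbitrary assumption, and it uses separate order-by keys in place of the single rearranged string.

```xquery
declare variable $s := (
  '11 Aug. 2°', '10 Aug. 4°', '3.1.1 Aug. 2°', 'A Aug. 2°', '1 Weiss.' );

(: Hypothetical helper: left-pad each dot-separated number to six
   digits so that plain string comparison puts 3.1.1 before 11.
   Numbers longer than six digits are left unpadded. :)
declare function local:pad( $front as xs:string ) as xs:string
{
  string-join(
    for $n in tokenize( $front, '\.' )
    return concat( substring( '000000', 1, 6 - string-length( $n ) ), $n ),
    '.' )
};

for $each in $s
let $t := tokenize( $each, '\s+' )   (: front, middle, size :)
let $front := $t[1]
order by
  string( $t[2] ),                   (: letters in the middle :)
  string( $t[3] ),                   (: size; '' when absent :)
  matches( $front, '^[0-9]' ),       (: false (letters) before true (numbers) :)
  if ( matches( $front, '^[0-9]' ) )
    then local:pad( $front )
    else $front
return $each
```

Assuming the default collation is the Unicode codepoint collation, the five sample strings should come out in the order requested. Separate keys also avoid the "ZZZ" separator, at the cost of assuming the fields are always separated by whitespace.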

--
Contact us for world-wide XML consulting & instructor-led training
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/q/
G. Ken Holman                 mailto:gkholman at CraneSoftwrights.com
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal



