[xquery-talk] sorting problem
G. Ken Holman
gkholman at CraneSoftwrights.com
Tue May 24 09:29:02 PDT 2011
At 2011-05-24 15:57 +0200, Torsten Schassan wrote:
>Hi
>...
>considered strings like these:
>
>11 Aug. 2°
>10 Aug. 4°
>3.1.1 Aug. 2°
>A Aug. 2°
>1 Weiss.
>
>
>How would I sort them using XQuery according to this succession
>
>- letters in the middle first (Aug etc)
>- the sizes (2° etc)
>- "front" letters before numbers
>- numbers according to their position before the dots
>
>...
>
>Sorting result for the examples above would then be:
>
>A Aug. 2°
>3.1.1 Aug. 2°
>11 Aug. 2°
>10 Aug. 4°
>1 Weiss.
>
>
>
>Would that be done best with tokenize() or matches()?
Between the two, matches() returns only true/false, so only
tokenize() would return meaningful values, but I don't think
that will help you.

The only way I could do it was to rearrange the content into
the desired order using replace(), and then take advantage of
the Saxon-specific "alphanumeric" collation to handle the field
with the numbers and dots, so that 3.1.1 sorts before 11.
Without a custom collation, I think it would take quite a bit
of code to meet that alphanumeric requirement.

I hope the running example below helps, as it matches your
requirement.
. . . . . . . . . . Ken
~/t/ftemp $ sh t.sh
+ cat torsten.xquery
declare variable $s := (
'11 Aug. 2°',
'10 Aug. 4°',
'3.1.1 Aug. 2°',
'A Aug. 2°',
'1 Weiss.' );
"
===Input:
",
for $each in $s
return ( $each,"
" )
,"===Result:
",
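for $each in $s
(: sort key: constant "1", middle word, size, then either the
   front letters followed by ZZZ, or ZZZ followed by the front numbers :)
order by replace( $each,
    "(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
    "1 $4 $5 $2ZZZ$3" )
  collation "http://saxon.sf.net/collation?alphanumeric=yes"
return ( $each, "
" )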
+ xquery torsten.xquery
<?xml version="1.0" encoding="UTF-8"?>
===Input:
11 Aug. 2°
10 Aug. 4°
3.1.1 Aug. 2°
A Aug. 2°
1 Weiss.
===Result:
A Aug. 2°
3.1.1 Aug. 2°
11 Aug. 2°
10 Aug. 4°
1 Weiss.
+ cat torsten-explain.xquery
declare variable $s := (
'11 Aug. 2°',
'10 Aug. 4°',
'3.1.1 Aug. 2°',
'A Aug. 2°',
'1 Weiss.' );
"
===Input:
",
for $each in $s
return ( $each,"
" )
,"===Parts:
",
let $srq :=
for $each in $s
return replace( $each,
"(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
"1 '$4' '$5' '$2'ZZZ'$3'" )
let $sr :=
for $each in $s
return replace( $each,
"(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
"1 $4 $5 $2ZZZ$3" )
return (
for $each in $srq return ( $each, "
" ),'==(no quotes)
',
for $each in $sr return ( $each, "
" )
,"===Parts ordered:
",
for $each in $sr order by $each return ( $each, "
" )
,"===Parts ordered alphanumeric:
",
for $each in $sr order by $each
collation "http://saxon.sf.net/collation?alphanumeric=yes"
return ( $each, "
" )
)
,"===All you need for your result:
",
for $each in $s
(: same sort key as shown under "Parts" above, compared with
   Saxon's alphanumeric collation :)
order by replace( $each,
    "(([A-Za-z]+)|([0123456789.]+))\s+(\S+)\s*([^\s]*)",
    "1 $4 $5 $2ZZZ$3" )
  collation "http://saxon.sf.net/collation?alphanumeric=yes"
return ( $each, "
" )
+ xquery torsten-explain.xquery
<?xml version="1.0" encoding="UTF-8"?>
===Input:
11 Aug. 2°
10 Aug. 4°
3.1.1 Aug. 2°
A Aug. 2°
1 Weiss.
===Parts:
1 'Aug.' '2°' ''ZZZ'11'
1 'Aug.' '4°' ''ZZZ'10'
1 'Aug.' '2°' ''ZZZ'3.1.1'
1 'Aug.' '2°' 'A'ZZZ''
1 'Weiss.' '' ''ZZZ'1'
==(no quotes)
1 Aug. 2° ZZZ11
1 Aug. 4° ZZZ10
1 Aug. 2° ZZZ3.1.1
1 Aug. 2° AZZZ
1 Weiss. ZZZ1
===Parts ordered:
1 Aug. 2° AZZZ
1 Aug. 2° ZZZ11
1 Aug. 2° ZZZ3.1.1
1 Aug. 4° ZZZ10
1 Weiss. ZZZ1
===Parts ordered alphanumeric:
1 Aug. 2° AZZZ
1 Aug. 2° ZZZ3.1.1
1 Aug. 2° ZZZ11
1 Aug. 4° ZZZ10
1 Weiss. ZZZ1
===All you need for your result:
A Aug. 2°
3.1.1 Aug. 2°
11 Aug. 2°
10 Aug. 4°
1 Weiss.
~/t/ftemp $
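For a processor without Saxon's alphanumeric collation, one possible
portable variant is sketched below. It is not part of the run above.
It assumes the strings always have the shape "front middle [size]",
that the numeric parts have at most ten digits, and that the default
codepoint collation is acceptable for the letters. The function name
local:pad-numbers and the variable $f are made up for this sketch.

```xquery
(: Hypothetical helper: left-pad every all-digit part of a dotted
   field with zeros to ten digits, so that plain string comparison
   orders the parts numerically (3.1.1 before 11). :)
declare function local:pad-numbers( $field as xs:string ) as xs:string
{
  string-join(
    ( for $part in tokenize( $field, "\." )
      return
        if ( matches( $part, "^[0-9]+$" ) )
        then concat( substring( "0000000000",
                                string-length( $part ) + 1 ),
                     $part )
        else $part ),
    "." )
};

declare variable $s := (
  '11 Aug. 2°',
  '10 Aug. 4°',
  '3.1.1 Aug. 2°',
  'A Aug. 2°',
  '1 Weiss.' );

for $each in $s
(: assumption: fields are separated by whitespace :)
let $f := tokenize( $each, "\s+" )
order by $f[2],                          (: middle letters first :)
         string( $f[3] ),                (: then the size, "" if absent :)
         if ( matches( $f[1], "^[A-Za-z]+$" ) )
           then 0 else 1,                (: front letters before numbers :)
         local:pad-numbers( $f[1] )      (: numbers by dotted position :)
return ( $each, "&#10;" )
```

Using four separate sort keys avoids the ZZZ marker, at the cost of
the padding function. For the five sample strings this should give
the same order as the Saxon run above.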
--
Contact us for world-wide XML consulting & instructor-led training
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/q/
G. Ken Holman mailto:gkholman at CraneSoftwrights.com
Legal business disclaimers: http://www.CraneSoftwrights.com/legal