[xquery-talk] Purely recursive deduplication of sequence?

Christian Grün christian.gruen at gmail.com
Sat Aug 1 06:21:10 PDT 2020


One more solution with fold-left (it’s faster than the recursive approach):

declare function local:distinct-items($items as item()*) as item()* {
  fold-left($items, (), function($result, $item) {
    $result, $item[empty($result[deep-equal(., $item)])]
  })
};
local:distinct-items(('x', (1 to 1000000) ! 1, 'x'))



On 8/1/20, Christian Grün <christian.gruen at gmail.com> wrote:
> Hi Andreas,
>
> Try this:
>
> declare function local:distinct-items($items as item()*) as item()* {
>   let $h := head($items)
>   where exists($h)
>   let $t := tail($items)[not(deep-equal(., $h))]
>   return ($h, local:distinct-items($t))
> };
>
> If the FLWOR expression is avoided, it may increase the runtime:
>
> declare function local:distinct-items($items as item()*) as item()* {
>   if (empty($items)) then () else (
>     head($items),
>     local:distinct-items(tail($items)[not(deep-equal(., head($items)))])
>   )
> };
>
> Hope this helps,
> Christian
>
>
>
> On 8/1/20, Andreas Mixich <mixich.andreas at gmail.com> wrote:
>> Hello,
>>
>> I wonder, whether it could be possible to deduplicate a sequence by a
>> purely recursive approach,
>> without creating state in any form.
>>
>> What I have, so far, is only capable of removing consecutive dupes, so,
>> nothing fancy.
>>
>>   (:~
>>     Removes all consecutive duplicate items from a sequence, returning
>>     the deduped sequence.
>>
>>     @param   $items  a sequence, that may contain duplicates
>>     @return  a sequence, in which all 'pairs' have been 'singled'
>>     :)
>>   declare function local:dedupe-pairs($items as item()*) as item()* {
>>     if (count($items) <= 1)
>>     then $items
>>     else if (deep-equal(head($items), head(tail($items))))
>>          then local:dedupe-pairs(tail($items))
>>          else (
>>                 head($items)
>>               , local:dedupe-pairs(tail($items))
>>               )
>>   };
>>
>>   (: following sequence has 19 items :)
>>   let $a3 := (
>>                array {"One", "Two", 3}
>>              , array {"One", "Two", 1}
>>              , array {"One", "Two", 3}
>>              , <xml></xml>
>>              , <xml></xml>
>>              , array {"apples","oranges"}
>>              , array {"apples","oranges"}
>>              , <xml></xml>
>>              , array {"One",5,<xml></xml>}
>>              , array {3,7,"Hello World!"}
>>              , array {3,7,"Hello World!"}
>>              , map{'key':'value'}
>>              , false()
>>              , false()
>>              , array {"One","lorem",<xml></xml>,2,"Naomi"}
>>              , array {"One", "Two", 3}
>>              , array {"One","lorem",<xml></xml>,2,"Naomi"}
>>              , array {<xml></xml>,map{'key':'value'},true()}
>>              , false()
>>              )
>>   return local:dedupe-pairs($a3)
>>
>> serializes to:
>>
>>   ["One","Two",3]
>>   ["One","Two",1]
>>   ["One","Two",3]
>>   <xml/>
>>   ["apples","oranges"]
>>   <xml/>
>>   ["One",5,<xml/>]
>>   [3,7,"Hello World!"]
>>   map{"key":"value"}
>>   false
>>   ["One","lorem",<xml/>,2,"Naomi"]
>>   ["One","Two",3]
>>   ["One","lorem",<xml/>,2,"Naomi"]
>>   [<xml/>,map{"key":"value"},true()]
>>   false
>>
>> However, this is not what I want. I want to understand, whether a
>> function would be possible, that completely dedupes given sequence,
>> like `fn:distinct-values#1`, however, for any kind of item, not just
>> atomic ones:
>>
>>   `local:distinct-items($items as item()*) as item()*`
>>
>> It should no use state in any form (typically creating and revolving
>> a second list or sending a flag throughout recursion).
>>
>> Also, I do not want to use any manipulative functions (like fn:remove#2)
>> and, ideally, no `FOR..IN..` construct.
>>
>> The deduped list would be purely the result of the recursive walk through
>> the list, grown organically, so to say.
>>
>> Only, I can not come up with such a thing. Is there a way to do it?
>>
>> Thank you.
>>
>> --
>> Goody Bye, Minden jót, Mit freundlichen Grüßen,
>> Andreas Mixich
>> _______________________________________________
>> talk at x-query.com
>> http://x-query.com/mailman/listinfo/talk
>



More information about the talk mailing list