[xquery-talk] Purely recursive deduplication of sequence?
Christian Grün
christian.gruen at gmail.com
Sat Aug 1 06:21:10 PDT 2020
One more solution with fold-left (it’s faster than the recursive approach):
declare function local:distinct-items($items as item()*) as item()* {
fold-left($items, (), function($result, $item) {
$result, $item[empty($result[deep-equal(., $item)])]
})
};
local:distinct-items(('x', (1 to 1000000) ! 1, 'x'))
On 8/1/20, Christian Grün <christian.gruen at gmail.com> wrote:
> Hi Andreas,
>
> Try this:
>
> declare function local:distinct-items($items as item()*) as item()* {
> let $h := head($items)
> where exists($h)
> let $t := tail($items)[not(deep-equal(., $h))]
> return ($h, local:distinct-items($t))
> };
>
> If the FLWOR expression is avoided, it may increase the runtime:
>
> declare function local:distinct-items($items as item()*) as item()* {
> if (empty($items)) then () else (
> head($items),
> local:distinct-items(tail($items)[not(deep-equal(., head($items)))])
> )
> };
>
> Hope this helps,
> Christian
>
>
>
> On 8/1/20, Andreas Mixich <mixich.andreas at gmail.com> wrote:
>> Hello,
>>
>> I wonder, whether it could be possible to deduplicate a sequence by a
>> purely recursive approach,
>> without creating state in any form.
>>
>> What I have, so far, is only capable of removing consecutive dupes, so,
>> nothing fancy.
>>
>> (:~
>> Removes all consecutive duplicate items from a sequence, returning
>> the deduped sequence.
>>
>> @param $items a sequence, that may contain duplicates
>> @return a sequence, in which all 'pairs' have been 'singled'
>> :)
>> declare function local:dedupe-pairs($items as item()*) as item()* {
>> if (count($items) <= 1)
>> then $items
>> else if (deep-equal(head($items), head(tail($items))))
>> then local:dedupe-pairs(tail($items))
>> else (
>> head($items)
>> , local:dedupe-pairs(tail($items))
>> )
>> };
>>
>> (: following sequence has 19 items :)
>> let $a3 := (
>> array {"One", "Two", 3}
>> , array {"One", "Two", 1}
>> , array {"One", "Two", 3}
>> , <xml></xml>
>> , <xml></xml>
>> , array {"apples","oranges"}
>> , array {"apples","oranges"}
>> , <xml></xml>
>> , array {"One",5,<xml></xml>}
>> , array {3,7,"Hello World!"}
>> , array {3,7,"Hello World!"}
>> , map{'key':'value'}
>> , false()
>> , false()
>> , array {"One","lorem",<xml></xml>,2,"Naomi"}
>> , array {"One", "Two", 3}
>> , array {"One","lorem",<xml></xml>,2,"Naomi"}
>> , array {<xml></xml>,map{'key':'value'},true()}
>> , false()
>> )
>> return local:dedupe-pairs($a3)
>>
>> serializes to:
>>
>> ["One","Two",3]
>> ["One","Two",1]
>> ["One","Two",3]
>> <xml/>
>> ["apples","oranges"]
>> <xml/>
>> ["One",5,<xml/>]
>> [3,7,"Hello World!"]
>> map{"key":"value"}
>> false
>> ["One","lorem",<xml/>,2,"Naomi"]
>> ["One","Two",3]
>> ["One","lorem",<xml/>,2,"Naomi"]
>> [<xml/>,map{"key":"value"},true()]
>> false
>>
>> However, this is not what I want. I want to understand, whether a
>> function would be possible, that completely dedupes given sequence,
>> like `fn:distinct-values#1`, however, for any kind of item, not just
>> atomic ones:
>>
>> `local:distinct-items($items as item()*) as item()*`
>>
>> It should no use state in any form (typically creating and revolving
>> a second list or sending a flag throughout recursion).
>>
>> Also, I do not want to use any manipulative functions (like fn:remove#2)
>> and, ideally, no `FOR..IN..` construct.
>>
>> The deduped list would be purely the result of the recursive walk through
>> the list, grown organically, so to say.
>>
>> Only, I can not come up with such a thing. Is there a way to do it?
>>
>> Thank you.
>>
>> --
>> Goody Bye, Minden jót, Mit freundlichen Grüßen,
>> Andreas Mixich
>> _______________________________________________
>> talk at x-query.com
>> http://x-query.com/mailman/listinfo/talk
>
More information about the talk
mailing list