[xquery-talk] outer join between 2 sequences

Ihe Onwuka ihe.onwuka at gmail.com
Sun Sep 28 08:51:04 PDT 2014


On Sun, Sep 28, 2014 at 2:13 PM, Adam Retter <adam.retter at googlemail.com>
wrote:

> So after a bit more coffee and a bit of research it seems to me that
> the only way you are going to get this to be fast would be if you used
> a hash-based lookup for one of your sequences. Something like a
> HashMap or BloomFilter would do the job, see:
>
> http://stackoverflow.com/questions/4261619/fastest-set-operations-in-the-west
>
>
Hmm, well, I had already committed myself before your retraction, and the
thing is running now, so I am just going to leave it.
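
For reference, the hash-based lookup suggested in the quoted message could
be sketched roughly as below. This is a minimal illustration in Python, not
XQuery and not what is actually running; the function names and the use of
a plain in-memory set are assumptions made for the example.

```python
def left_outer_join(seq_a, seq_b):
    """Pair every item of seq_a with its match in seq_b, or None if absent.

    seq_b is loaded into a set once, so each membership test is a hash
    lookup (constant time on average) rather than a scan of seq_b.
    """
    lookup = set(seq_b)
    return [(a, a if a in lookup else None) for a in seq_a]


def unmatched(seq_a, seq_b):
    """Return the items of seq_a that have no counterpart in seq_b."""
    lookup = set(seq_b)
    return [a for a in seq_a if a not in lookup]
```

The cost is one pass to build the set plus one pass over the other
sequence, instead of comparing every item of one sequence against every
item of the other.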

This is only the tip of the problem, because each integer that survives
selection generates an HTTP request that either returns a 404 or leads to
an HTTP PUT into eXist. So if I were really in a hurry I'd have to start
looking at MapReducing the keys. I'm not even sure that is the answer, as
all the MapReduce jobs would be pounding the same server (mind you, the
site may load balance to mitigate that).

Another alternative is to just export sequence B into SQLite, index both
sequences (now they are tables), and do it in SQL.
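
A rough sketch of that SQLite route, using Python's standard sqlite3 module
purely for illustration. The table names (a, b), the column name (k), the
index names, and the in-memory database are all assumptions for the example,
and it assumes the values in each sequence are distinct integers.

```python
import sqlite3


def outer_join_in_sqlite(seq_a, seq_b):
    """Load both sequences into indexed tables and left-outer-join them.

    Returns (a_key, b_key) rows; b_key is None where seq_b has no match.
    """
    conn = sqlite3.connect(":memory:")  # assumption: throwaway in-memory DB
    try:
        conn.execute("CREATE TABLE a (k INTEGER)")
        conn.execute("CREATE TABLE b (k INTEGER)")
        conn.executemany("INSERT INTO a (k) VALUES (?)", ((k,) for k in seq_a))
        conn.executemany("INSERT INTO b (k) VALUES (?)", ((k,) for k in seq_b))
        # Index both key columns so the join does not scan b for every row of a.
        conn.execute("CREATE INDEX idx_a_k ON a (k)")
        conn.execute("CREATE INDEX idx_b_k ON b (k)")
        rows = conn.execute(
            "SELECT a.k, b.k FROM a LEFT OUTER JOIN b ON a.k = b.k ORDER BY a.k"
        ).fetchall()
    finally:
        conn.close()
    return rows
```

Adding "WHERE b.k IS NULL" to the query would keep only the keys of A that
have no counterpart in B.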