I had a weird experience with #Julia and #parallelization.
I had to collect all the pairs of items in two sets (S and G) that satisfied a particular relation function R. For Reasons™, it's easier to find these pairs by looking at each element of S and finding which elements of G are in relation with it, but this information is better used in reverse (i.e. on use we want, for any given element of G, to find which elements of S are in relation with it).
So my idea was to build a set of tuples (i, j, r), where i is the index of an element of S, j the index of an element of G, and r the result of applying R to those two elements (a value I need “on use” of the information I'm collecting).
To build this set of tuples, I went with (step A) a function mapping each index i to the list of tuples where the relation held, and then (step B) merging those lists together into a single list L.
Then (step C), for each index j of G, the approach was to findall the elements of L whose second index was j, and thus have for each element of G the elements of S it was in relation with, including the relation function value, just as I wanted.
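To make steps A–C concrete, here's a minimal sketch. S, G, and R here are toy stand-ins (my real sets and relation are different and much more expensive); the structure is what matters:

```julia
# Toy stand-ins: R returns a value when the relation holds, `nothing` otherwise.
S = [2, 3, 4]
G = [4, 6, 9]
R(s, g) = g % s == 0 ? g ÷ s : nothing   # "s divides g", value = quotient

# Step A: for each index i of S, collect tuples (i, j, r) where R holds.
step_a(i) = [(i, j, r) for (j, g) in enumerate(G)
             for r in (R(S[i], g),) if r !== nothing]

# Step B: merge the per-i lists into a single list L.
L = reduce(vcat, map(step_a, eachindex(S)))

# Step C: for each index j of G, find all tuples in L whose second index is j.
by_g = [L[findall(t -> t[2] == j, L)] for j in eachindex(G)]
```

With these toy inputs, by_g[j] ends up holding, for each element of G, the indices of the S elements in relation with it plus the cached value r.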
My original implementation was SLOW. Step A would take some 20 seconds, which I was able to bring down to 10 with optimizations, but step C would take between 30 and 130 seconds *using parallelization*.
The worst of it? I could trivially parallelize step A, but then step C would take FOREVER (I stopped waiting after 15 minutes).
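For reference, the "trivial" parallelization of step A looked roughly like this (again a sketch with toy S, G, and R; one result slot per index so threads never write to the same place):

```julia
using Base.Threads

S = collect(1:100)
G = collect(1:100)
R(s, g) = g % s == 0 ? g ÷ s : nothing

# One slot per index of S: each thread writes only chunks[i], no data races.
chunks = Vector{Vector{Tuple{Int,Int,Int}}}(undef, length(S))
@threads for i in eachindex(S)
    chunks[i] = [(i, j, r) for (j, g) in enumerate(G)
                 for r in (R(S[i], g),) if r !== nothing]
end

# Step B as before: merge into a single list L.
L = reduce(vcat, chunks)
```

Embarrassingly parallel, no locks, and it did speed step A up, which made the resulting step C blow-up all the more puzzling.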
1/n



