Might as well post this here: I co-authored a paper for the #OpenModelica conference this year. It's about co-simulation, that is, how you can split numerical simulations across multiple processes, either to get a speed-up or to combine different integrators (Dymola + Simulink, for instance). TL;DR: almost everyone is doing it wrong, and we're working on libre software to fix the situation.

https://social.umeahackerspace.se/url/177299
From skimming the abstract, latency seems to be the problem. I'd suggest InfiniBand for RDMA and sub-microsecond latency.
We have CPU hours to use on Abisko, which uses InfiniBand. But the point is perhaps this: so long as each node does a non-trivial amount of computation, the performance hit from latency is more than made up for by being able to run on multiple nodes. The challenge lies in how you split your simulation so that you're reasonably sure you're not getting nonsense.
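
To put rough numbers on the latency argument, here is a back-of-the-envelope sketch. The cost model (work per communication step divides evenly across nodes, plus one fixed latency penalty per step) is an assumption made up for illustration, not something taken from the paper, and the function name and example figures are hypothetical.

```python
def cosim_speedup(compute_s: float, latency_s: float, nodes: int) -> float:
    """Estimated speed-up per communication step under a toy model.

    Assumptions (illustrative only, not from the paper):
      - the work in one step splits evenly across `nodes`
      - every step pays one fixed communication latency `latency_s`
    """
    serial_time = compute_s
    parallel_time = compute_s / nodes + latency_s
    return serial_time / parallel_time


if __name__ == "__main__":
    # 1 ms of work per step, 1 us latency, 4 nodes: latency barely matters.
    print(f"heavy steps: {cosim_speedup(1e-3, 1e-6, 4):.2f}x")
    # 1 us of work per step, 1 us latency, 4 nodes: slower than a single node.
    print(f"tiny steps:  {cosim_speedup(1e-6, 1e-6, 4):.2f}x")
```

With heavy steps the estimate lands close to the ideal 4x, while with tiny steps it drops below 1x. This says nothing about whether the split simulation still gives sensible results, which is the harder problem.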