- Paul Bone, Zoltan Somogyi and Peter Schachte
- Declarative Aspects and Applications of Multicore Programming
- Philadelphia, PA, USA
- January 2012
Recently we built a system that uses profiling data to automatically parallelize Mercury programs by finding conjunctions with expensive conjuncts that can run in parallel with minimal synchronization delays. This worked very well in many cases, but for tail-recursive code we got much lower speedups than we expected, due to excessive memory usage. In this paper, we present a novel program transformation that eliminates this problem and also allows recursive calls inside parallel conjunctions to take advantage of tail recursion optimization. Our benchmark results show that the new transformation greatly increases the speedups we can get from parallel Mercury programs; in one case, a program that previously showed no speedup achieves almost perfect speedup on four cores.