Daniel Chavarria-Miranda and John Mellor-Crummey (2005)
Effective communication coalescing for data-parallel applications
In: Principles and Practice of Parallel Programming, pp. 14-25, New York, NY, USA, ACM Press.
Communication coalescing is a static optimization that can reduce both communication frequency and redundant data transfer in compiler-generated code for regular, data-parallel applications. We present an algorithm for coalescing the communication that arises when generating code for such applications written in High Performance Fortran (HPF). To handle sophisticated computation partitionings, our algorithm normalizes communication before attempting coalescing. We experimentally evaluate our algorithm, which is implemented in the dHPF compiler, by compiling HPF versions of the NAS application benchmarks SP, BT and LU. Our normalized coalescing algorithm improves the performance and scalability of compiler-generated code for these benchmarks by reducing communication volume by up to 55% compared to a simpler coalescing strategy, and it enables us to match the communication volume and frequency of hand-optimized MPI implementations of these codes.
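To illustrate the general idea of coalescing (not the dHPF algorithm itself, which the abstract does not spell out), here is a minimal Python sketch. It assumes each pending communication is a hypothetical tuple `(partner, array, lo, hi)` naming a destination processor, an array, and an inclusive one-dimensional index range; the names `coalesce` and `volume` are illustrative only. Ranges bound for the same partner and array are merged when they overlap or touch, which lowers both the message count and the number of elements sent twice.

```python
from collections import defaultdict


def coalesce(messages):
    """Merge overlapping or adjacent index ranges per (partner, array).

    Each message is a hypothetical (partner, array, lo, hi) tuple with an
    inclusive index range. This is a simplified stand-in for coalescing,
    not the normalization-based algorithm described in the paper.
    """
    grouped = defaultdict(list)
    for partner, array, lo, hi in messages:
        grouped[(partner, array)].append((lo, hi))

    result = []
    for (partner, array), ranges in sorted(grouped.items()):
        ranges.sort()
        cur_lo, cur_hi = ranges[0]
        for lo, hi in ranges[1:]:
            if lo <= cur_hi + 1:
                # Overlapping or adjacent: extend the current range.
                cur_hi = max(cur_hi, hi)
            else:
                result.append((partner, array, cur_lo, cur_hi))
                cur_lo, cur_hi = lo, hi
        result.append((partner, array, cur_lo, cur_hi))
    return result


def volume(messages):
    """Total number of array elements transferred."""
    return sum(hi - lo + 1 for _, _, lo, hi in messages)


# Four separate transfers; three go to processor 1 and overlap or touch.
messages = [(1, "u", 0, 9), (1, "u", 5, 14), (1, "u", 15, 19), (2, "u", 0, 4)]
coalesced = coalesce(messages)
print(len(messages), volume(messages))
print(len(coalesced), volume(coalesced))
```

In this toy input, the three transfers to processor 1 collapse into one message covering indices 0 through 19, so the elements 5 through 9 are no longer sent twice.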