Daniel Chavarria-Miranda and John Mellor-Crummey (2002)
An Evaluation of Data-Parallel Compiler Support for Line-Sweep Applications
In: Proceedings of the 11th International Conference on Parallel Architectures and Compilation Techniques (PACT), ed. by Erik Altman and Sally McKee, Charlottesville, VA, IEEE Computer Society.
Data-parallel compilers have long aimed to equal the performance of carefully hand-optimized parallel codes. For tightly-coupled applications based on line sweeps, this goal has been particularly elusive. In the Rice dHPF compiler, we have developed a wide spectrum of optimizations that enable us to closely approach hand-coded performance for tightly-coupled line-sweep applications, including the NAS SP and BT benchmark codes. From lightly-modified copies of standard serial versions of these benchmarks, dHPF generates MPI-based parallel code that is within 4% of the performance of the hand-crafted MPI implementations of these codes for a 102^3 problem size (Class B) on 64 processors. We describe and quantitatively evaluate the impact of partitioning, communication, and memory hierarchy optimizations implemented by dHPF that enable us to approach hand-coded performance with compiler-generated parallel code.