Yuan Zhao and Ken Kennedy (2006)
Compiling for Increasing On-chip Parallelism
In: Workshop on Tools and Compilers for Hardware Acceleration (TCHA) in conjunction with the Fifteenth International Conference on Parallel Architectures and Compilation Techniques (PACT 2006), ACM Press.
Microprocessor companies are adding more and more parallelism to each chip to increase per-chip performance. At the fine-grained level, vector instruction sets are added, while at the coarse-grained level, multiple cores are placed on the same chip. This trend presents a challenge for application developers as well as compiler developers: how can the power of this added parallelism be exploited? In this paper, we present a source-to-source compiler that automatically compiles programs written by ordinary users to target on-chip parallelism, without requiring users to specify parallelism directives. Initially developed for short vector processors, the compiler has been extended to support the heterogeneous multi-core CELL processor. Besides parallelism, these processors also introduce memory constraints, such as data alignment and data movement, that affect an application’s performance. We therefore also discuss our compiler strategies for these issues.