Apan Qasem and Ken Kennedy (2005)
A Cache-conscious Profitability Model for Empirical Tuning of Loop Fusion
In: Proceedings of the 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2005), Springer-Verlag, Lecture Notes in Computer Science.
Loop fusion is recognized as an effective program transformation for improving memory hierarchy performance. However, unconstrained loop fusion can lead to poor performance because of increased register pressure and cache conflict misses. The complex interaction between the different levels of the memory hierarchy and the input program makes it very difficult to always make the right choice in fusing loops. In this paper, we present a cache-conscious analytical model for profitable loop fusion to be used with a constrained weighted fusion algorithm. We then extend the model to show its effectiveness in the context of an empirical tuning framework. A preliminary evaluation of the model is presented using hand experiments on four applications.
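For readers unfamiliar with the transformation, the following is a minimal sketch of loop fusion itself, not the authors' profitability model or their constrained weighted fusion algorithm. The function names `unfused` and `fused` are hypothetical and chosen only for illustration.

```python
def unfused(a):
    """Two separate loops: `a` and `b` are each traversed twice."""
    n = len(a)
    b = [0.0] * n
    c = [0.0] * n
    for i in range(n):
        b[i] = a[i] * 2.0
    for i in range(n):
        # By the time this loop runs, early elements of `a` and `b`
        # may already have been evicted from cache for large n.
        c[i] = a[i] + b[i]
    return b, c


def fused(a):
    """One fused loop: a[i] and b[i] are reused right after first use."""
    n = len(a)
    b = [0.0] * n
    c = [0.0] * n
    for i in range(n):
        b[i] = a[i] * 2.0
        c[i] = a[i] + b[i]
    return b, c
```

Both versions compute the same results; the fused one simply brings the reuse of `a[i]` and `b[i]` closer together. The trade-off the abstract points to is that each additional fused loop adds more arrays and values live in a single loop body, which can raise register pressure and cache conflict misses.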