Keith D Cooper, Timothy J Harvey, and Todd Waterman (2002)
Building a Control-flow Graph from Scheduled Assembly Code
Rice University, Department of Computer Science, 6100 Main Street, Houston, TX 77005.
A variety of applications have arisen where it is worthwhile to apply code optimizations directly to the machine code (or assembly code) produced by a compiler. These include link-time whole-program analysis and optimization, code compression, binaryto- binary translation, and bit-transition reduction (for power). Many, if not most, optimizations assume the presence of a control-flow graph (cfg). Compiled, scheduled code has properties that can make cfg construction more complex than it is inside a typical compiler. In this paper, we examine the problems of scheduled code on architectures that have multiple delay slots. In particular, if branch delay slots contain other branches, the classic algorithms for building a cfg produce incorrect results.
We explain the problem using two simple examples. We then present an algorithm for building correct cfgs from scheduled assembly code that includes branches in branch-delay slots. The algorithm works by building an approximate cfg and then refining it to reflect the actions of delayed branches. If all branches have explicit targets, the complexity of the refining step is linear with respect to the number of branches in the code.
Analysis of the kind presented in this paper is a necessary first step for any system that analyzes or translates compiled, assembly-level code.
We have implemented this algorithm in our power-consumption experiments based on the TMS320C6200 architecture from Texas Instruments. The development of our algorithm was motivated by the output of TI’s compiler.