Oscar Hernandez, Chunhua Liao, and Barbara Chapman (2004)
A Tool to Display Array Access Patterns in OpenMP Programs
In: PARA'04 Workshop On State-Of-The-Art In Scientific Computing, Springer.
OpenMP is a de facto standard for shared memory programming that can be used to program SMPs and distributed shared memory systems. One way to improve OpenMP performance is to minimize data sharing among threads, either by rearranging code or by privatizing data structures where possible. Unfortunately, this is not an easy task. Before attempting it, it is crucial to know how the arrays are accessed by the executing threads and to detect which array regions are shared between multiple threads. This tool provides a visual environment that displays the array access patterns detected at runtime. This information can help programmers restructure their code to maximize data locality or to achieve OpenMP SPMD-style code. Static array region analysis is necessary to summarize individual array accesses and thereby avoid the costly runtime overhead that recording each access to an individual array element would otherwise incur. The tool uses static symbolic analysis and array dataflow analysis within a compiler to instrument the program strategically at the array region level in order to gather information at runtime. Each time an array region is accessed at runtime, the tool records the information necessary to instantiate that region (which may have been described in symbolic terms) and which thread accesses it. After the application has run, the tool shows the different array dataflow graphs and regions. These are combined with the dynamic callgraph and flowgraph of the application to provide different views of array region summaries at the procedure, basic block, statement and thread level. This paper describes how the tool is implemented on top of the Open64 compiler, an open source research compiler for C/C++/F90 and OpenMP. It also discusses how this kind of analysis can improve the performance of existing OpenMP programs, and reports the overheads of our instrumentation.
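The region-level recording described in the abstract can be illustrated with a minimal sketch. It is not the tool's implementation: the names `Region`, `AccessLog`, and `block_bounds` are hypothetical, regions are assumed to be one-dimensional with unit stride, and thread ids are simulated in a plain loop rather than taken from an OpenMP runtime. The sketch instantiates symbolic block bounds once per thread, logs one entry per region rather than per element, and then reports the index ranges touched by more than one thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """Hypothetical 1-D array region [lower, upper], inclusive, unit stride."""
    array: str
    lower: int
    upper: int


class AccessLog:
    """Records one entry per accessed region (not per element) and per thread."""

    def __init__(self):
        # array name -> list of (thread_id, Region)
        self._by_array = defaultdict(list)

    def record(self, thread_id, array, lower, upper):
        self._by_array[array].append((thread_id, Region(array, lower, upper)))

    def shared_regions(self):
        """Return (array, lower, upper, (t1, t2)) for overlaps between threads."""
        shared = []
        for array, entries in self._by_array.items():
            for (t1, r1), (t2, r2) in combinations(entries, 2):
                if t1 == t2:
                    continue  # overlap within one thread is not sharing
                lo = max(r1.lower, r2.lower)
                hi = min(r1.upper, r2.upper)
                if lo <= hi:
                    shared.append((array, lo, hi, tuple(sorted((t1, t2)))))
        return shared


def block_bounds(n, num_threads, thread_id, halo=0):
    """Instantiate symbolic block bounds for one thread (assumed distribution).

    Each thread owns a contiguous chunk of an n-element array and also
    touches `halo` neighbouring elements on each side, clipped to the array.
    """
    chunk = (n + num_threads - 1) // num_threads
    lower = max(0, thread_id * chunk - halo)
    upper = min(n - 1, (thread_id + 1) * chunk - 1 + halo)
    return lower, upper


# Simulate four threads accessing array "a" of 16 elements with a halo of 1.
log = AccessLog()
for tid in range(4):
    lo, hi = block_bounds(16, 4, tid, halo=1)
    log.record(tid, "a", lo, hi)

shared = log.shared_regions()
for array, lo, hi, threads in shared:
    print(f"{array}[{lo}:{hi}] shared by threads {threads}")
```

With a halo of 1, the overlapping boundary elements between neighbouring threads are reported as shared; with `halo=0` the blocks are disjoint and nothing is reported, which is the situation that privatization or an SPMD-style restructuring aims for.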