Nathan Froyd, Robert Fowler, and John Mellor-Crummey (2004)
A Sample-driven Call Stack Profiler
Department of Computer Science, Rice University, 6100 Main Street, Houston, TX 77005.
Call graph profiling reports measurements of resource utilization along with information about the calling context in which the resources were consumed. We present the design of a novel profiler that measures resource utilization and its associated calling context using a stack sampling technique. Our scheme combines several features and mechanisms. First, it requires no compiler support and no instrumentation of either source or binary code. Second, it works on heavily optimized code and on complex, multi-module applications. Third, it uses sampling rather than tracing to build a context tree, collect histogram data, and characterize calling patterns. Fourth, its data structures and algorithms are efficient enough to construct the complete tree exposed in the sampling process. We describe an implementation for the Alpha/Tru64 platform and present experimental measurements that compare this implementation with hiprof, the standard call graph profiler provided on Tru64. Results from a variety of programs in several languages indicate that our profiler operates with modest overhead. When profiling a call-intensive recursive program, the overhead of our technique is nearly a factor of 55 lower than that of hiprof.
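As an illustration of the idea of building a context tree from stack samples, the following minimal Python sketch merges sampled call stacks into a tree and derives per-context sample counts. It is not the implementation described in this paper: a real profiler would unwind the native stack on each sampling event, whereas here the stacks are supplied as lists of function names. The names `CCTNode`, `record_sample`, `inclusive`, `build_tree`, and the sample data are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One calling context: a function reached through a specific call path."""
    name: str
    samples: int = 0  # samples whose innermost frame is this context
    children: dict = field(default_factory=dict)

    def child(self, name):
        # Reuse the existing child for this callee, or create it on first sight.
        node = self.children.get(name)
        if node is None:
            node = CCTNode(name)
            self.children[name] = node
        return node


def record_sample(root, stack):
    """Insert one sampled stack (outermost frame first) into the tree."""
    node = root
    for frame in stack:
        node = node.child(frame)
    node.samples += 1  # charge the sample to the innermost frame's context


def inclusive(node):
    """Samples attributed to this context plus everything it called."""
    return node.samples + sum(inclusive(c) for c in node.children.values())


def build_tree(stacks):
    root = CCTNode("<root>")
    for stack in stacks:
        record_sample(root, stack)
    return root


# Hypothetical samples: "dot" is reached through two different call paths.
SAMPLES = [
    ["main", "solve", "dot"],
    ["main", "solve", "dot"],
    ["main", "solve"],
    ["main", "io", "dot"],
]
tree = build_tree(SAMPLES)
```

Because each sample follows its full call path, the two contexts of `dot` stay separate in the tree, which is the distinction that a flat, per-function histogram would lose.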