Applications and System Performance

by admin — last modified 2007-12-10 22:00

Building scientific applications that can effectively exploit extreme-scale parallel systems has proven very difficult. The sheer level of parallelism in such systems poses a formidable challenge to achieving scalable performance. In addition, the architectural complexity of extreme-scale systems makes it hard to write programs that can fully exploit their capabilities. In today’s extreme-scale systems, complex processors, deep memory hierarchies and heterogeneous interconnects require careful scheduling of an application’s operations, data accesses and communication if the application is to achieve a significant fraction of a system’s potential performance. Furthermore, the large number of components in extreme-scale parallel systems makes component failure inevitable; long-running applications must therefore be resilient to hardware faults or risk being unable to run to completion.
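The point about inevitable component failure can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that components fail independently with exponentially distributed lifetimes, which is a simplification; the function names and the numbers in the usage note are hypothetical and are not drawn from any measured system.

```python
import math


def system_mtbf_hours(component_mtbf_hours: float, num_components: int) -> float:
    """Mean time between failures of a system that fails when any component fails.

    Assumption: components are independent with exponential lifetimes, so
    their failure rates add and the system MTBF is the component MTBF
    divided by the number of components.
    """
    if component_mtbf_hours <= 0 or num_components <= 0:
        raise ValueError("component MTBF and component count must be positive")
    return component_mtbf_hours / num_components


def prob_run_completes(run_hours: float, component_mtbf_hours: float,
                       num_components: int) -> float:
    """Probability that no component fails during a run of the given length."""
    if run_hours < 0:
        raise ValueError("run length must be non-negative")
    mtbf = system_mtbf_hours(component_mtbf_hours, num_components)
    return math.exp(-run_hours / mtbf)
```

Under these assumptions, a hypothetical machine with 100,000 components, each with a mean time between failures of 1,000,000 hours, has a system-level mean time between failures of only 10 hours, and `prob_run_completes(24, 1e6, 100_000)` gives roughly a 9% chance that a 24-hour run finishes without any hardware fault.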

The principal goals of the application and system performance research thrust are:

understanding application and system performance on present-day extreme-scale architectures through the development and application of technologies for measurement and modeling of program and system behavior,
devising software strategies to ameliorate application performance bottlenecks on today’s architectures,
modeling the behavior of applications to understand factors affecting their scalability on future generations of extreme-scale systems, and
investigating software technology that will enable higher performance on next-generation, extreme-scale parallel systems.

A broad spectrum of issues affects application performance, including operating system activity, load imbalance, serialization, underutilization of processor functional units, data copying, poor temporal and spatial locality of data accesses, exposed communication latency, high communication frequency and large communication bandwidth requirements. A quantitative assessment of factors limiting application performance on current-generation architectures will help focus long-term research on software and hardware technologies that hold the most promise for improving application performance and scalability on future systems. A multitude of challenging problems must be solved to understand how to best implement scientific applications so that they can achieve scalable high performance on extreme-scale parallel systems.
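Of the factors listed above, serialization has a well-known first-order model in Amdahl's law, which shows why even a small serial fraction caps speedup at very large processor counts. The sketch below is only an illustration of that law: it ignores communication, load imbalance and the other factors named above, and the function names and example numbers are hypothetical.

```python
def amdahl_speedup(serial_fraction: float, num_processors: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed problem size.

    serial_fraction is the share of single-processor run time that cannot
    be parallelized (between 0 and 1 inclusive).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_processors)


def parallel_efficiency(serial_fraction: float, num_processors: int) -> float:
    """Fraction of ideal (linear) speedup that the model predicts."""
    return amdahl_speedup(serial_fraction, num_processors) / num_processors
```

For example, with a hypothetical serial fraction of 1%, `amdahl_speedup(0.01, 10_000)` is about 99, so 10,000 processors would be used at roughly 1% efficiency under this model.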

As part of this research thrust, the project team will pursue a program of research to develop technologies for measuring, modeling, understanding, tuning and steering application performance on current and future generations of extreme-scale parallel architectures. This work will address all aspects of performance and reliability, spanning system architecture, network and applications, and will cover both scalability and node performance. The findings from this research, together with the tools and software infrastructure it produces, are expected to benefit all ASC application teams by providing them with more efficient programming models, technology for compiler-assisted tuning of applications, better performance instrumentation and diagnostic capabilities, insight into the performance and scaling of applications and systems through modeling, improved algorithm-architecture mapping, and better-performing extreme-scale parallel architectures.

LACSI at Rice University
