D. P. Spooner, S. A. Jarvis, G. R. Nudd, S. Saini, and D. J. Kerbyson (2002)
Performance-based Workload Management for Grid Computing
In: Proceedings of the LACSI Symposium, Los Alamos, NM, Los Alamos Computer Institute.
The grid computing paradigm aims to provide a unified worldwide infrastructure for business and scientific computing, through the combination of commodity, specialised and high performance computing systems. However, the unpredictable and dynamic behaviours of Grid environments create difficulties when attempting to identify resources that can execute an application with a guaranteed level of service. In this work, we describe an architecture (Titan) that couples an established performance prediction tool with a resource-level and organisational-level workload management system. This approach provides an interface between user service requirements (such as deadline) and low-level system requirements (such as makespan) through the evaluation of performance models. The PACE (Performance Analysis and Characterisation Environment) system is used in this research to provide estimated execution times for scientific applications, through a layered architecture that models subtask, resource and parallel components. A key benefit of PACE is a flexible evaluation engine that can interrogate models on-the-fly, allowing a workload management environment to predict the impact of allocating independent tasks to distributed, heterogeneous computing systems a priori. By evaluating different resource models and varying system parameters, it is possible to consider many performance scenarios and use the results as decision-making support for workload managers. This technique is employed by Titan where PACE analysis is