
Open MPI

by admin last modified 2004-10-15 12:14
Open MPI is a community version of MPI. Each of the core contributors is the developer of an existing production-quality implementation of the Message Passing Interface (MPI) standard: FT-MPI (UTK), LA-MPI (LANL) and LAM/MPI (IU). These implementations offer various approaches to data and process fault tolerance in addition to high-performance communication.

The Open MPI project is developing a highly configurable and extensible runtime environment, or middleware, to support robust parallel computation on systems ranging from small mission-critical and embedded systems to future petascale supercomputers.

Open MPI has a lightweight component architecture that allows on-the-fly loading of component modules and run-time selection of features (including network device, OS, and resource management support). This makes the middleware highly adaptable, both statically, to accommodate a wide variety of system types, and dynamically, in response to rapidly changing heterogeneous environments. The architecture also provides an ideal framework for adding support for experimental or innovative devices.

The project's initial goal is to provide a framework for a new high-quality implementation of MPI Version 2 with high levels of communication performance, scalability to hundreds of thousands of processes, and data and process fault tolerance. The first release of Open MPI is scheduled for November 2004. Open MPI was designed, however, to be the foundation of a more complete runtime environment than a simple message-passing library.

A central goal of Open MPI is to enable effective fault management, an essential requirement for scalable computers. Middleware such as Open MPI is uniquely positioned to coordinate and broker the tasks of fault prediction, detection, recovery and reconfiguration. We do not propose to provide a fully automatic or “canned” solution to fault management, but rather to provide consistent, common APIs so that applications can discover, characterize, and respond appropriately to faults.
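
To make the idea of application-driven fault handling concrete, here is a minimal sketch in Python of a callback registry through which an application could be told about a fault and choose its own response. It is not the Open MPI API: `FaultKind`, `FaultEvent`, `FaultRegistry`, and the two handler functions are hypothetical names invented for this illustration, and the fault categories are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class FaultKind(Enum):
    """Hypothetical fault categories, for illustration only."""
    PROCESS_FAILED = "process_failed"
    LINK_DEGRADED = "link_degraded"


@dataclass(frozen=True)
class FaultEvent:
    """What the middleware tells the application about a fault."""
    kind: FaultKind
    rank: int      # process the fault was attributed to
    detail: str    # free-form description used to characterize the fault


# A handler inspects the event and returns the action the application chose.
Handler = Callable[[FaultEvent], str]


class FaultRegistry:
    """Hypothetical broker: applications register handlers per fault kind."""

    def __init__(self) -> None:
        self._handlers: Dict[FaultKind, List[Handler]] = {}

    def register(self, kind: FaultKind, handler: Handler) -> None:
        self._handlers.setdefault(kind, []).append(handler)

    def report(self, event: FaultEvent) -> List[str]:
        """Deliver the event to matching handlers and collect their actions."""
        return [handler(event) for handler in self._handlers.get(event.kind, [])]


def drop_rank(event: FaultEvent) -> str:
    # Application policy: continue without the failed process.
    return f"exclude rank {event.rank}"


def reroute(event: FaultEvent) -> str:
    # Application policy: move traffic away from the degraded link.
    return f"reroute traffic for rank {event.rank}"


registry = FaultRegistry()
registry.register(FaultKind.PROCESS_FAILED, drop_rank)
registry.register(FaultKind.LINK_DEGRADED, reroute)

actions = registry.report(
    FaultEvent(FaultKind.PROCESS_FAILED, rank=3, detail="heartbeat lost")
)
```

The point of the sketch is the division of labor: the middleware only detects and reports, while the recovery policy stays in application code.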

The low-level communication layer of Open MPI is designed with high performance in mind. It provides low latency and scalable high bandwidth by striping message fragments across multiple network devices, with optional end-to-end data integrity through a lightweight checksum/retransmission protocol. The design is structured so that all or part of the communication protocol may be offloaded to network-device processors on architectures where this is beneficial. Finally, Open MPI is highly portable, conforming to ISO C and POSIX standards throughout. This enables us to target a variety of operating systems, including novel choices such as Plan 9 and real-time operating systems (RTOSs).
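
The striping and checksum/retransmission scheme mentioned above can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not the Open MPI protocol: network devices are modeled as plain callables, fragments are assigned round-robin, CRC-32 (from the standard `zlib` module) stands in for the lightweight checksum, and the names `fragment`, `send_striped`, `reliable`, and `FlakyDevice` are hypothetical.

```python
import zlib
from typing import Callable, List

# A simulated device takes a fragment and returns the bytes that "arrived".
Device = Callable[[bytes], bytes]


def fragment(message: bytes, size: int) -> List[bytes]:
    """Split a message into fragments of at most `size` bytes."""
    return [message[i:i + size] for i in range(0, len(message), size)]


def send_striped(message: bytes, devices: List[Device],
                 frag_size: int, max_retries: int = 3) -> bytes:
    """Stripe fragments across devices, retransmitting on checksum mismatch."""
    received = []
    for index, frag in enumerate(fragment(message, frag_size)):
        device = devices[index % len(devices)]   # round-robin striping
        expected = zlib.crc32(frag)              # checksum computed by the sender
        for _ in range(max_retries + 1):         # first attempt plus retries
            arrived = device(frag)
            if zlib.crc32(arrived) == expected:  # receiver-side integrity check
                received.append(arrived)
                break
        else:
            raise IOError(f"fragment {index} failed after {max_retries} retries")
    return b"".join(received)


def reliable(frag: bytes) -> bytes:
    """Simulated device that never corrupts data."""
    return frag


class FlakyDevice:
    """Simulated device that corrupts the first fragment it carries."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, frag: bytes) -> bytes:
        self.calls += 1
        if self.calls == 1:
            return bytes([frag[0] ^ 0xFF]) + frag[1:]  # flip bits in first byte
        return frag


flaky = FlakyDevice()
payload = b"striped across two simulated devices"
result = send_striped(payload, [reliable, flaky], frag_size=8)
```

In this run the payload is delivered intact even though the flaky device corrupts one fragment, because the mismatch is detected and that fragment alone is resent. A real implementation would carry the checksum in-band with each fragment and send on the devices concurrently; the loop here is sequential only to keep the sketch short.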

