- Contents
- List of Figures
- List of Tables
- Acknowledgments
- Introduction to MPI
- Overview and Goals
- Background of MPI-1.0
- Background of MPI-1.1, MPI-1.2, and MPI-2.0
- Background of MPI-1.3 and MPI-2.1
- Background of MPI-2.2
- Who Should Use This Standard?
- What Platforms Are Targets For Implementation?
- What Is Included In The Standard?
- What Is Not Included In The Standard?
- Organization of this Document
- MPI Terms and Conventions
- Document Notation
- Naming Conventions
- Semantic Terms
- Data Types
- Opaque Objects
- Array Arguments
- State
- Named Constants
- Choice
- Addresses
- Language Binding
- Deprecated Names and Functions
- Fortran Binding Issues
- C Binding Issues
- C++ Binding Issues
- Functions and Macros
- Processes
- Error Handling
- Implementation Issues
- Independence of Basic Runtime Routines
- Interaction with Signals
- Examples
- Point-to-Point Communication
- Introduction
- Blocking Send and Receive Operations
- Blocking Send
- Message Data
- Message Envelope
- Blocking Receive
- Return Status
- Passing MPI_STATUS_IGNORE for Status
- Data Type Matching and Data Conversion
- Type Matching Rules
- Type MPI_CHARACTER
- Data Conversion
- Communication Modes
- Semantics of Point-to-Point Communication
- Buffer Allocation and Usage
- Nonblocking Communication
- Communication Request Objects
- Communication Initiation
- Communication Completion
- Semantics of Nonblocking Communications
- Multiple Completions
- Non-destructive Test of status
- Probe and Cancel
- Persistent Communication Requests
- Send-Receive
- Null Processes
- Datatypes
- Derived Datatypes
- Type Constructors with Explicit Addresses
- Datatype Constructors
- Subarray Datatype Constructor
- Distributed Array Datatype Constructor
- Address and Size Functions
- Lower-Bound and Upper-Bound Markers
- Extent and Bounds of Datatypes
- True Extent of Datatypes
- Commit and Free
- Duplicating a Datatype
- Use of General Datatypes in Communication
- Correct Use of Addresses
- Decoding a Datatype
- Examples
- Pack and Unpack
- Canonical MPI_PACK and MPI_UNPACK
- Collective Communication
- Introduction and Overview
- Communicator Argument
- Applying Collective Operations to Intercommunicators
- Barrier Synchronization
- Broadcast
- Example using MPI_BCAST
- Gather
- Examples using MPI_GATHER, MPI_GATHERV
- Scatter
- Examples using MPI_SCATTER, MPI_SCATTERV
- Example using MPI_ALLGATHER
- All-to-All Scatter/Gather
- Global Reduction Operations
- Reduce
- Signed Characters and Reductions
- MINLOC and MAXLOC
- All-Reduce
- Process-local reduction
- Reduce-Scatter
- MPI_REDUCE_SCATTER_BLOCK
- MPI_REDUCE_SCATTER
- Scan
- Inclusive Scan
- Exclusive Scan
- Example using MPI_SCAN
- Correctness
- Introduction
- Features Needed to Support Libraries
- MPI's Support for Libraries
- Basic Concepts
- Groups
- Contexts
- Intra-Communicators
- Group Management
- Group Accessors
- Group Constructors
- Group Destructors
- Communicator Management
- Communicator Accessors
- Communicator Constructors
- Communicator Destructors
- Motivating Examples
- Current Practice #1
- Current Practice #2
- (Approximate) Current Practice #3
- Example #4
- Library Example #1
- Library Example #2
- Inter-Communication
- Inter-communicator Accessors
- Inter-communicator Operations
- Inter-Communication Examples
- Caching
- Functionality
- Communicators
- Windows
- Datatypes
- Error Class for Invalid Keyval
- Attributes Example
- Naming Objects
- Formalizing the Loosely Synchronous Model
- Basic Statements
- Models of Execution
- Static communicator allocation
- Dynamic communicator allocation
- The General case
- Process Topologies
- Introduction
- Virtual Topologies
- Embedding in MPI
- Overview of the Functions
- Topology Constructors
- Cartesian Constructor
- Cartesian Convenience Function: MPI_DIMS_CREATE
- General (Graph) Constructor
- Distributed (Graph) Constructor
- Topology Inquiry Functions
- Cartesian Shift Coordinates
- Partitioning of Cartesian structures
- Low-Level Topology Functions
- An Application Example
- MPI Environmental Management
- Implementation Information
- Version Inquiries
- Environmental Inquiries
- Tag Values
- Host Rank
- IO Rank
- Clock Synchronization
- Memory Allocation
- Error Handling
- Error Handlers for Communicators
- Error Handlers for Windows
- Error Handlers for Files
- Freeing Errorhandlers and Retrieving Error Strings
- Error Codes and Classes
- Error Classes, Error Codes, and Error Handlers
- Timers and Synchronization
- Startup
- Allowing User Functions at Process Termination
- Determining Whether MPI Has Finished
- Portable MPI Process Startup
- The Info Object
- Process Creation and Management
- Introduction
- The Dynamic Process Model
- Starting Processes
- The Runtime Environment
- Process Manager Interface
- Processes in MPI
- Starting Processes and Establishing Communication
- Reserved Keys
- Spawn Example
- Manager-worker Example, Using MPI_COMM_SPAWN.
- Establishing Communication
- Names, Addresses, Ports, and All That
- Server Routines
- Client Routines
- Name Publishing
- Reserved Key Values
- Client/Server Examples
- Ocean/Atmosphere - Relies on Name Publishing
- Simple Client-Server Example.
- Other Functionality
- Universe Size
- Singleton MPI_INIT
- MPI_APPNUM
- Releasing Connections
- Another Way to Establish MPI Communication
- One-Sided Communications
- Introduction
- Initialization
- Window Creation
- Window Attributes
- Communication Calls
- Examples
- Accumulate Functions
- Synchronization Calls
- Fence
- General Active Target Synchronization
- Lock
- Assertions
- Examples
- Error Handling
- Error Handlers
- Error Classes
- Semantics and Correctness
- Atomicity
- Progress
- Registers and Compiler Optimizations
- External Interfaces
- Introduction
- Generalized Requests
- Examples
- Associating Information with Status
- MPI and Threads
- General
- Initialization
- Introduction
- File Manipulation
- Opening a File
- Closing a File
- Deleting a File
- Resizing a File
- Preallocating Space for a File
- Querying the Size of a File
- Querying File Parameters
- File Info
- Reserved File Hints
- File Views
- Data Access
- Data Access Routines
- Positioning
- Synchronism
- Coordination
- Data Access Conventions
- Data Access with Individual File Pointers
- Data Access with Shared File Pointers
- Noncollective Operations
- Collective Operations
- Seek
- Split Collective Data Access Routines
- File Interoperability
- Datatypes for File Interoperability
- Extent Callback
- Datarep Conversion Functions
- Matching Data Representations
- Consistency and Semantics
- File Consistency
- Random Access vs. Sequential Files
- Progress
- Collective File Operations
- Type Matching
- Logical vs. Physical File Layout
- File Size
- Examples
- Asynchronous I/O
- I/O Error Handling
- I/O Error Classes
- Examples
- Subarray Filetype Constructor
- Requirements
- Discussion
- Logic of the Design
- Examples
- MPI Library Implementation
- Systems with Weak Symbols
- Systems Without Weak Symbols
- Complications
- Multiple Counting
- Linker Oddities
- Multiple Levels of Interception
- Deprecated Functions
- Deprecated since MPI-2.0
- Deprecated since MPI-2.2
- Language Bindings
- Overview
- Design
- C++ Classes for MPI
- Class Member Functions for MPI
- Semantics
- C++ Datatypes
- Communicators
- Exceptions
- Mixed-Language Operability
- Problems With Fortran Bindings for MPI
- Problems Due to Strong Typing
- Problems Due to Data Copying and Sequence Association
- Special Constants
- Fortran 90 Derived Types
- A Problem with Register Optimization
- Basic Fortran Support
- Extended Fortran Support
- The mpi Module
- No Type Mismatch Problems for Subroutines with Choice Arguments
- Additional Support for Fortran Numeric Intrinsic Types
- Language Interoperability
- Introduction
- Assumptions
- Initialization
- Transfer of Handles
- Status
- MPI Opaque Objects
- Datatypes
- Callback Functions
- Error Handlers
- Reduce Operations
- Addresses
- Attributes
- Extra State
- Constants
- Interlanguage Communication
- Language Bindings Summary
- Groups, Contexts, Communicators, and Caching Fortran Bindings
- External Interfaces C++ Bindings
- Change-Log
- Bibliography
- Examples Index
- MPI Declarations Index
- MPI Function Index
Chapter 1
Introduction to MPI
1.1 Overview and Goals
MPI (Message-Passing Interface) is a message-passing library interface specification. All parts of this definition are significant. MPI addresses primarily the message-passing parallel programming model, in which data is moved from the address space of one process to that of another process through cooperative operations on each process. (Extensions to the "classical" message-passing model are provided in collective operations, remote-memory access operations, dynamic process creation, and parallel I/O.) MPI is a specification, not an implementation; there are multiple implementations of MPI. This specification is for a library interface; MPI is not a language, and all MPI operations are expressed as functions, subroutines, or methods, according to the appropriate language bindings, which for C, C++, Fortran-77, and Fortran-95, are part of the MPI standard. The standard has been defined through an open process by a community of parallel computing vendors, computer scientists, and application developers. The next few sections provide an overview of the history of MPI's development.
The main advantages of establishing a message-passing standard are portability and ease of use. In a distributed memory communication environment in which the higher level routines and/or abstractions are built upon lower level message-passing routines, the benefits of standardization are particularly apparent. Furthermore, the definition of a message-passing standard, such as that proposed here, provides vendors with a clearly defined base set of routines that they can implement efficiently, or in some cases provide hardware support for, thereby enhancing scalability.
The goal of the Message-Passing Interface, simply stated, is to develop a widely used standard for writing message-passing programs. As such, the interface should establish a practical, portable, efficient, and flexible standard for message passing.
A complete list of goals follows.
- Design an application programming interface (not necessarily for compilers or a system implementation library).
- Allow efficient communication: avoid memory-to-memory copying, allow overlap of computation and communication, and offload to a communication co-processor, where available.
- Allow for implementations that can be used in a heterogeneous environment.
- Allow convenient C, C++, Fortran-77, and Fortran-95 bindings for the interface.
- Assume a reliable communication interface: the user need not cope with communication failures. Such failures are dealt with by the underlying communication subsystem.
- Define an interface that can be implemented on many vendors' platforms, with no significant changes in the underlying communication and system software.
- Semantics of the interface should be language independent.
- The interface should be designed to allow for thread safety.
1.2 Background of MPI-1.0
MPI sought to make use of the most attractive features of a number of existing message-passing systems, rather than selecting one of them and adopting it as the standard. Thus, MPI was strongly influenced by work at the IBM T. J. Watson Research Center [1, 2], Intel's NX/2 [38], Express [12], nCUBE's Vertex [34], p4 [7, 8], and PARMACS [5, 9]. Other important contributions have come from Zipcode [40, 41], Chimp [16, 17], PVM [4, 14], Chameleon [25], and PICL [24].

The MPI standardization effort involved about 60 people from 40 organizations, mainly from the United States and Europe. Most of the major vendors of concurrent computers were involved in MPI, along with researchers from universities, government laboratories, and industry. The standardization process began with the Workshop on Standards for Message-Passing in a Distributed Memory Environment, sponsored by the Center for Research on Parallel Computing, held April 29-30, 1992, in Williamsburg, Virginia [48]. At this workshop the basic features essential to a standard message-passing interface were discussed, and a working group was established to continue the standardization process.

A preliminary draft proposal, known as MPI1, was put forward by Dongarra, Hempel, Hey, and Walker in November 1992, and a revised version was completed in February 1993 [15]. MPI1 embodied the main features that were identified at the Williamsburg workshop as being necessary in a message passing standard. Since MPI1 was primarily intended to promote discussion and "get the ball rolling," it focused mainly on point-to-point communications. MPI1 brought to the forefront a number of important standardization issues, but did not include any collective communication routines and was not thread-safe.

In November 1992, a meeting of the MPI working group was held in Minneapolis, at which it was decided to place the standardization process on a more formal footing, and to generally adopt the procedures and organization of the High Performance Fortran Forum. Subcommittees were formed for the major component areas of the standard, and an email discussion service was established for each. In addition, the goal of producing a draft MPI standard by the Fall of 1993 was set. To achieve this goal the MPI working group met every 6 weeks for two days throughout the first 9 months of 1993, and presented the draft MPI standard at the Supercomputing 93 conference in November 1993. These meetings and the email discussion together constituted the MPI Forum, membership of which has been open to all members of the high performance computing community.
1.3 Background of MPI-1.1, MPI-1.2, and MPI-2.0
Beginning in March 1995, the MPI Forum began meeting to consider corrections and extensions to the original MPI Standard document [21]. The first product of these deliberations was Version 1.1 of the MPI specification, released in June of 1995 [22] (see http://www.mpi-forum.org for official MPI document releases). At that time, effort focused in five areas.
1. Further corrections and clarifications for the MPI-1.1 document.
2. Additions to MPI-1.1 that do not significantly change its types of functionality (new datatype constructors, language interoperability, etc.).
3. Completely new types of functionality (dynamic processes, one-sided communication, parallel I/O, etc.) that are what everyone thinks of as "MPI-2 functionality."
4. Bindings for Fortran 90 and C++. MPI-2 specifies C++ bindings for both MPI-1 and MPI-2 functions, and extensions to the Fortran 77 binding of MPI-1 and MPI-2 to handle Fortran 90 issues.
5. Discussions of areas in which the MPI process and framework seem likely to be useful, but where more discussion and experience are needed before standardization (e.g. zero-copy semantics on shared-memory machines, real-time specifications).
Corrections and clarifications (items of type 1 in the above list) were collected in Chapter 3 of the MPI-2 document: "Version 1.2 of MPI." That chapter also contains the function for identifying the version number. Additions to MPI-1.1 (items of types 2, 3, and 4 in the above list) are in the remaining chapters of the MPI-2 document, and constitute the specification for MPI-2. Items of type 5 in the above list have been moved to a separate document, the "MPI Journal of Development" (JOD), and are not part of the MPI-2 Standard.
This structure makes it easy for users and implementors to understand what level of MPI compliance a given implementation has:
MPI-1 compliance will mean compliance with MPI-1.3. This is a useful level of compliance. It means that the implementation conforms to the clarifications of MPI-1.1 function behavior given in Chapter 3 of the MPI-2 document. Some implementations may require changes to be MPI-1 compliant.
MPI-2 compliance will mean compliance with all of MPI-2.1.
The MPI Journal of Development is not part of the MPI Standard.
It is to be emphasized that forward compatibility is preserved. That is, a valid MPI-1.1 program is both a valid MPI-1.3 program and a valid MPI-2.1 program, and a valid MPI-1.3 program is a valid MPI-2.1 program.
1.4 Background of MPI-1.3 and MPI-2.1
After the release of MPI-2.0, the MPI Forum kept working on errata and clarifications for both standard documents (MPI-1.1 and MPI-2.0). The short document "Errata for MPI-1.1" was released October 12, 1998. On July 5, 2001, a first ballot of errata and clarifications for
MPI-2.0 was released, and a second ballot was voted on May 22, 2002. Both votes were done electronically. Both ballots were combined into one document: "Errata for MPI-2," May 15, 2002. This errata process was then interrupted, but the Forum and its e-mail reflectors kept working on new requests for clarification.

Restarting regular work of the MPI Forum was initiated in three meetings, at EuroPVM/MPI'06 in Bonn, at EuroPVM/MPI'07 in Paris, and at SC'07 in Reno. In December 2007, a steering committee started the organization of new MPI Forum meetings at regular 8-week intervals. At the January 14-16, 2008 meeting in Chicago, the MPI Forum decided to combine the existing and future MPI documents into one single document for each version of the MPI standard. For technical and historical reasons, this series was started with MPI-1.3. Additional Ballots 3 and 4 resolved old questions from the errata list started in 1995 as well as new questions from recent years. After all documents (MPI-1.1, MPI-2, Errata for MPI-1.1 (Oct. 12, 1998), and MPI-2.1 Ballots 1-4) were combined into one draft document, a chapter author and review team were defined for each chapter. They cleaned up the document to achieve a consistent MPI-2.1 document. The final MPI-2.1 standard document was finished in June 2008, and finally released with a second vote in September 2008 at the meeting in Dublin, just before EuroPVM/MPI'08. The major work of the current MPI Forum is the preparation of MPI-3.
1.5 Background of MPI-2.2
MPI-2.2 is a minor update to the MPI-2.1 standard. This version addresses additional errors and ambiguities that were not corrected in the MPI-2.1 standard, as well as a small number of extensions to MPI-2.1 that met the following criteria:

- Any correct MPI-2.1 program is a correct MPI-2.2 program.
- Any extension must have significant benefit for users.
- Any extension must not require significant implementation effort. To that end, all such changes are accompanied by an open source implementation.

The discussions of MPI-2.2 proceeded concurrently with the MPI-3 discussions; in some cases, extensions were proposed for MPI-2.2 but were later moved to MPI-3.
1.6 Who Should Use This Standard?
This standard is intended for use by all those who want to write portable message-passing programs in Fortran, C and C++. This includes individual application programmers, developers of software designed to run on parallel machines, and creators of environments and tools. In order to be attractive to this wide audience, the standard must provide a simple, easy-to-use interface for the basic user while not semantically precluding the high-performance message-passing operations available on advanced machines.
1.7 What Platforms Are Targets For Implementation?
The attractiveness of the message-passing paradigm at least partially stems from its wide portability. Programs expressed this way may run on distributed-memory multiprocessors,