- Contents
- List of Figures
- List of Tables
- Acknowledgments
- Introduction to MPI
- Overview and Goals
- Background of MPI-1.0
- Background of MPI-1.1, MPI-1.2, and MPI-2.0
- Background of MPI-1.3 and MPI-2.1
- Background of MPI-2.2
- Who Should Use This Standard?
- What Platforms Are Targets For Implementation?
- What Is Included In The Standard?
- What Is Not Included In The Standard?
- Organization of this Document
- MPI Terms and Conventions
- Document Notation
- Naming Conventions
- Semantic Terms
- Data Types
- Opaque Objects
- Array Arguments
- State
- Named Constants
- Choice
- Addresses
- Language Binding
- Deprecated Names and Functions
- Fortran Binding Issues
- C Binding Issues
- C++ Binding Issues
- Functions and Macros
- Processes
- Error Handling
- Implementation Issues
- Independence of Basic Runtime Routines
- Interaction with Signals
- Examples
- Point-to-Point Communication
- Introduction
- Blocking Send and Receive Operations
- Blocking Send
- Message Data
- Message Envelope
- Blocking Receive
- Return Status
- Passing MPI_STATUS_IGNORE for Status
- Data Type Matching and Data Conversion
- Type Matching Rules
- Type MPI_CHARACTER
- Data Conversion
- Communication Modes
- Semantics of Point-to-Point Communication
- Buffer Allocation and Usage
- Nonblocking Communication
- Communication Request Objects
- Communication Initiation
- Communication Completion
- Semantics of Nonblocking Communications
- Multiple Completions
- Non-destructive Test of status
- Probe and Cancel
- Persistent Communication Requests
- Send-Receive
- Null Processes
- Datatypes
- Derived Datatypes
- Type Constructors with Explicit Addresses
- Datatype Constructors
- Subarray Datatype Constructor
- Distributed Array Datatype Constructor
- Address and Size Functions
- Lower-Bound and Upper-Bound Markers
- Extent and Bounds of Datatypes
- True Extent of Datatypes
- Commit and Free
- Duplicating a Datatype
- Use of General Datatypes in Communication
- Correct Use of Addresses
- Decoding a Datatype
- Examples
- Pack and Unpack
- Canonical MPI_PACK and MPI_UNPACK
- Collective Communication
- Introduction and Overview
- Communicator Argument
- Applying Collective Operations to Intercommunicators
- Barrier Synchronization
- Broadcast
- Example using MPI_BCAST
- Gather
- Examples using MPI_GATHER, MPI_GATHERV
- Scatter
- Examples using MPI_SCATTER, MPI_SCATTERV
- Example using MPI_ALLGATHER
- All-to-All Scatter/Gather
- Global Reduction Operations
- Reduce
- Signed Characters and Reductions
- MINLOC and MAXLOC
- All-Reduce
- Process-local reduction
- Reduce-Scatter
- MPI_REDUCE_SCATTER_BLOCK
- MPI_REDUCE_SCATTER
- Scan
- Inclusive Scan
- Exclusive Scan
- Example using MPI_SCAN
- Correctness
- Introduction
- Features Needed to Support Libraries
- MPI's Support for Libraries
- Basic Concepts
- Groups
- Contexts
- Intra-Communicators
- Group Management
- Group Accessors
- Group Constructors
- Group Destructors
- Communicator Management
- Communicator Accessors
- Communicator Constructors
- Communicator Destructors
- Motivating Examples
- Current Practice #1
- Current Practice #2
- (Approximate) Current Practice #3
- Example #4
- Library Example #1
- Library Example #2
- Inter-Communication
- Inter-communicator Accessors
- Inter-communicator Operations
- Inter-Communication Examples
- Caching
- Functionality
- Communicators
- Windows
- Datatypes
- Error Class for Invalid Keyval
- Attributes Example
- Naming Objects
- Formalizing the Loosely Synchronous Model
- Basic Statements
- Models of Execution
- Static communicator allocation
- Dynamic communicator allocation
- The General case
- Process Topologies
- Introduction
- Virtual Topologies
- Embedding in MPI
- Overview of the Functions
- Topology Constructors
- Cartesian Constructor
- Cartesian Convenience Function: MPI_DIMS_CREATE
- General (Graph) Constructor
- Distributed (Graph) Constructor
- Topology Inquiry Functions
- Cartesian Shift Coordinates
- Partitioning of Cartesian structures
- Low-Level Topology Functions
- An Application Example
- MPI Environmental Management
- Implementation Information
- Version Inquiries
- Environmental Inquiries
- Tag Values
- Host Rank
- IO Rank
- Clock Synchronization
- Memory Allocation
- Error Handling
- Error Handlers for Communicators
- Error Handlers for Windows
- Error Handlers for Files
- Freeing Errorhandlers and Retrieving Error Strings
- Error Codes and Classes
- Error Classes, Error Codes, and Error Handlers
- Timers and Synchronization
- Startup
- Allowing User Functions at Process Termination
- Determining Whether MPI Has Finished
- Portable MPI Process Startup
- The Info Object
- Process Creation and Management
- Introduction
- The Dynamic Process Model
- Starting Processes
- The Runtime Environment
- Process Manager Interface
- Processes in MPI
- Starting Processes and Establishing Communication
- Reserved Keys
- Spawn Example
- Manager-worker Example, Using MPI_COMM_SPAWN
- Establishing Communication
- Names, Addresses, Ports, and All That
- Server Routines
- Client Routines
- Name Publishing
- Reserved Key Values
- Client/Server Examples
- Ocean/Atmosphere - Relies on Name Publishing
- Simple Client-Server Example
- Other Functionality
- Universe Size
- Singleton MPI_INIT
- MPI_APPNUM
- Releasing Connections
- Another Way to Establish MPI Communication
- One-Sided Communications
- Introduction
- Initialization
- Window Creation
- Window Attributes
- Communication Calls
- Examples
- Accumulate Functions
- Synchronization Calls
- Fence
- General Active Target Synchronization
- Lock
- Assertions
- Examples
- Error Handling
- Error Handlers
- Error Classes
- Semantics and Correctness
- Atomicity
- Progress
- Registers and Compiler Optimizations
- External Interfaces
- Introduction
- Generalized Requests
- Examples
- Associating Information with Status
- MPI and Threads
- General
- Initialization
- Introduction
- File Manipulation
- Opening a File
- Closing a File
- Deleting a File
- Resizing a File
- Preallocating Space for a File
- Querying the Size of a File
- Querying File Parameters
- File Info
- Reserved File Hints
- File Views
- Data Access
- Data Access Routines
- Positioning
- Synchronism
- Coordination
- Data Access Conventions
- Data Access with Individual File Pointers
- Data Access with Shared File Pointers
- Noncollective Operations
- Collective Operations
- Seek
- Split Collective Data Access Routines
- File Interoperability
- Datatypes for File Interoperability
- Extent Callback
- Datarep Conversion Functions
- Matching Data Representations
- Consistency and Semantics
- File Consistency
- Random Access vs. Sequential Files
- Progress
- Collective File Operations
- Type Matching
- Logical vs. Physical File Layout
- File Size
- Examples
- Asynchronous I/O
- I/O Error Handling
- I/O Error Classes
- Examples
- Subarray Filetype Constructor
- Requirements
- Discussion
- Logic of the Design
- Examples
- MPI Library Implementation
- Systems with Weak Symbols
- Systems Without Weak Symbols
- Complications
- Multiple Counting
- Linker Oddities
- Multiple Levels of Interception
- Deprecated Functions
- Deprecated since MPI-2.0
- Deprecated since MPI-2.2
- Language Bindings
- Overview
- Design
- C++ Classes for MPI
- Class Member Functions for MPI
- Semantics
- C++ Datatypes
- Communicators
- Exceptions
- Mixed-Language Operability
- Problems With Fortran Bindings for MPI
- Problems Due to Strong Typing
- Problems Due to Data Copying and Sequence Association
- Special Constants
- Fortran 90 Derived Types
- A Problem with Register Optimization
- Basic Fortran Support
- Extended Fortran Support
- The mpi Module
- No Type Mismatch Problems for Subroutines with Choice Arguments
- Additional Support for Fortran Numeric Intrinsic Types
- Language Interoperability
- Introduction
- Assumptions
- Initialization
- Transfer of Handles
- Status
- MPI Opaque Objects
- Datatypes
- Callback Functions
- Error Handlers
- Reduce Operations
- Addresses
- Attributes
- Extra State
- Constants
- Interlanguage Communication
- Language Bindings Summary
- Groups, Contexts, Communicators, and Caching Fortran Bindings
- External Interfaces C++ Bindings
- Change-Log
- Bibliography
- Examples Index
- MPI Declarations Index
- MPI Function Index
CHAPTER 3. POINT-TO-POINT COMMUNICATION
operation to select a particular message. The last three parameters of the send operation, along with the rank of the sender, specify the envelope for the message sent. Process one (myrank = 1) receives this message with the receive operation MPI_RECV. The message to be received is selected according to the value of its envelope, and the message data is stored into the receive buffer. In the example above, the receive buffer consists of the storage containing the string message in the memory of process one. The first three parameters of the receive operation specify the location, size and type of the receive buffer. The next three parameters are used for selecting the incoming message. The last parameter is used to return information on the message just received.

The next sections describe the blocking send and receive operations. We discuss send, receive, blocking communication semantics, type matching requirements, type conversion in heterogeneous environments, and more general communication modes. Nonblocking communication is addressed next, followed by channel-like constructs and send-receive operations, ending with a description of the "dummy" process, MPI_PROC_NULL.
3.2 Blocking Send and Receive Operations
3.2.1 Blocking Send
The syntax of the blocking send operation is given below.
MPI_SEND(buf, count, datatype, dest, tag, comm)

  IN   buf        initial address of send buffer (choice)
  IN   count      number of elements in send buffer (non-negative integer)
  IN   datatype   datatype of each send buffer element (handle)
  IN   dest       rank of destination (integer)
  IN   tag        message tag (integer)
  IN   comm       communicator (handle)
int MPI_Send(void* buf, int count, MPI_Datatype datatype, int dest,
             int tag, MPI_Comm comm)
MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
    <type> BUF(*)
    INTEGER COUNT, DATATYPE, DEST, TAG, COMM, IERROR
3.2.2 Message Data
The send buffer specified by the MPI_SEND operation consists of count successive entries of the type indicated by datatype, starting with the entry at address buf. Note that we specify the message length in terms of number of elements, not number of bytes. The former is machine independent and closer to the application level.
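To make the distinction concrete: the machine-dependent byte size of such a buffer can always be derived from the element count, but not vice versa. A small sketch (the helper name is illustrative, not part of MPI):

```c
#include <stddef.h>

/* Byte size of a send buffer described as "count elements of a given
   element size".  The element count is what MPI_SEND takes and is the
   same on every machine; the byte count depends on the platform's
   representation of the element type. */
size_t buffer_bytes(int count, size_t element_size)
{
    return (size_t)count * element_size;
}
```

For example, a buffer of 10 doubles is always described to MPI as count = 10, while buffer_bytes(10, sizeof(double)) varies with the platform (80 where doubles occupy 8 bytes).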
The data part of the message consists of a sequence of count values, each of the type indicated by datatype. count may be zero, in which case the data part of the message is empty. The basic datatypes that can be specified for message data values correspond to the basic datatypes of the host language. Possible values of this argument for Fortran and the corresponding Fortran types are listed in Table 3.1.
MPI datatype            Fortran datatype
MPI_INTEGER             INTEGER
MPI_REAL                REAL
MPI_DOUBLE_PRECISION    DOUBLE PRECISION
MPI_COMPLEX             COMPLEX
MPI_LOGICAL             LOGICAL
MPI_CHARACTER           CHARACTER(1)
MPI_BYTE
MPI_PACKED

Table 3.1: Predefined MPI datatypes corresponding to Fortran datatypes
Possible values for this argument for C and the corresponding C types are listed in Table 3.2.
The datatypes MPI_BYTE and MPI_PACKED do not correspond to a Fortran or C datatype. A value of type MPI_BYTE consists of a byte (8 binary digits). A byte is uninterpreted and is different from a character. Different machines may have different representations for characters, or may use more than one byte to represent characters. On the other hand, a byte has the same binary value on all machines. The use of the type MPI_PACKED is explained in Section 4.2.
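The byte/character distinction is visible in plain C: the width of a byte is fixed, while character types vary by platform. A sketch (the two names are illustrative, not MPI definitions):

```c
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* A value of type MPI_BYTE is an uninterpreted 8-bit quantity, the
   same on every machine.  A character type may be wider than one byte
   and its encoding is platform dependent; wchar_t (the C type behind
   MPI_WCHAR) commonly occupies 2 or 4 bytes. */
enum { BITS_PER_BYTE = CHAR_BIT };                      /* 8 on common platforms */
static const size_t WIDE_CHAR_BYTES = sizeof(wchar_t); /* varies by platform */
```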
MPI requires support of these datatypes, which match the basic datatypes of Fortran and ISO C. Additional MPI datatypes should be provided if the host language has additional data types: MPI_DOUBLE_COMPLEX for double precision complex in Fortran declared to be of type DOUBLE COMPLEX; MPI_REAL2, MPI_REAL4 and MPI_REAL8 for Fortran reals, declared to be of type REAL*2, REAL*4 and REAL*8, respectively; MPI_INTEGER1, MPI_INTEGER2, and MPI_INTEGER4 for Fortran integers, declared to be of type INTEGER*1, INTEGER*2 and INTEGER*4, respectively; etc.
Rationale. One goal of the design is to allow for MPI to be implemented as a library, with no need for additional preprocessing or compilation. Thus, one cannot assume that a communication call has information on the datatype of variables in the communication buffer; this information must be supplied by an explicit argument. The need for such datatype information will become clear in Section 3.3.2. (End of rationale.)
MPI datatype                            C datatype
MPI_CHAR                                char (treated as printable character)
MPI_SHORT                               signed short int
MPI_INT                                 signed int
MPI_LONG                                signed long int
MPI_LONG_LONG_INT                       signed long long int
MPI_LONG_LONG (as a synonym)            signed long long int
MPI_SIGNED_CHAR                         signed char (treated as integral value)
MPI_UNSIGNED_CHAR                       unsigned char (treated as integral value)
MPI_UNSIGNED_SHORT                      unsigned short int
MPI_UNSIGNED                            unsigned int
MPI_UNSIGNED_LONG                       unsigned long int
MPI_UNSIGNED_LONG_LONG                  unsigned long long int
MPI_FLOAT                               float
MPI_DOUBLE                              double
MPI_LONG_DOUBLE                         long double
MPI_WCHAR                               wchar_t (defined in <stddef.h>;
                                        treated as printable character)
MPI_C_BOOL                              _Bool
MPI_INT8_T                              int8_t
MPI_INT16_T                             int16_t
MPI_INT32_T                             int32_t
MPI_INT64_T                             int64_t
MPI_UINT8_T                             uint8_t
MPI_UINT16_T                            uint16_t
MPI_UINT32_T                            uint32_t
MPI_UINT64_T                            uint64_t
MPI_C_COMPLEX                           float _Complex
MPI_C_FLOAT_COMPLEX (as a synonym)      float _Complex
MPI_C_DOUBLE_COMPLEX                    double _Complex
MPI_C_LONG_DOUBLE_COMPLEX               long double _Complex
MPI_BYTE
MPI_PACKED

Table 3.2: Predefined MPI datatypes corresponding to C datatypes
Rationale. The datatypes MPI_C_BOOL, MPI_INT8_T, MPI_INT16_T, MPI_INT32_T, MPI_UINT8_T, MPI_UINT16_T, MPI_UINT32_T, MPI_C_COMPLEX, MPI_C_FLOAT_COMPLEX, MPI_C_DOUBLE_COMPLEX, and MPI_C_LONG_DOUBLE_COMPLEX have no corresponding C++ bindings. This was intentionally done to avoid potential collisions with the C preprocessor and namespaced C++ names. C++ applications can use the C bindings with no loss of functionality. (End of rationale.)
MPI datatype    C datatype    Fortran datatype
MPI_AINT        MPI_Aint      INTEGER (KIND=MPI_ADDRESS_KIND)
MPI_OFFSET      MPI_Offset    INTEGER (KIND=MPI_OFFSET_KIND)

Table 3.3: Predefined MPI datatypes corresponding to both C and Fortran datatypes
The datatypes MPI_AINT and MPI_OFFSET correspond to the MPI-defined C types MPI_Aint and MPI_Offset and their Fortran equivalents INTEGER (KIND=MPI_ADDRESS_KIND) and INTEGER (KIND=MPI_OFFSET_KIND). This is described in Table 3.3. See Section 16.3.10 for information on interlanguage communication with these types.
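Informally, these are integer types wide enough to hold any address and any file offset, respectively. A sketch using standard C stand-ins (my_aint and my_offset are illustrative substitutes, not the real MPI typedefs):

```c
#include <stdint.h>

/* MPI_Aint must be able to hold any address; outside the MPI headers,
   intptr_t plays the same role.  MPI_Offset must be able to hold any
   file offset, for which 64 bits is the common choice. */
typedef intptr_t my_aint;   /* address-sized signed integer */
typedef int64_t  my_offset; /* 64-bit file offset */
```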
3.2.3 Message Envelope
In addition to the data part, messages carry information that can be used to distinguish messages and selectively receive them. This information consists of a fixed number of fields, which we collectively call the message envelope. These fields are

    source
    destination
    tag
    communicator
The message source is implicitly determined by the identity of the message sender. The other fields are specified by arguments in the send operation.
The message destination is specified by the dest argument.
The integer-valued message tag is specified by the tag argument. This integer can be used by the program to distinguish different types of messages. The range of valid tag values is 0,...,UB, where the value of UB is implementation dependent. It can be found by querying the value of the attribute MPI_TAG_UB, as described in Chapter 8. MPI requires that UB be no less than 32767.
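For instance, a portable program can validate its tags against the bound it queries at run time (in a real program the bound would come from an attribute query with the MPI_TAG_UB key; the helper below is an illustrative sketch):

```c
/* Smallest upper bound an MPI implementation is allowed to choose
   for tag values. */
#define MIN_TAG_UB 32767

/* A tag is valid if it lies in 0..tag_ub, where tag_ub is the
   implementation's value of the MPI_TAG_UB attribute. */
int tag_is_valid(int tag, int tag_ub)
{
    return tag >= 0 && tag <= tag_ub;
}
```

A program that restricts itself to tags in 0..32767 is therefore portable to every conforming implementation.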
The comm argument specifies the communicator that is used for the send operation. Communicators are explained in Chapter 6; below is a brief summary of their usage.
A communicator specifies the communication context for a communication operation. Each communication context provides a separate "communication universe": messages are always received within the context they were sent, and messages sent in different contexts do not interfere.
The communicator also specifies the set of processes that share this communication context. This process group is ordered and processes are identified by their rank within this group. Thus, the range of valid values for dest is 0, ..., n-1, where n is the number of processes in the group. (If the communicator is an inter-communicator, then destinations are identified by their rank in the remote group. See Chapter 6.)
A predefined communicator MPI_COMM_WORLD is provided by MPI. It allows communication with all processes that are accessible after MPI initialization, and processes are identified by their rank in the group of MPI_COMM_WORLD.
Advice to users. Users that are comfortable with the notion of a flat name space for processes, and a single communication context, as offered by most existing communication libraries, need only use the predefined variable MPI_COMM_WORLD as the