- Contents
- List of Figures
- List of Tables
- Acknowledgments
- Introduction to MPI
- Overview and Goals
- Background of MPI-1.0
- Background of MPI-1.1, MPI-1.2, and MPI-2.0
- Background of MPI-1.3 and MPI-2.1
- Background of MPI-2.2
- Who Should Use This Standard?
- What Platforms Are Targets For Implementation?
- What Is Included In The Standard?
- What Is Not Included In The Standard?
- Organization of this Document
- MPI Terms and Conventions
- Document Notation
- Naming Conventions
- Semantic Terms
- Data Types
- Opaque Objects
- Array Arguments
- State
- Named Constants
- Choice
- Addresses
- Language Binding
- Deprecated Names and Functions
- Fortran Binding Issues
- C Binding Issues
- C++ Binding Issues
- Functions and Macros
- Processes
- Error Handling
- Implementation Issues
- Independence of Basic Runtime Routines
- Interaction with Signals
- Examples
- Point-to-Point Communication
- Introduction
- Blocking Send and Receive Operations
- Blocking Send
- Message Data
- Message Envelope
- Blocking Receive
- Return Status
- Passing MPI_STATUS_IGNORE for Status
- Data Type Matching and Data Conversion
- Type Matching Rules
- Type MPI_CHARACTER
- Data Conversion
- Communication Modes
- Semantics of Point-to-Point Communication
- Buffer Allocation and Usage
- Nonblocking Communication
- Communication Request Objects
- Communication Initiation
- Communication Completion
- Semantics of Nonblocking Communications
- Multiple Completions
- Non-destructive Test of status
- Probe and Cancel
- Persistent Communication Requests
- Send-Receive
- Null Processes
- Datatypes
- Derived Datatypes
- Type Constructors with Explicit Addresses
- Datatype Constructors
- Subarray Datatype Constructor
- Distributed Array Datatype Constructor
- Address and Size Functions
- Lower-Bound and Upper-Bound Markers
- Extent and Bounds of Datatypes
- True Extent of Datatypes
- Commit and Free
- Duplicating a Datatype
- Use of General Datatypes in Communication
- Correct Use of Addresses
- Decoding a Datatype
- Examples
- Pack and Unpack
- Canonical MPI_PACK and MPI_UNPACK
- Collective Communication
- Introduction and Overview
- Communicator Argument
- Applying Collective Operations to Intercommunicators
- Barrier Synchronization
- Broadcast
- Example using MPI_BCAST
- Gather
- Examples using MPI_GATHER, MPI_GATHERV
- Scatter
- Examples using MPI_SCATTER, MPI_SCATTERV
- Example using MPI_ALLGATHER
- All-to-All Scatter/Gather
- Global Reduction Operations
- Reduce
- Signed Characters and Reductions
- MINLOC and MAXLOC
- All-Reduce
- Process-local reduction
- Reduce-Scatter
- MPI_REDUCE_SCATTER_BLOCK
- MPI_REDUCE_SCATTER
- Scan
- Inclusive Scan
- Exclusive Scan
- Example using MPI_SCAN
- Correctness
- Introduction
- Features Needed to Support Libraries
- MPI's Support for Libraries
- Basic Concepts
- Groups
- Contexts
- Intra-Communicators
- Group Management
- Group Accessors
- Group Constructors
- Group Destructors
- Communicator Management
- Communicator Accessors
- Communicator Constructors
- Communicator Destructors
- Motivating Examples
- Current Practice #1
- Current Practice #2
- (Approximate) Current Practice #3
- Example #4
- Library Example #1
- Library Example #2
- Inter-Communication
- Inter-communicator Accessors
- Inter-communicator Operations
- Inter-Communication Examples
- Caching
- Functionality
- Communicators
- Windows
- Datatypes
- Error Class for Invalid Keyval
- Attributes Example
- Naming Objects
- Formalizing the Loosely Synchronous Model
- Basic Statements
- Models of Execution
- Static communicator allocation
- Dynamic communicator allocation
- The General case
- Process Topologies
- Introduction
- Virtual Topologies
- Embedding in MPI
- Overview of the Functions
- Topology Constructors
- Cartesian Constructor
- Cartesian Convenience Function: MPI_DIMS_CREATE
- General (Graph) Constructor
- Distributed (Graph) Constructor
- Topology Inquiry Functions
- Cartesian Shift Coordinates
- Partitioning of Cartesian structures
- Low-Level Topology Functions
- An Application Example
- MPI Environmental Management
- Implementation Information
- Version Inquiries
- Environmental Inquiries
- Tag Values
- Host Rank
- IO Rank
- Clock Synchronization
- Memory Allocation
- Error Handling
- Error Handlers for Communicators
- Error Handlers for Windows
- Error Handlers for Files
- Freeing Errorhandlers and Retrieving Error Strings
- Error Codes and Classes
- Error Classes, Error Codes, and Error Handlers
- Timers and Synchronization
- Startup
- Allowing User Functions at Process Termination
- Determining Whether MPI Has Finished
- Portable MPI Process Startup
- The Info Object
- Process Creation and Management
- Introduction
- The Dynamic Process Model
- Starting Processes
- The Runtime Environment
- Process Manager Interface
- Processes in MPI
- Starting Processes and Establishing Communication
- Reserved Keys
- Spawn Example
- Manager-worker Example, Using MPI_COMM_SPAWN.
- Establishing Communication
- Names, Addresses, Ports, and All That
- Server Routines
- Client Routines
- Name Publishing
- Reserved Key Values
- Client/Server Examples
- Ocean/Atmosphere - Relies on Name Publishing
- Simple Client-Server Example.
- Other Functionality
- Universe Size
- Singleton MPI_INIT
- MPI_APPNUM
- Releasing Connections
- Another Way to Establish MPI Communication
- One-Sided Communications
- Introduction
- Initialization
- Window Creation
- Window Attributes
- Communication Calls
- Examples
- Accumulate Functions
- Synchronization Calls
- Fence
- General Active Target Synchronization
- Lock
- Assertions
- Examples
- Error Handling
- Error Handlers
- Error Classes
- Semantics and Correctness
- Atomicity
- Progress
- Registers and Compiler Optimizations
- External Interfaces
- Introduction
- Generalized Requests
- Examples
- Associating Information with Status
- MPI and Threads
- General
- Initialization
- Introduction
- File Manipulation
- Opening a File
- Closing a File
- Deleting a File
- Resizing a File
- Preallocating Space for a File
- Querying the Size of a File
- Querying File Parameters
- File Info
- Reserved File Hints
- File Views
- Data Access
- Data Access Routines
- Positioning
- Synchronism
- Coordination
- Data Access Conventions
- Data Access with Individual File Pointers
- Data Access with Shared File Pointers
- Noncollective Operations
- Collective Operations
- Seek
- Split Collective Data Access Routines
- File Interoperability
- Datatypes for File Interoperability
- Extent Callback
- Datarep Conversion Functions
- Matching Data Representations
- Consistency and Semantics
- File Consistency
- Random Access vs. Sequential Files
- Progress
- Collective File Operations
- Type Matching
- Logical vs. Physical File Layout
- File Size
- Examples
- Asynchronous I/O
- I/O Error Handling
- I/O Error Classes
- Examples
- Subarray Filetype Constructor
- Requirements
- Discussion
- Logic of the Design
- Examples
- MPI Library Implementation
- Systems with Weak Symbols
- Systems Without Weak Symbols
- Complications
- Multiple Counting
- Linker Oddities
- Multiple Levels of Interception
- Deprecated Functions
- Deprecated since MPI-2.0
- Deprecated since MPI-2.2
- Language Bindings
- Overview
- Design
- C++ Classes for MPI
- Class Member Functions for MPI
- Semantics
- C++ Datatypes
- Communicators
- Exceptions
- Mixed-Language Operability
- Problems With Fortran Bindings for MPI
- Problems Due to Strong Typing
- Problems Due to Data Copying and Sequence Association
- Special Constants
- Fortran 90 Derived Types
- A Problem with Register Optimization
- Basic Fortran Support
- Extended Fortran Support
- The mpi Module
- No Type Mismatch Problems for Subroutines with Choice Arguments
- Additional Support for Fortran Numeric Intrinsic Types
- Language Interoperability
- Introduction
- Assumptions
- Initialization
- Transfer of Handles
- Status
- MPI Opaque Objects
- Datatypes
- Callback Functions
- Error Handlers
- Reduce Operations
- Addresses
- Attributes
- Extra State
- Constants
- Interlanguage Communication
- Language Bindings Summary
- Groups, Contexts, Communicators, and Caching Fortran Bindings
- External Interfaces C++ Bindings
- Change-Log
- Bibliography
- Examples Index
- MPI Declarations Index
- MPI Function Index
CHAPTER 3. POINT-TO-POINT COMMUNICATION
    DO WHILE(.TRUE.)
       CALL MPI_WAITSOME(size, request_list, numdone,
                         indices, statuses, ierr)
       DO i=1, numdone
          CALL DO_SERVICE(a(1, indices(i)))
          CALL MPI_IRECV(a(1, indices(i)), n, MPI_REAL, 0, tag,
                         comm, request_list(indices(i)), ierr)
       END DO
    END DO
END IF
3.7.6 Non-destructive Test of status
This call is useful for accessing the information associated with a request, without freeing the request (in case the user is expected to access it later). It allows one to layer libraries more conveniently, since multiple layers of software may access the same completed request and extract from it the status information.
MPI_REQUEST_GET_STATUS(request, flag, status)

  IN    request    request (handle)
  OUT   flag       boolean flag, same as from MPI_TEST (logical)
  OUT   status     MPI_STATUS object if flag is true (Status)

int MPI_Request_get_status(MPI_Request request, int *flag,
              MPI_Status *status)

MPI_REQUEST_GET_STATUS(REQUEST, FLAG, STATUS, IERROR)
    INTEGER REQUEST, STATUS(MPI_STATUS_SIZE), IERROR
    LOGICAL FLAG

{bool MPI::Request::Get_status(MPI::Status& status) const (binding deprecated, see Section 15.2)}

{bool MPI::Request::Get_status() const (binding deprecated, see Section 15.2)}
Sets flag=true if the operation is complete, and, if so, returns in status the request status. However, unlike test or wait, it does not deallocate or inactivate the request; a subsequent call to test, wait or free should be executed with that request. It sets flag=false if the operation is not complete.

One is allowed to call MPI_REQUEST_GET_STATUS with a null or inactive request argument. In such a case the operation returns with flag=true and empty status.
3.8 Probe and Cancel
The MPI_PROBE and MPI_IPROBE operations allow incoming messages to be checked for, without actually receiving them. The user can then decide how to receive them, based on the information returned by the probe (basically, the information returned by status). In particular, the user may allocate memory for the receive buffer, according to the length of the probed message.
The MPI_CANCEL operation allows pending communications to be canceled. This is required for cleanup. Posting a send or a receive ties up user resources (send or receive buffers), and a cancel may be needed to free these resources gracefully.
MPI_IPROBE(source, tag, comm, flag, status)

  IN    source    rank of source or MPI_ANY_SOURCE (integer)
  IN    tag       message tag or MPI_ANY_TAG (integer)
  IN    comm      communicator (handle)
  OUT   flag      (logical)
  OUT   status    status object (Status)

int MPI_Iprobe(int source, int tag, MPI_Comm comm, int *flag,
              MPI_Status *status)

MPI_IPROBE(SOURCE, TAG, COMM, FLAG, STATUS, IERROR)
    LOGICAL FLAG
    INTEGER SOURCE, TAG, COMM, STATUS(MPI_STATUS_SIZE), IERROR

{bool MPI::Comm::Iprobe(int source, int tag, MPI::Status& status) const (binding deprecated, see Section 15.2)}

{bool MPI::Comm::Iprobe(int source, int tag) const (binding deprecated, see Section 15.2)}
MPI_IPROBE(source, tag, comm, flag, status) returns flag = true if there is a message that can be received and that matches the pattern specified by the arguments source, tag, and comm. The call matches the same message that would have been received by a call to MPI_RECV(..., source, tag, comm, status) executed at the same point in the program, and returns in status the same value that would have been returned by MPI_RECV(). Otherwise, the call returns flag = false, and leaves status undefined.

If MPI_IPROBE returns flag = true, then the content of the status object can be subsequently accessed as described in Section 3.2.5 to find the source, tag and length of the probed message.

A subsequent receive executed with the same communicator, and the source and tag returned in status by MPI_IPROBE will receive the message that was matched by the probe, if no other intervening receive occurs after the probe, and the send is not successfully cancelled before the receive. If the receiving process is multi-threaded, it is the user's responsibility to ensure that the last condition holds.

The source argument of MPI_PROBE can be MPI_ANY_SOURCE, and the tag argument can be MPI_ANY_TAG, so that one can probe for messages from an arbitrary source and/or with an arbitrary tag. However, a specific communication context must be provided with the comm argument.
It is not necessary to receive a message immediately after it has been probed for, and the same message may be probed for several times before it is received.
MPI_PROBE(source, tag, comm, status)

  IN    source    rank of source or MPI_ANY_SOURCE (integer)
  IN    tag       message tag or MPI_ANY_TAG (integer)
  IN    comm      communicator (handle)
  OUT   status    status object (Status)
int MPI_Probe(int source, int tag, MPI_Comm comm, MPI_Status *status)

MPI_PROBE(SOURCE, TAG, COMM, STATUS, IERROR)
    INTEGER SOURCE, TAG, COMM, STATUS(MPI_STATUS_SIZE), IERROR

{void MPI::Comm::Probe(int source, int tag, MPI::Status& status) const (binding deprecated, see Section 15.2)}

{void MPI::Comm::Probe(int source, int tag) const (binding deprecated, see Section 15.2)}
MPI_PROBE behaves like MPI_IPROBE except that it is a blocking call that returns only after a matching message has been found.

The MPI implementation of MPI_PROBE and MPI_IPROBE needs to guarantee progress: if a call to MPI_PROBE has been issued by a process, and a send that matches the probe has been initiated by some process, then the call to MPI_PROBE will return, unless the message is received by another concurrent receive operation (that is executed by another thread at the probing process). Similarly, if a process busy waits with MPI_IPROBE and a matching message has been issued, then the call to MPI_IPROBE will eventually return flag = true unless the message is received by another concurrent receive operation.
Example 3.18 Use blocking probe to wait for an incoming message.
CALL MPI_COMM_RANK(comm, rank, ierr)
IF (rank.EQ.0) THEN
     CALL MPI_SEND(i, 1, MPI_INTEGER, 2, 0, comm, ierr)
ELSE IF (rank.EQ.1) THEN
     CALL MPI_SEND(x, 1, MPI_REAL, 2, 0, comm, ierr)
ELSE IF (rank.EQ.2) THEN
     DO i=1, 2
        CALL MPI_PROBE(MPI_ANY_SOURCE, 0,
                       comm, status, ierr)
        IF (status(MPI_SOURCE) .EQ. 0) THEN
100        CALL MPI_RECV(i, 1, MPI_INTEGER, 0, 0, comm, status, ierr)
        ELSE
200        CALL MPI_RECV(x, 1, MPI_REAL, 1, 0, comm, status, ierr)
        END IF
     END DO
END IF
Each message is received with the right type.
Example 3.19 A similar program to the previous example, but now it has a problem.
CALL MPI_COMM_RANK(comm, rank, ierr)
IF (rank.EQ.0) THEN
     CALL MPI_SEND(i, 1, MPI_INTEGER, 2, 0, comm, ierr)
ELSE IF (rank.EQ.1) THEN
     CALL MPI_SEND(x, 1, MPI_REAL, 2, 0, comm, ierr)
ELSE IF (rank.EQ.2) THEN
     DO i=1, 2
        CALL MPI_PROBE(MPI_ANY_SOURCE, 0,
                       comm, status, ierr)
        IF (status(MPI_SOURCE) .EQ. 0) THEN
100        CALL MPI_RECV(i, 1, MPI_INTEGER, MPI_ANY_SOURCE,
                         0, comm, status, ierr)
        ELSE
200        CALL MPI_RECV(x, 1, MPI_REAL, MPI_ANY_SOURCE,
                         0, comm, status, ierr)
        END IF
     END DO
END IF
We slightly modified Example 3.18, using MPI_ANY_SOURCE as the source argument in the two receive calls in statements labeled 100 and 200. The program is now incorrect: the receive operation may receive a message that is distinct from the message probed by the preceding call to MPI_PROBE.
Advice to implementors. A call to MPI_PROBE(source, tag, comm, status) will match the message that would have been received by a call to MPI_RECV(..., source, tag, comm, status) executed at the same point. Suppose that this message has source s, tag t and communicator c. If the tag argument in the probe call has value MPI_ANY_TAG then the message probed will be the earliest pending message from source s with communicator c and any tag; in any case, the message probed will be the earliest pending message from source s with tag t and communicator c (this is the message that would have been received, so as to preserve message order). This message continues as the earliest pending message from source s with tag t and communicator c, until it is received. A receive operation subsequent to the probe that uses the same communicator as the probe and uses the tag and source values returned by the probe, must receive this message, unless it has already been received by another receive operation. (End of advice to implementors.)
MPI_CANCEL(request)

  IN    request    communication request (handle)

int MPI_Cancel(MPI_Request *request)

MPI_CANCEL(REQUEST, IERROR)
    INTEGER REQUEST, IERROR
{void MPI::Request::Cancel() const (binding deprecated, see Section 15.2)}
A call to MPI_CANCEL marks for cancellation a pending, nonblocking communication operation (send or receive). The cancel call is local. It returns immediately, possibly before the communication is actually canceled. It is still necessary to complete a communication that has been marked for cancellation, using a call to MPI_REQUEST_FREE, MPI_WAIT or MPI_TEST (or any of the derived operations).

If a communication is marked for cancellation, then a MPI_WAIT call for that communication is guaranteed to return, irrespective of the activities of other processes (i.e., MPI_WAIT behaves as a local function); similarly if MPI_TEST is repeatedly called in a busy wait loop for a canceled communication, then MPI_TEST will eventually be successful.

MPI_CANCEL can be used to cancel a communication that uses a persistent request (see Section 3.9), in the same way it is used for nonpersistent requests. A successful cancellation cancels the active communication, but not the request itself. After the call to MPI_CANCEL and the subsequent call to MPI_WAIT or MPI_TEST, the request becomes inactive and can be activated for a new communication.

The successful cancellation of a buffered send frees the buffer space occupied by the pending message.
Either the cancellation succeeds, or the communication succeeds, but not both. If a send is marked for cancellation, then it must be the case that either the send completes normally, in which case the message sent was received at the destination process, or that the send is successfully canceled, in which case no part of the message was received at the destination. Then, any matching receive has to be satisfied by another send. If a receive is marked for cancellation, then it must be the case that either the receive completes normally, or that the receive is successfully canceled, in which case no part of the receive buffer is altered. Then, any matching send has to be satisfied by another receive.

If the operation has been canceled, then information to that effect will be returned in the status argument of the operation that completes the communication.
Rationale. Although the IN request handle parameter should not need to be passed by reference, the C binding has listed the argument type as MPI_Request* since MPI-1.0. This function signature therefore cannot be changed without breaking existing MPI applications. (End of rationale.)
MPI_TEST_CANCELLED(status, flag)

  IN    status    status object (Status)
  OUT   flag      (logical)

int MPI_Test_cancelled(MPI_Status *status, int *flag)

MPI_TEST_CANCELLED(STATUS, FLAG, IERROR)
    LOGICAL FLAG
    INTEGER STATUS(MPI_STATUS_SIZE), IERROR
{bool MPI::Status::Is_cancelled() const (binding deprecated, see Section 15.2)}

Returns flag = true if the communication associated with the status object was canceled successfully. In such a case, all other fields of status (such as count or tag) are undefined.