- •Contents
- •List of Figures
- •List of Tables
- •Acknowledgments
- •Introduction to MPI
- •Overview and Goals
- •Background of MPI-1.0
- •Background of MPI-1.1, MPI-1.2, and MPI-2.0
- •Background of MPI-1.3 and MPI-2.1
- •Background of MPI-2.2
- •Who Should Use This Standard?
- •What Platforms Are Targets For Implementation?
- •What Is Included In The Standard?
- •What Is Not Included In The Standard?
- •Organization of this Document
- •MPI Terms and Conventions
- •Document Notation
- •Naming Conventions
- •Semantic Terms
- •Data Types
- •Opaque Objects
- •Array Arguments
- •State
- •Named Constants
- •Choice
- •Addresses
- •Language Binding
- •Deprecated Names and Functions
- •Fortran Binding Issues
- •C Binding Issues
- •C++ Binding Issues
- •Functions and Macros
- •Processes
- •Error Handling
- •Implementation Issues
- •Independence of Basic Runtime Routines
- •Interaction with Signals
- •Examples
- •Point-to-Point Communication
- •Introduction
- •Blocking Send and Receive Operations
- •Blocking Send
- •Message Data
- •Message Envelope
- •Blocking Receive
- •Return Status
- •Passing MPI_STATUS_IGNORE for Status
- •Data Type Matching and Data Conversion
- •Type Matching Rules
- •Type MPI_CHARACTER
- •Data Conversion
- •Communication Modes
- •Semantics of Point-to-Point Communication
- •Buffer Allocation and Usage
- •Nonblocking Communication
- •Communication Request Objects
- •Communication Initiation
- •Communication Completion
- •Semantics of Nonblocking Communications
- •Multiple Completions
- •Non-destructive Test of status
- •Probe and Cancel
- •Persistent Communication Requests
- •Send-Receive
- •Null Processes
- •Datatypes
- •Derived Datatypes
- •Type Constructors with Explicit Addresses
- •Datatype Constructors
- •Subarray Datatype Constructor
- •Distributed Array Datatype Constructor
- •Address and Size Functions
- •Lower-Bound and Upper-Bound Markers
- •Extent and Bounds of Datatypes
- •True Extent of Datatypes
- •Commit and Free
- •Duplicating a Datatype
- •Use of General Datatypes in Communication
- •Correct Use of Addresses
- •Decoding a Datatype
- •Examples
- •Pack and Unpack
- •Canonical MPI_PACK and MPI_UNPACK
- •Collective Communication
- •Introduction and Overview
- •Communicator Argument
- •Applying Collective Operations to Intercommunicators
- •Barrier Synchronization
- •Broadcast
- •Example using MPI_BCAST
- •Gather
- •Examples using MPI_GATHER, MPI_GATHERV
- •Scatter
- •Examples using MPI_SCATTER, MPI_SCATTERV
- •Example using MPI_ALLGATHER
- •All-to-All Scatter/Gather
- •Global Reduction Operations
- •Reduce
- •Signed Characters and Reductions
- •MINLOC and MAXLOC
- •All-Reduce
- •Process-local reduction
- •Reduce-Scatter
- •MPI_REDUCE_SCATTER_BLOCK
- •MPI_REDUCE_SCATTER
- •Scan
- •Inclusive Scan
- •Exclusive Scan
- •Example using MPI_SCAN
- •Correctness
- •Introduction
- •Features Needed to Support Libraries
- •MPI's Support for Libraries
- •Basic Concepts
- •Groups
- •Contexts
- •Intra-Communicators
- •Group Management
- •Group Accessors
- •Group Constructors
- •Group Destructors
- •Communicator Management
- •Communicator Accessors
- •Communicator Constructors
- •Communicator Destructors
- •Motivating Examples
- •Current Practice #1
- •Current Practice #2
- •(Approximate) Current Practice #3
- •Example #4
- •Library Example #1
- •Library Example #2
- •Inter-Communication
- •Inter-communicator Accessors
- •Inter-communicator Operations
- •Inter-Communication Examples
- •Caching
- •Functionality
- •Communicators
- •Windows
- •Datatypes
- •Error Class for Invalid Keyval
- •Attributes Example
- •Naming Objects
- •Formalizing the Loosely Synchronous Model
- •Basic Statements
- •Models of Execution
- •Static communicator allocation
- •Dynamic communicator allocation
- •The General case
- •Process Topologies
- •Introduction
- •Virtual Topologies
- •Embedding in MPI
- •Overview of the Functions
- •Topology Constructors
- •Cartesian Constructor
- •Cartesian Convenience Function: MPI_DIMS_CREATE
- •General (Graph) Constructor
- •Distributed (Graph) Constructor
- •Topology Inquiry Functions
- •Cartesian Shift Coordinates
- •Partitioning of Cartesian structures
- •Low-Level Topology Functions
- •An Application Example
- •MPI Environmental Management
- •Implementation Information
- •Version Inquiries
- •Environmental Inquiries
- •Tag Values
- •Host Rank
- •IO Rank
- •Clock Synchronization
- •Memory Allocation
- •Error Handling
- •Error Handlers for Communicators
- •Error Handlers for Windows
- •Error Handlers for Files
- •Freeing Errorhandlers and Retrieving Error Strings
- •Error Codes and Classes
- •Error Classes, Error Codes, and Error Handlers
- •Timers and Synchronization
- •Startup
- •Allowing User Functions at Process Termination
- •Determining Whether MPI Has Finished
- •Portable MPI Process Startup
- •The Info Object
- •Process Creation and Management
- •Introduction
- •The Dynamic Process Model
- •Starting Processes
- •The Runtime Environment
- •Process Manager Interface
- •Processes in MPI
- •Starting Processes and Establishing Communication
- •Reserved Keys
- •Spawn Example
- •Manager-worker Example, Using MPI_COMM_SPAWN.
- •Establishing Communication
- •Names, Addresses, Ports, and All That
- •Server Routines
- •Client Routines
- •Name Publishing
- •Reserved Key Values
- •Client/Server Examples
- •Ocean/Atmosphere - Relies on Name Publishing
- •Simple Client-Server Example.
- •Other Functionality
- •Universe Size
- •Singleton MPI_INIT
- •MPI_APPNUM
- •Releasing Connections
- •Another Way to Establish MPI Communication
- •One-Sided Communications
- •Introduction
- •Initialization
- •Window Creation
- •Window Attributes
- •Communication Calls
- •Examples
- •Accumulate Functions
- •Synchronization Calls
- •Fence
- •General Active Target Synchronization
- •Lock
- •Assertions
- •Examples
- •Error Handling
- •Error Handlers
- •Error Classes
- •Semantics and Correctness
- •Atomicity
- •Progress
- •Registers and Compiler Optimizations
- •External Interfaces
- •Introduction
- •Generalized Requests
- •Examples
- •Associating Information with Status
- •MPI and Threads
- •General
- •Initialization
- •Introduction
- •File Manipulation
- •Opening a File
- •Closing a File
- •Deleting a File
- •Resizing a File
- •Preallocating Space for a File
- •Querying the Size of a File
- •Querying File Parameters
- •File Info
- •Reserved File Hints
- •File Views
- •Data Access
- •Data Access Routines
- •Positioning
- •Synchronism
- •Coordination
- •Data Access Conventions
- •Data Access with Individual File Pointers
- •Data Access with Shared File Pointers
- •Noncollective Operations
- •Collective Operations
- •Seek
- •Split Collective Data Access Routines
- •File Interoperability
- •Datatypes for File Interoperability
- •Extent Callback
- •Datarep Conversion Functions
- •Matching Data Representations
- •Consistency and Semantics
- •File Consistency
- •Random Access vs. Sequential Files
- •Progress
- •Collective File Operations
- •Type Matching
- •Logical vs. Physical File Layout
- •File Size
- •Examples
- •Asynchronous I/O
- •I/O Error Handling
- •I/O Error Classes
- •Examples
- •Subarray Filetype Constructor
- •Requirements
- •Discussion
- •Logic of the Design
- •Examples
- •MPI Library Implementation
- •Systems with Weak Symbols
- •Systems Without Weak Symbols
- •Complications
- •Multiple Counting
- •Linker Oddities
- •Multiple Levels of Interception
- •Deprecated Functions
- •Deprecated since MPI-2.0
- •Deprecated since MPI-2.2
- •Language Bindings
- •Overview
- •Design
- •C++ Classes for MPI
- •Class Member Functions for MPI
- •Semantics
- •C++ Datatypes
- •Communicators
- •Exceptions
- •Mixed-Language Operability
- •Problems With Fortran Bindings for MPI
- •Problems Due to Strong Typing
- •Problems Due to Data Copying and Sequence Association
- •Special Constants
- •Fortran 90 Derived Types
- •A Problem with Register Optimization
- •Basic Fortran Support
- •Extended Fortran Support
- •The mpi Module
- •No Type Mismatch Problems for Subroutines with Choice Arguments
- •Additional Support for Fortran Numeric Intrinsic Types
- •Language Interoperability
- •Introduction
- •Assumptions
- •Initialization
- •Transfer of Handles
- •Status
- •MPI Opaque Objects
- •Datatypes
- •Callback Functions
- •Error Handlers
- •Reduce Operations
- •Addresses
- •Attributes
- •Extra State
- •Constants
- •Interlanguage Communication
- •Language Bindings Summary
- •Groups, Contexts, Communicators, and Caching Fortran Bindings
- •External Interfaces C++ Bindings
- •Change-Log
- •Bibliography
- •Examples Index
- •MPI Declarations Index
- •MPI Function Index
13.6. CONSISTENCY AND SEMANTICS |
441 |
of le. (End of rationale.)
Finally, for some sequential les, such as those corresponding to magnetic tapes or streaming network connections, writes to the le may be destructive. In other words, a write may act as a truncate (a MPI_FILE_SET_SIZE with size set to the current position) followed by the write.
13.6.3 Progress
The progress rules of MPI are both a promise to users and a set of constraints on implementors. In cases where the progress rules restrict possible implementation choices more than the interface speci cation alone, the progress rules take precedence.
All blocking routines must complete in nite time unless an exceptional condition (such as resource exhaustion) causes an error.
Nonblocking data access routines inherit the following progress rule from nonblocking point to point communication: a nonblocking write is equivalent to a nonblocking send for which a receive is eventually posted, and a nonblocking read is equivalent to a nonblocking receive for which a send is eventually posted.
Finally, an implementation is free to delay progress of collective routines until all processes in the group associated with the collective call have invoked the routine. Once all processes in the group have invoked the routine, the progress rule of the equivalent noncollective routine must be followed.
13.6.4 Collective File Operations
Collective le operations are subject to the same restrictions as collective communication operations. For a complete discussion, please refer to the semantics set forth in Section 5.12 on page 183.
Collective le operations are collective over a dup of the communicator used to open the le|this duplicate communicator is implicitly speci ed via the le handle argument. Di erent processes can pass di erent values for other arguments of a collective routine unless speci ed otherwise.
13.6.5 Type Matching
The type matching rules for I/O mimic the type matching rules for communication with one exception: if etype is MPI_BYTE, then this matches any datatype in a data access operation. In general, the etype of data items written must match the etype used to read the items, and for each data access operation, the current etype must also match the type declaration of the data access bu er.
Advice to users. In most cases, use of MPI_BYTE as a wild card will defeat thele interoperability features of MPI. File interoperability can only perform automatic conversion between heterogeneous data representations when the exact datatypes accessed are explicitly speci ed. (End of advice to users.)
13.6.6 Miscellaneous Clari cations
Once an I/O routine completes, it is safe to free any opaque objects passed as arguments to that routine. For example, the comm and info used in an MPI_FILE_OPEN, or the etype
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
442 |
CHAPTER 13. I/O |
1and letype used in an MPI_FILE_SET_VIEW, can be freed without a ecting access to the
2le. Note that for nonblocking routines and split collective operations, the operation must
3be completed before it is safe to reuse data bu ers passed as arguments.
4As in communication, datatypes must be committed before they can be used in le
5manipulation or data access operations. For example, the etype and letype must be com-
6mitted before calling MPI_FILE_SET_VIEW, and the datatype must be committed before
7
8
9
10
calling MPI_FILE_READ or MPI_FILE_WRITE.
13.6.7 MPI_O set Type
11MPI_O set is an integer type of size su cient to represent the size (in bytes) of the largest
12le supported by MPI. Displacements and o sets are always speci ed as values of type
13MPI_O set.
14In Fortran, the corresponding integer is an integer of kind MPI_OFFSET_KIND, de ned
15in mpif.h and the mpi module.
16In Fortran 77 environments that do not support KIND parameters, MPI_O set argu-
17ments should be declared as an INTEGER of suitable size. The language interoperability
18implications for MPI_O set are similar to those for addresses (see Section 16.3, page 497).
19 |
|
|
|
20 |
13.6.8 |
Logical vs. Physical File Layout |
|
21 |
MPI speci es how the data should be laid out in a virtual le structure (the view), not |
||
22 |
|||
how that le structure is to be stored on one or more disks. Speci cation of the physical |
|||
23 |
|||
le structure was avoided because it is expected that the mapping of les to disks will be |
|||
24 |
|||
system speci c, and any speci c control over le layout would therefore restrict program |
|||
25 |
|||
portability. However, there are still cases where some information may be necessary to |
|||
26 |
|||
optimize le layout. This information can be provided as hints speci ed via info when a le |
|||
27 |
|||
is created (see Section 13.2.8, page 398). |
|||
28 |
|||
|
|
||
29 |
13.6.9 |
File Size |
|
30 |
|||
|
|
31The size of a le may be increased by writing to the le after the current end of le. The size
32may also be changed by calling MPI size changing routines, such as MPI_FILE_SET_SIZE. A
33call to a size changing routine does not necessarily change the le size. For example, calling
34MPI_FILE_PREALLOCATE with a size less than the current size does not change the size.
35Consider a set of bytes that has been written to a le since the most recent call to a
36size changing routine, or since MPI_FILE_OPEN if no such routine has been called. Let the
37high byte be the byte in that set with the largest displacement. The le size is the larger of
38
39
40
41
One plus the displacement of the high byte.
The size immediately after the size changing routine, or MPI_FILE_OPEN, returned.
42When applying consistency semantics, calls to MPI_FILE_SET_SIZE and
43MPI_FILE_PREALLOCATE are considered writes to the le (which con ict with operations
44that access bytes at displacements between the old and new le sizes), and
45MPI_FILE_GET_SIZE is considered a read of the le (which overlaps with all accesses to the
46le).
47
48