CHAPTER 13. I/O

typemaps equal to the typemaps of the current etype and filetype, respectively.

The data representation is returned in datarep. The user is responsible for ensuring that datarep is large enough to hold the returned data representation string. The length of a data representation string is limited to the value of MPI_MAX_DATAREP_STRING.

In addition, if a portable datatype was used to set the current view, then the corresponding datatype returned by MPI_FILE_GET_VIEW is also a portable datatype. If etype or filetype are derived datatypes, the user is responsible for freeing them. The etype and filetype returned are both in a committed state.

13.4 Data Access

13.4.1 Data Access Routines

Data is moved between files and processes by issuing read and write calls. There are three orthogonal aspects to data access: positioning (explicit offset vs. implicit file pointer), synchronism (blocking vs. nonblocking and split collective), and coordination (noncollective vs. collective). The combinations of these aspects, including the two types of file pointers (individual and shared), and the corresponding data access routines are shown in Table 13.1.

positioning     synchronism         coordination
                                    noncollective              collective
------------------------------------------------------------------------------------------
explicit        blocking            MPI_FILE_READ_AT           MPI_FILE_READ_AT_ALL
offsets                             MPI_FILE_WRITE_AT          MPI_FILE_WRITE_AT_ALL
                nonblocking &       MPI_FILE_IREAD_AT          MPI_FILE_READ_AT_ALL_BEGIN
                split collective                               MPI_FILE_READ_AT_ALL_END
                                    MPI_FILE_IWRITE_AT         MPI_FILE_WRITE_AT_ALL_BEGIN
                                                               MPI_FILE_WRITE_AT_ALL_END
individual      blocking            MPI_FILE_READ              MPI_FILE_READ_ALL
file pointers                       MPI_FILE_WRITE             MPI_FILE_WRITE_ALL
                nonblocking &       MPI_FILE_IREAD             MPI_FILE_READ_ALL_BEGIN
                split collective                               MPI_FILE_READ_ALL_END
                                    MPI_FILE_IWRITE            MPI_FILE_WRITE_ALL_BEGIN
                                                               MPI_FILE_WRITE_ALL_END
shared          blocking            MPI_FILE_READ_SHARED       MPI_FILE_READ_ORDERED
file pointer                        MPI_FILE_WRITE_SHARED      MPI_FILE_WRITE_ORDERED
                nonblocking &       MPI_FILE_IREAD_SHARED      MPI_FILE_READ_ORDERED_BEGIN
                split collective                               MPI_FILE_READ_ORDERED_END
                                    MPI_FILE_IWRITE_SHARED     MPI_FILE_WRITE_ORDERED_BEGIN
                                                               MPI_FILE_WRITE_ORDERED_END

Table 13.1: Data access routines

POSIX read()/fread() and write()/fwrite() are blocking, noncollective operations and use individual file pointers. The MPI equivalents are MPI_FILE_READ and MPI_FILE_WRITE.
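As an illustration (a minimal C sketch, not part of the standard text; both files are assumed to be already open):

#include <stdio.h>
#include <mpi.h>

/* Each call reads the next 100 doubles from its own file: blocking,
   noncollective, and advancing an individual file pointer. */
void read_next_block(FILE *fp, MPI_File fh)
{
    double pbuf[100], mbuf[100];
    MPI_Status status;

    size_t n = fread(pbuf, sizeof(double), 100, fp);   /* POSIX */
    MPI_File_read(fh, mbuf, 100, MPI_DOUBLE, &status); /* MPI   */
    (void)n;
}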

Implementations of data access routines may buffer data to improve performance. This does not affect reads, as the data is always available in the user's buffer after a read operation completes. For writes, however, the MPI_FILE_SYNC routine provides the only guarantee that data has been transferred to the storage device.
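A minimal sketch of this guarantee (assuming fh is a writable file handle opened collectively; not part of the standard text):

#include <mpi.h>

/* Write, then force the data to the storage device. */
void write_and_sync(MPI_File fh, const int *data, int n)
{
    MPI_Status status;

    MPI_File_write(fh, (void *)data, n, MPI_INT, &status);
    /* The write may still be buffered; MPI_File_sync (collective over
       the group that opened fh) guarantees transfer to storage. */
    MPI_File_sync(fh);
}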


Positioning

MPI provides three types of positioning for data access routines: explicit offsets, individual file pointers, and shared file pointers. The different positioning methods may be mixed within the same program and do not affect each other.

The data access routines that accept explicit offsets contain _AT in their name (e.g., MPI_FILE_WRITE_AT). Explicit offset operations perform data access at the file position given directly as an argument; no file pointer is used or updated. Note that this is not equivalent to an atomic seek-and-read or seek-and-write operation, as no "seek" is issued. Operations with explicit offsets are described in Section 13.4.2, page 407.

The names of the individual file pointer routines contain no positional qualifier (e.g., MPI_FILE_WRITE). Operations with individual file pointers are described in Section 13.4.3, page 410. The data access routines that use shared file pointers contain _SHARED or _ORDERED in their name (e.g., MPI_FILE_WRITE_SHARED). Operations with shared file pointers are described in Section 13.4.4, page 416.

The main semantic issues with MPI-maintained file pointers are how and when they are updated by I/O operations. In general, each I/O operation leaves the file pointer pointing to the next data item after the last one that is accessed by the operation. In a nonblocking or split collective operation, the pointer is updated by the call that initiates the I/O, possibly before the access completes.

More formally,

    new_file_offset = old_file_offset + (elements(datatype) / elements(etype)) * count

where count is the number of datatype items to be accessed, elements(X) is the number of predefined datatypes in the typemap of X, and old_file_offset is the value of the implicit offset before the call. The file position, new_file_offset, is in terms of a count of etypes relative to the current view.
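For example (an illustrative calculation, not from the standard text): if the etype is MPI_INT, the datatype is a contiguous datatype made of 6 MPI_INTs, and count = 4, then elements(datatype) = 6 and elements(etype) = 1, so the implicit offset advances by (6 / 1) * 4 = 24 etypes.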

Synchronism

MPI supports blocking and nonblocking I/O routines.

A blocking I/O call will not return until the I/O request is completed.

A nonblocking I/O call initiates an I/O operation, but does not wait for it to complete. Given suitable hardware, this allows the transfer of data out of or into the user's buffer to proceed concurrently with computation. A separate request completion call (MPI_WAIT, MPI_TEST, or any of their variants) is needed to complete the I/O request, i.e., to confirm that the data has been read or written and that it is safe for the user to reuse the buffer. The nonblocking versions of the routines are named MPI_FILE_IXXX, where the I stands for immediate.
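A minimal sketch of this pattern (assuming fh is an open file handle; the computation placeholder is illustrative, not part of the standard):

#include <mpi.h>

/* Start a nonblocking read, compute, then complete the request. */
void overlap_read(MPI_File fh, int *buf, int count)
{
    MPI_Request request;
    MPI_Status status;

    MPI_File_iread(fh, buf, count, MPI_INT, &request);

    /* ... computation that does not touch buf ... */

    MPI_Wait(&request, &status); /* buf is now safe to use */
}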

It is erroneous to access the local buffer of a nonblocking data access operation, or to use that buffer as the source or target of other communications, between the initiation and completion of the operation.

The split collective routines support a restricted form of "nonblocking" operations for collective data access (see Section 13.4.5, page 421).

Coordination

Every noncollective data access routine MPI_FILE_XXX has a collective counterpart. For most routines, this counterpart is MPI_FILE_XXX_ALL or a pair of MPI_FILE_XXX_BEGIN and MPI_FILE_XXX_END. The counterparts to the MPI_FILE_XXX_SHARED routines are MPI_FILE_XXX_ORDERED.

The completion of a noncollective call only depends on the activity of the calling process. However, the completion of a collective call (which must be called by all members of the process group) may depend on the activity of the other processes participating in the collective call. See Section 13.6.4, page 441, for rules on semantics of collective calls.

Collective operations may perform much better than their noncollective counterparts, as global data accesses have significant potential for automatic optimization.
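A minimal sketch contrasting the two forms (assuming fh was opened collectively; not part of the standard text):

#include <mpi.h>

/* The same access, noncollective and collective. */
void read_block(MPI_File fh, int *buf, int count)
{
    MPI_Status status;

    /* Noncollective: completion depends only on this process. */
    MPI_File_read(fh, buf, count, MPI_INT, &status);

    /* Collective counterpart: must be called by every process in the
       group that opened fh, enabling global optimization. */
    MPI_File_read_all(fh, buf, count, MPI_INT, &status);
}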

Data Access Conventions

Data is moved between files and processes by calling read and write routines. Read routines move data from a file into memory. Write routines move data from memory into a file. The file is designated by a file handle, fh. The location of the file data is specified by an offset into the current view. The data in memory is specified by a triple: buf, count, and datatype. Upon completion, the amount of data accessed by the calling process is returned in a status.

An offset designates the starting position in the file for an access. The offset is always in etype units relative to the current view. Explicit offset routines pass offset as an argument (negative values are erroneous). The file pointer routines use implicit offsets maintained by MPI.

A data access routine attempts to transfer (read or write) count data items of type datatype between the user's buffer buf and the file. The datatype passed to the routine must be a committed datatype. The layout of data in memory corresponding to buf, count, datatype is interpreted the same way as in MPI communication functions; see Section 3.2.2 on page 27 and Section 4.1.11 on page 101. The data is accessed from those parts of the file specified by the current view (Section 13.3, page 401). The type signature of datatype must match the type signature of some number of contiguous copies of the etype of the current view. As in a receive, it is erroneous to specify a datatype for reading that contains overlapping regions (areas of memory which would be stored into more than once).

The nonblocking data access routines indicate that MPI can start a data access and associate a request handle, request, with the I/O operation. Nonblocking operations are completed via MPI_TEST, MPI_WAIT, or any of their variants.

Data access operations, when completed, return the amount of data accessed in status.


Advice to users. To prevent problems with the argument copying and register optimization done by Fortran compilers, please note the hints in subsections "Problems Due to Data Copying and Sequence Association" and "A Problem with Register Optimization" in Section 16.2.2, pages 482 and 485. (End of advice to users.)

For blocking routines, status is returned directly. For nonblocking routines and split collective routines, status is returned when the operation is completed. The number of datatype entries and predefined elements accessed by the calling process can be extracted from status by using MPI_GET_COUNT and MPI_GET_ELEMENTS, respectively. The interpretation of the MPI_ERROR field is the same as for other operations: normally undefined, but meaningful if an MPI routine returns MPI_ERR_IN_STATUS. The user can pass (in C and Fortran) MPI_STATUS_IGNORE in the status argument if the return value of this argument is not needed. In C++, the status argument is optional. The status can be passed to MPI_TEST_CANCELLED to determine if the operation was cancelled. All other fields of status are undefined.


When reading, a program can detect the end of file by noting that the amount of data read is less than the amount requested. Writing past the end of file increases the file size. The amount of data accessed will be the amount requested, unless an error is raised (or a read reaches the end of file).
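A minimal sketch of this end-of-file check (assuming fh is open for reading; the helper function is illustrative, not part of the standard):

#include <mpi.h>

/* Returns 1 if the read hit end of file (fewer items than requested). */
int read_hit_eof(MPI_File fh, int *buf, int count)
{
    MPI_Status status;
    int nread;

    MPI_File_read(fh, buf, count, MPI_INT, &status);
    MPI_Get_count(&status, MPI_INT, &nread);
    return nread < count;
}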

13.4.2 Data Access with Explicit Offsets

If MPI_MODE_SEQUENTIAL mode was specified when the file was opened, it is erroneous to call the routines in this section.

MPI_FILE_READ_AT(fh, offset, buf, count, datatype, status)

IN     fh        file handle (handle)
IN     offset    file offset (integer)
OUT    buf       initial address of buffer (choice)
IN     count     number of elements in buffer (integer)
IN     datatype  datatype of each buffer element (handle)
OUT    status    status object (Status)

int MPI_File_read_at(MPI_File fh, MPI_Offset offset, void *buf, int count,
              MPI_Datatype datatype, MPI_Status *status)

MPI_FILE_READ_AT(FH, OFFSET, BUF, COUNT, DATATYPE, STATUS, IERROR)
    <type> BUF(*)
    INTEGER FH, COUNT, DATATYPE, STATUS(MPI_STATUS_SIZE), IERROR
    INTEGER(KIND=MPI_OFFSET_KIND) OFFSET

{void MPI::File::Read_at(MPI::Offset offset, void* buf, int count,
              const MPI::Datatype& datatype, MPI::Status& status)
              (binding deprecated, see Section 15.2)}

{void MPI::File::Read_at(MPI::Offset offset, void* buf, int count,
              const MPI::Datatype& datatype) (binding deprecated, see
              Section 15.2)}

MPI_FILE_READ_AT reads a file beginning at the position specified by offset.
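A minimal usage sketch (the helper function and values are illustrative, not part of the standard; with the default view, offset is in bytes):

#include <mpi.h>

/* Read 50 doubles starting at an explicit position; no file pointer
   is consulted or updated. */
void read_at_position(MPI_File fh, MPI_Offset offset)
{
    double buf[50];
    MPI_Status status;

    MPI_File_read_at(fh, offset, buf, 50, MPI_DOUBLE, &status);
}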

MPI_FILE_READ_AT_ALL(fh, offset, buf, count, datatype, status)

IN     fh        file handle (handle)
IN     offset    file offset (integer)
OUT    buf       initial address of buffer (choice)
IN     count     number of elements in buffer (integer)
IN     datatype  datatype of each buffer element (handle)
OUT    status    status object (Status)

int MPI_File_read_at_all(MPI_File fh, MPI_Offset offset, void *buf,
              int count, MPI_Datatype datatype, MPI_Status *status)

MPI_FILE_READ_AT_ALL(FH, OFFSET, BUF, COUNT, DATATYPE, STATUS, IERROR)
    <type> BUF(*)
    INTEGER FH, COUNT, DATATYPE, STATUS(MPI_STATUS_SIZE), IERROR
    INTEGER(KIND=MPI_OFFSET_KIND) OFFSET

{void MPI::File::Read_at_all(MPI::Offset offset, void* buf, int count,
              const MPI::Datatype& datatype, MPI::Status& status)
              (binding deprecated, see Section 15.2)}

{void MPI::File::Read_at_all(MPI::Offset offset, void* buf, int count,
              const MPI::Datatype& datatype) (binding deprecated, see
              Section 15.2)}

MPI_FILE_READ_AT_ALL is a collective version of the blocking MPI_FILE_READ_AT interface.
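A minimal usage sketch (assuming fh was opened on MPI_COMM_WORLD with the default byte view; BLOCK and the helper function are illustrative, not part of the standard):

#include <mpi.h>

#define BLOCK 1000 /* illustrative per-process element count */

/* Every process in the group that opened fh reads its own block. */
void read_my_block(MPI_File fh, int *buf)
{
    int rank;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_File_read_at_all(fh, (MPI_Offset)rank * BLOCK * sizeof(int),
                         buf, BLOCK, MPI_INT, &status);
}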

MPI_FILE_WRITE_AT(fh, offset, buf, count, datatype, status)

INOUT  fh        file handle (handle)
IN     offset    file offset (integer)
IN     buf       initial address of buffer (choice)
IN     count     number of elements in buffer (integer)
IN     datatype  datatype of each buffer element (handle)
OUT    status    status object (Status)

int MPI_File_write_at(MPI_File fh, MPI_Offset offset, void *buf, int count,
              MPI_Datatype datatype, MPI_Status *status)

MPI_FILE_WRITE_AT(FH, OFFSET, BUF, COUNT, DATATYPE, STATUS, IERROR)
    <type> BUF(*)
    INTEGER FH, COUNT, DATATYPE, STATUS(MPI_STATUS_SIZE), IERROR
    INTEGER(KIND=MPI_OFFSET_KIND) OFFSET

{void MPI::File::Write_at(MPI::Offset offset, const void* buf, int count,
              const MPI::Datatype& datatype, MPI::Status& status)
              (binding deprecated, see Section 15.2)}

{void MPI::File::Write_at(MPI::Offset offset, const void* buf, int count,
              const MPI::Datatype& datatype) (binding deprecated, see
              Section 15.2)}

MPI_FILE_WRITE_AT writes a file beginning at the position specified by offset.
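A minimal usage sketch (the helper function is illustrative, not part of the standard; with the default view, offset is in bytes):

#include <mpi.h>

/* Write n ints at an explicit position; the individual file pointer
   is neither consulted nor updated. */
void write_at_position(MPI_File fh, MPI_Offset offset,
                       const int *buf, int n)
{
    MPI_Status status;

    MPI_File_write_at(fh, offset, (void *)buf, n, MPI_INT, &status);
}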

 

MPI_FILE_WRITE_AT_ALL(fh, offset, buf, count, datatype, status)

INOUT  fh        file handle (handle)
IN     offset    file offset (integer)
IN     buf       initial address of buffer (choice)
IN     count     number of elements in buffer (integer)
IN     datatype  datatype of each buffer element (handle)
OUT    status    status object (Status)

int MPI_File_write_at_all(MPI_File fh, MPI_Offset offset, void *buf,
              int count, MPI_Datatype datatype, MPI_Status *status)

MPI_FILE_WRITE_AT_ALL(FH, OFFSET, BUF, COUNT, DATATYPE, STATUS, IERROR)
    <type> BUF(*)
    INTEGER FH, COUNT, DATATYPE, STATUS(MPI_STATUS_SIZE), IERROR
    INTEGER(KIND=MPI_OFFSET_KIND) OFFSET

{void MPI::File::Write_at_all(MPI::Offset offset, const void* buf,
              int count, const MPI::Datatype& datatype, MPI::Status& status)
              (binding deprecated, see Section 15.2)}

{void MPI::File::Write_at_all(MPI::Offset offset, const void* buf,
              int count, const MPI::Datatype& datatype) (binding deprecated,
              see Section 15.2)}

MPI_FILE_WRITE_AT_ALL is a collective version of the blocking MPI_FILE_WRITE_AT interface.
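A minimal usage sketch (assuming fh was opened on MPI_COMM_WORLD with the default byte view; BLOCK and the helper function are illustrative, not part of the standard):

#include <mpi.h>

#define BLOCK 1000 /* illustrative per-process element count */

/* Collective counterpart of the explicit-offset write: every process
   in the group that opened fh writes its own disjoint block. */
void write_my_block(MPI_File fh, const int *buf)
{
    int rank;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_File_write_at_all(fh, (MPI_Offset)rank * BLOCK * sizeof(int),
                          (void *)buf, BLOCK, MPI_INT, &status);
}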

MPI_FILE_IREAD_AT(fh, offset, buf, count, datatype, request)

IN     fh        file handle (handle)
IN     offset    file offset (integer)
OUT    buf       initial address of buffer (choice)
IN     count     number of elements in buffer (integer)
IN     datatype  datatype of each buffer element (handle)
OUT    request   request object (handle)

int MPI_File_iread_at(MPI_File fh, MPI_Offset offset, void *buf, int count,
              MPI_Datatype datatype, MPI_Request *request)

MPI_FILE_IREAD_AT(FH, OFFSET, BUF, COUNT, DATATYPE, REQUEST, IERROR)
    <type> BUF(*)
    INTEGER FH, COUNT, DATATYPE, REQUEST, IERROR
    INTEGER(KIND=MPI_OFFSET_KIND) OFFSET

{MPI::Request MPI::File::Iread_at(MPI::Offset offset, void* buf, int count,
              const MPI::Datatype& datatype) (binding deprecated, see
              Section 15.2)}
