TY - GEN
T1 - On overlapping communication and file I/O in collective write operation
AU - Feki, Raafat
AU - Gabriel, Edgar
N1 - KAUST Repository Item: Exported on 2022-06-30
Acknowledgements: Partial support for this work was provided by the National Science Foundation under Award No. SI2-SSI 1663887. The authors would also like to thank the KAUST Supercomputing Laboratory for providing compute time on the Ibex cluster for this project.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2020/7/28
Y1 - 2020/7/28
N2 - Many parallel scientific applications spend a significant amount of time reading and writing data files. Collective I/O operations optimize the file access of a process group by redistributing data across processes to match the data layout on the file system. In most parallel I/O libraries, the implementation of collective I/O operations is based on the two-phase I/O algorithm, which consists of a communication phase and a file access phase. This paper evaluates various design options for overlapping two internal cycles of the two-phase I/O algorithm and explores different data transfer primitives for the shuffle phase, including non-blocking two-sided communication and multiple versions of one-sided communication. The results indicate that overlap algorithms incorporating asynchronous I/O outperform overlapping approaches that rely only on non-blocking communication. However, in the vast majority of the test cases, one-sided communication did not lead to performance improvements over two-sided communication.
AB - Many parallel scientific applications spend a significant amount of time reading and writing data files. Collective I/O operations optimize the file access of a process group by redistributing data across processes to match the data layout on the file system. In most parallel I/O libraries, the implementation of collective I/O operations is based on the two-phase I/O algorithm, which consists of a communication phase and a file access phase. This paper evaluates various design options for overlapping two internal cycles of the two-phase I/O algorithm and explores different data transfer primitives for the shuffle phase, including non-blocking two-sided communication and multiple versions of one-sided communication. The results indicate that overlap algorithms incorporating asynchronous I/O outperform overlapping approaches that rely only on non-blocking communication. However, in the vast majority of the test cases, one-sided communication did not lead to performance improvements over two-sided communication.
UR - http://hdl.handle.net/10754/679506
UR - https://ieeexplore.ieee.org/document/9150353/
UR - http://www.scopus.com/inward/record.url?scp=85091576845&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW50202.2020.00175
DO - 10.1109/IPDPSW50202.2020.00175
M3 - Conference contribution
SN - 9781728174457
SP - 1044
EP - 1051
BT - 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
PB - IEEE
ER -