TY - JOUR
T1 - Evaluating Data Redistribution in PaRSEC
AU - Cao, Qinglei
AU - Bosilca, George
AU - Losada, Nuria
AU - Wu, Wei
AU - Zhong, Dong
AU - Dongarra, Jack
N1 - KAUST Repository Item: Exported on 2022-05-26
Acknowledgements: This work was supported in part by the Exascale Computing Project under Grant 17-SC-20-SC, a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. For computer time, this research used the Shaheen II supercomputer hosted at the Supercomputing Laboratory at KAUST.
The authors would like to thank Aurelien Bouteiller for multi-threaded support in PaRSEC, and Cray Inc. and Intel in the context of the Cray Center of Excellence and Intel Parallel Computing Center awarded to the Extreme Computing Research Center at KAUST.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2021/11/30
Y1 - 2021/11/30
N2 - Data redistribution aims to reshuffle data to optimize some objective for an algorithm. The objective can be multi-dimensional, such as improving computational load balance or decreasing communication volume or cost, with the ultimate goal of increasing the efficiency and therefore reducing the time-to-solution for the algorithm. The classic redistribution problem focuses on optimally scheduling communications when reshuffling data between two regular, usually block-cyclic, data distributions. Besides distribution, data size is also a performance-critical parameter because it affects the reshuffling algorithm in terms of cache, communication efficiency, and potential parallelism. In addition, task-based runtime systems have gained popularity recently as a potential candidate to address the programming complexity on the way to exascale. In this scenario, it becomes paramount to develop a flexible redistribution algorithm for task-based runtime systems, which could support all types of regular and irregular data distributions and take data size into account. In this article, we detail a flexible redistribution algorithm and implement an efficient approach in a task-based runtime system, PaRSEC. Performance results show great capability compared to the theoretical bound and ScaLAPACK, and applications highlight an increased efficiency with little overhead in terms of data distribution, data size, and data format.
UR - http://hdl.handle.net/10754/678245
UR - https://ieeexplore.ieee.org/document/9629320/
UR - http://www.scopus.com/inward/record.url?scp=85121846332&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2021.3131657
DO - 10.1109/TPDS.2021.3131657
M3 - Article
SN - 1558-2183
VL - 33
SP - 1856
EP - 1872
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 8
ER -