TY - GEN
T1 - Implications of Reduced Communication Precision in a Collocated Discontinuous Galerkin Finite Element Framework
AU - Rogowski, Marcin
AU - Dalcin, Lisandro
AU - Parsani, Matteo
AU - Keyes, David E.
N1 - KAUST Repository Item: Exported on 2022-10-07
Acknowledgements: The research reported in this paper was funded by King Abdullah University of Science and Technology. We are thankful to the Supercomputing Laboratory and the Extreme Computing Research Center at King Abdullah University of Science and Technology for their computing resources.
PY - 2021/9/20
Y1 - 2021/9/20
AB - Compute capability of high-performance hardware has been growing at immense rates, increasing over 130x in the last decade. Communication bandwidth, however, grew only by a factor of 6 in the same time, leading to a significant decrease in the byte-to-flop metric. This trend leads us to the situation where, in many cases, computation is virtually free, and the dominant cost of a parallel application comes from its communication cost. We expect this trend to continue and, hence, the parallel application wall-clock time to be increasingly correlated with the amount of data transferred between the nodes involved. In order to alleviate this communication bottleneck, we test several communication-reducing schemes based on the idea of using higher precision for the inner cells and lower precision for communication. For every approach, we report the resulting network traffic and weigh it against the decreased accuracy. We perform our experiments in a collocated Discontinuous Galerkin finite element method (DG-FEM) framework applied in Computational Fluid Dynamics (CFD). First, we present a parametric study using the method of manufactured solutions on a 3D compressible Navier-Stokes supersonic cube. Using this method allows us to quantify the communication-reducing schemes' impact on the error in test cases representing a range of solution polynomial degrees and problem sizes. Finally, we verify the findings on a full-scale CFD problem, flow around the delta wing, and report on the methods' consistency as the number of processes and the number of halo elements change.
UR - http://hdl.handle.net/10754/675315
UR - https://ieeexplore.ieee.org/document/9622841/
UR - http://www.scopus.com/inward/record.url?scp=85123484663&partnerID=8YFLogxK
U2 - 10.1109/HPEC49654.2021.9622841
DO - 10.1109/HPEC49654.2021.9622841
M3 - Conference contribution
SN - 9781665423694
BT - 2021 IEEE High Performance Extreme Computing Conference (HPEC)
PB - IEEE
ER -