TY - JOUR
T1 - High performance shallow water kernels for parallel overland flow simulations based on FullSWOF2D
AU - Wittmann, Roland
AU - Bungartz, Hans-Joachim
AU - Neumann, Philipp
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: We gratefully acknowledge financial support from the Kompetenznetzwerk für Wissenschaftliches Höchstleistungsrechnen in Bayern (KONWIHR) through the Multicore-Software-Initiative project “Optimization of a multi-functional shallow water solver for complex overland flows” (KONWIHR-IV), and we thank the KAUST Supercomputing Laboratory for providing access to the supercomputer Shaheen 2 (project k1050). We thank Ralf-Peter Mundani and Florian Mintgen from the Department of Civil, Geo and Environmental Engineering of the Technical University of Munich for access to the Glasgow scenario datasets as well as for the fruitful discussions on this scenario.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2017/1/25
Y1 - 2017/1/25
N2 - We describe code optimization and parallelization procedures applied to the sequential overland flow solver FullSWOF2D. Major difficulties in simulating overland flows include handling high-resolution datasets of large-scale areas, which cannot be computed on a single node either due to the limited amount of memory or due to too many (time step) iterations resulting from the CFL condition. We address these issues in terms of two major contributions. First, we demonstrate a generic step-by-step transformation of the second-order finite volume scheme in FullSWOF2D towards MPI parallelization. Second, the computational kernels are optimized by the use of templates and a portable vectorization approach. We discuss the load imbalance of the flux computation due to dry and wet cells and propose a solution using an efficient cell counting approach. Finally, scalability results are shown for different test scenarios along with a flood simulation benchmark using the Shaheen II supercomputer.
AB - We describe code optimization and parallelization procedures applied to the sequential overland flow solver FullSWOF2D. Major difficulties in simulating overland flows include handling high-resolution datasets of large-scale areas, which cannot be computed on a single node either due to the limited amount of memory or due to too many (time step) iterations resulting from the CFL condition. We address these issues in terms of two major contributions. First, we demonstrate a generic step-by-step transformation of the second-order finite volume scheme in FullSWOF2D towards MPI parallelization. Second, the computational kernels are optimized by the use of templates and a portable vectorization approach. We discuss the load imbalance of the flux computation due to dry and wet cells and propose a solution using an efficient cell counting approach. Finally, scalability results are shown for different test scenarios along with a flood simulation benchmark using the Shaheen II supercomputer.
UR - http://hdl.handle.net/10754/624957
UR - https://linkinghub.elsevier.com/retrieve/pii/S0898122117300159
UR - http://www.scopus.com/inward/record.url?scp=85010203564&partnerID=8YFLogxK
U2 - 10.1016/j.camwa.2017.01.005
DO - 10.1016/j.camwa.2017.01.005
M3 - Article
SN - 0898-1221
VL - 74
SP - 110
EP - 125
JO - Computers & Mathematics with Applications
JF - Computers & Mathematics with Applications
IS - 1
ER -