TY - JOUR
T1 - Block Fusion on Dynamically Adaptive Spacetree Grids for Shallow Water Waves
AU - Weinzierl, Tobias
AU - Bader, Michael
AU - Unterweger, Kristof
AU - Wittmann, Roland
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): UK-c0020
Acknowledgements: Tobias Weinzierl appreciates the support of the School of Engineering and Computing Sciences and in particular Tomasz Koziara at Durham University for providing
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2014/9/29
Y1 - 2014/9/29
N2 - © 2014 World Scientific Publishing Company. Spacetrees are a popular formalism to describe dynamically adaptive Cartesian grids. Even though they directly yield a mesh, it is often computationally reasonable to embed regular Cartesian blocks into their leaves. This promotes stencils working on homogeneous data chunks. Choosing a proper block size is delicate. While large block sizes foster loop parallelism and vectorisation, they restrict the adaptivity's granularity and hence increase the memory footprint and lower the numerical accuracy per byte. In the present paper, we therefore use a multiscale spacetree-block coupling admitting blocks on all spacetree nodes. We propose to find sets of blocks on the finest scale throughout the simulation and to replace them by fused big blocks. Such a replacement strategy can pick up hardware characteristics, i.e., which block size yields the highest throughput, while the dynamic adaptivity of the fine grid mesh is not constrained: applications can still work with fine-granular blocks. We study the fusion with a state-of-the-art shallow water solver on an Intel Sandy Bridge and a Xeon Phi processor, where we anticipate their reaction to selected block optimisation and vectorisation.
UR - http://hdl.handle.net/10754/597685
UR - https://www.worldscientific.com/doi/abs/10.1142/S0129626414410060
UR - http://www.scopus.com/inward/record.url?scp=84929587304&partnerID=8YFLogxK
U2 - 10.1142/S0129626414410060
DO - 10.1142/S0129626414410060
M3 - Article
SN - 0129-6264
VL - 24
SP - 1441006
JO - Parallel Processing Letters
JF - Parallel Processing Letters
IS - 03
ER -