TY - CHAP
T1 - Efficient Pseudorecursive Evaluation Schemes for Non-adaptive Sparse Grids
AU - Buse, Gerrit
AU - Pflüger, Dirk
AU - Jacob, Riko
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): UK-C0020
Acknowledgements: This publication is based on work supported by Award No. UK-C0020, made by King Abdullah University of Science and Technology (KAUST). The second author would like to thank the German Research Foundation (DFG) for financial support of the project within the Cluster of Excellence in Simulation Technology (EXC 310/1) at the University of Stuttgart. Special thanks go to Matthias Fischer, who helped with the implementation of the different sparse grid bases.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2014/3/4
Y1 - 2014/3/4
N2 - In this work we propose novel algorithms for storing and evaluating sparse grid functions, operating on regular (not spatially adaptive), yet potentially dimensionally adaptive grid types. Besides regular sparse grids, our approach includes truncated grids, both with and without boundary grid points. Similar to the implicit data structures proposed in Feuersänger (Dünngitterverfahren für hochdimensionale elliptische partielle Differentialgleichungen. Diploma Thesis, Institut für Numerische Simulation, Universität Bonn, 2005) and Murarasu et al. (Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming. Cambridge University Press, New York, 2011, pp. 25–34), we also define a bijective mapping from the multi-dimensional space of grid points to a contiguous index, such that the grid data can be stored in a simple array without overhead. Our approach is especially well suited to exploit all levels of current commodity hardware, including cache levels and vector extensions. Furthermore, this kind of data structure is extremely attractive for today’s real-time applications, as it gives direct access to the hierarchical structure of the grids, while outperforming other common sparse grid structures (hash maps, etc.), which do not match modern compute platforms as well. For dimensionality d ≤ 10 we achieve good speedups on a 12-core Intel Westmere-EP NUMA platform compared to the results presented in Murarasu et al. (Proceedings of the International Conference on Computational Science, ICCS 2012. Procedia Computer Science, 2012). As we show, this also holds for the results obtained on Nvidia Fermi GPUs, for which we observe speedups over our own CPU implementation of up to 4.5 when dealing with moderate dimensionality. In high-dimensional settings, on the order of tens to hundreds of dimensions, our sparse grid evaluation kernels on the CPU outperform any other known implementation.
AB - In this work we propose novel algorithms for storing and evaluating sparse grid functions, operating on regular (not spatially adaptive), yet potentially dimensionally adaptive grid types. Besides regular sparse grids, our approach includes truncated grids, both with and without boundary grid points. Similar to the implicit data structures proposed in Feuersänger (Dünngitterverfahren für hochdimensionale elliptische partielle Differentialgleichungen. Diploma Thesis, Institut für Numerische Simulation, Universität Bonn, 2005) and Murarasu et al. (Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming. Cambridge University Press, New York, 2011, pp. 25–34), we also define a bijective mapping from the multi-dimensional space of grid points to a contiguous index, such that the grid data can be stored in a simple array without overhead. Our approach is especially well suited to exploit all levels of current commodity hardware, including cache levels and vector extensions. Furthermore, this kind of data structure is extremely attractive for today’s real-time applications, as it gives direct access to the hierarchical structure of the grids, while outperforming other common sparse grid structures (hash maps, etc.), which do not match modern compute platforms as well. For dimensionality d ≤ 10 we achieve good speedups on a 12-core Intel Westmere-EP NUMA platform compared to the results presented in Murarasu et al. (Proceedings of the International Conference on Computational Science, ICCS 2012. Procedia Computer Science, 2012). As we show, this also holds for the results obtained on Nvidia Fermi GPUs, for which we observe speedups over our own CPU implementation of up to 4.5 when dealing with moderate dimensionality. In high-dimensional settings, on the order of tens to hundreds of dimensions, our sparse grid evaluation kernels on the CPU outperform any other known implementation.
UR - http://hdl.handle.net/10754/598115
UR - http://link.springer.com/10.1007/978-3-319-04537-5_1
U2 - 10.1007/978-3-319-04537-5_1
DO - 10.1007/978-3-319-04537-5_1
M3 - Chapter
SN - 9783319045368
SP - 1
EP - 27
BT - Lecture Notes in Computational Science and Engineering
PB - Springer Nature
ER -