TY - GEN
T1 - A Non-static Data Layout Enhancing Parallelism and Vectorization in Sparse Grid Algorithms
AU - Buse, Gerrit
AU - Pfluger, Dirk
AU - Murarasu, Alin
AU - Jacob, Riko
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): UK-C0020
Acknowledgements: This publication is based on work supported by Award No.UK-C0020, made by King Abdullah University of Science andTechnology (KAUST).
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2012/6
Y1 - 2012/6
N2 - The name sparse grids denotes a highly space-efficient, grid-based numerical technique to approximate high-dimensional functions. Although employed in a broad spectrum of applications from different fields, there have only been few tries to use it in real time visualization (e.g. [1]), due to complex data structures and long algorithm runtime. In this work we present a novel approach inspired by principles of I/0-efficient algorithms. Locally applied coefficient permutations lead to improved cache performance and facilitate the use of vector registers for our sparse grid benchmark problem hierarchization. Based on the compact data structure proposed for regular sparse grids in [2], we developed a new algorithm that outperforms existing implementations on modern multi-core systems by a factor of 37 for a grid size of 127 million points. For larger problems the speedup is even increasing, and with execution times below 1 s, sparse grids are well-suited for visualization applications. Furthermore, we point out how a broad class of sparse grid algorithms can benefit from our approach. © 2012 IEEE.
AB - The name sparse grids denotes a highly space-efficient, grid-based numerical technique to approximate high-dimensional functions. Although employed in a broad spectrum of applications from different fields, there have only been few tries to use it in real time visualization (e.g. [1]), due to complex data structures and long algorithm runtime. In this work we present a novel approach inspired by principles of I/0-efficient algorithms. Locally applied coefficient permutations lead to improved cache performance and facilitate the use of vector registers for our sparse grid benchmark problem hierarchization. Based on the compact data structure proposed for regular sparse grids in [2], we developed a new algorithm that outperforms existing implementations on modern multi-core systems by a factor of 37 for a grid size of 127 million points. For larger problems the speedup is even increasing, and with execution times below 1 s, sparse grids are well-suited for visualization applications. Furthermore, we point out how a broad class of sparse grid algorithms can benefit from our approach. © 2012 IEEE.
UR - http://hdl.handle.net/10754/597350
UR - http://ieeexplore.ieee.org/document/6341512/
UR - http://www.scopus.com/inward/record.url?scp=84870734220&partnerID=8YFLogxK
U2 - 10.1109/ISPDC.2012.34
DO - 10.1109/ISPDC.2012.34
M3 - Conference contribution
SN - 9781467325998
SP - 195
EP - 202
BT - 2012 11th International Symposium on Parallel and Distributed Computing
PB - Institute of Electrical and Electronics Engineers (IEEE)
ER -