TY - GEN
T1 - Compact data structure and scalable algorithms for the sparse grid technique
AU - Murarasu, Alin
AU - Weidendorfer, Josef
AU - Buse, Gerrit
AU - Butnaru, Daniel
AU - Pflüger, Dirk
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): UKC0020
Acknowledgements: This publication is based on work supported by Award No. UKC0020, made by King Abdullah University of Science and Technology (KAUST).
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2011
Y1 - 2011
N2 - The sparse grid discretization technique enables a compressed representation of higher-dimensional functions. In its original form, it relies heavily on recursion and complex data structures, and is thus far from well-suited for GPUs. In this paper, we describe optimizations that enable us to implement compression and decompression, the crucial sparse grid algorithms for our application, on Nvidia GPUs. The main idea is a bijective mapping between the set of points in a multi-dimensional sparse grid and a set of consecutive natural numbers. The resulting data structure consumes a minimum amount of memory. For a 10-dimensional sparse grid with approximately 127 million points, it consumes up to 30 times less memory than the trees or hash tables that are typically used. Compared to a sequential CPU implementation, the speedups achieved on the GPU are up to 17 for compression and up to 70 for decompression. We show that the optimizations are also applicable to multicore CPUs. Copyright © 2011 ACM.
UR - http://hdl.handle.net/10754/597803
UR - http://portal.acm.org/citation.cfm?doid=1941553.1941559
UR - http://www.scopus.com/inward/record.url?scp=79952801486&partnerID=8YFLogxK
U2 - 10.1145/1941553.1941559
DO - 10.1145/1941553.1941559
M3 - Conference contribution
SN - 9781450301190
SP - 35
EP - 45
BT - Proceedings of the 16th ACM symposium on Principles and practice of parallel programming - PPoPP '11
PB - Association for Computing Machinery (ACM)
ER -