TY - JOUR
T1 - A parallel auxiliary grid algebraic multigrid method for graphic processing units
AU - Wang, Lu
AU - Hu, Xiaozhe
AU - Cohen, Jonathan
AU - Xu, Jinchao
N1 - Generated from Scopus record by KAUST IRTS on 2023-02-15
PY - 2013/1/1
Y1 - 2013/1/1
N2 - In this paper, we develop a new parallel auxiliary grid algebraic multigrid (AMG) method to leverage the power of graphic processing units (GPUs). In the construction of the hierarchical coarse grid, we use a simple and fixed coarsening procedure based on a region quadtree generated from an auxiliary grid. This allows us to explicitly control the sparsity patterns and operator complexities of the AMG solver. This feature provides (nearly) optimal load balancing and predictable communication patterns on shape regular grids, which makes our new algorithm suitable for parallel computing, especially on GPUs. We also design a parallel smoother based on the special coloring of the quadtree to accelerate the convergence rate and improve the parallel performance of this solver. Based on the CUDA toolkit [NVIDIA CUDA Programming Guide, NVIDIA Corp., 2010], we implemented our new parallel auxiliary grid AMG method on GPUs and the numerical results of this implementation demonstrate the efficiency of our new method for (nearly) isotropic problems. The results achieve an average speedup of over 4 on quasi-uniform grids and 2 on shape regular grids when compared to the AMG implementation in CUSP [M. Garland and N. Bell, CUSP: Generic Parallel Algorithms for Sparse Matrix and Graph Computations, http://cusplibrary.github. com/ (2010)]. © 2013 Society for Industrial and Applied Mathematics.
AB - In this paper, we develop a new parallel auxiliary grid algebraic multigrid (AMG) method to leverage the power of graphic processing units (GPUs). In the construction of the hierarchical coarse grid, we use a simple and fixed coarsening procedure based on a region quadtree generated from an auxiliary grid. This allows us to explicitly control the sparsity patterns and operator complexities of the AMG solver. This feature provides (nearly) optimal load balancing and predictable communication patterns on shape regular grids, which makes our new algorithm suitable for parallel computing, especially on GPUs. We also design a parallel smoother based on the special coloring of the quadtree to accelerate the convergence rate and improve the parallel performance of this solver. Based on the CUDA toolkit [NVIDIA CUDA Programming Guide, NVIDIA Corp., 2010], we implemented our new parallel auxiliary grid AMG method on GPUs and the numerical results of this implementation demonstrate the efficiency of our new method for (nearly) isotropic problems. The results achieve an average speedup of over 4 on quasi-uniform grids and 2 on shape regular grids when compared to the AMG implementation in CUSP [M. Garland and N. Bell, CUSP: Generic Parallel Algorithms for Sparse Matrix and Graph Computations, http://cusplibrary.github. com/ (2010)]. © 2013 Society for Industrial and Applied Mathematics.
UR - http://epubs.siam.org/doi/10.1137/120894452
UR - http://www.scopus.com/inward/record.url?scp=84883696475&partnerID=8YFLogxK
U2 - 10.1137/120894452
DO - 10.1137/120894452
M3 - Article
SN - 1095-7200
VL - 35
JO - SIAM Journal on Scientific Computing
JF - SIAM Journal on Scientific Computing
IS - 3
ER -