Abstract
The paper presents a multi-GPU implementation of the preconditioned conjugate gradient algorithm with an algebraic multigrid preconditioner (PCG-AMG) for an elliptic model problem on a 3D unstructured grid. An efficient parallel sparse matrix-vector multiplication scheme underlying the PCG-AMG algorithm is presented for the many-core GPU architecture. A performance comparison of the parallel solver shows that a single Nvidia Tesla C1060 GPU board delivers the performance of a sixteen-node InfiniBand cluster, and that a multi-GPU configuration with eight GPUs is about 100 times faster than a typical server CPU core. © 2010 Springer-Verlag.
Original language | English (US) |
---|---|
Title of host publication | High Performance Computing and Applications |
Publisher | Springer Nature |
Pages | 38-47 |
Number of pages | 10 |
ISBN (Print) | 9783642118418 |
State | Published - 2010 |
Externally published | Yes |