TY - GEN
T1 - GPU-Accelerated Vecchia Approximations of Gaussian Processes for Geospatial Data using Batched Matrix Computations
AU - Pan, Qilong
AU - Abdulah, Sameh
AU - Genton, Marc G.
AU - Keyes, David E.
AU - Ltaief, Hatem
AU - Sun, Ying
N1 - Publisher Copyright: © 2024 Research Paper Proceedings of the ISC High Performance 2024. All rights reserved.
PY - 2024
Y1 - 2024
N2 - Gaussian processes (GPs) are commonly used for geospatial analysis, but they suffer from high computational complexity when dealing with massive data. For instance, evaluating the log-likelihood function required to estimate the statistical model parameters for geospatial data is computationally intensive because it involves computing the inverse of a covariance matrix of size n × n, where n represents the number of geographical locations in the simplest case. As a result, studies in the literature have shifted towards approximation methods that handle larger values of n effectively while maintaining high accuracy. These methods encompass a range of techniques, including low-rank and sparse approximations. Among these techniques, the Vecchia approximation is one of the most promising methods for speeding up the evaluation of the log-likelihood function. This study presents a parallel implementation of the Vecchia approximation technique, utilizing batched matrix computations on contemporary GPUs. The proposed implementation relies on batched linear algebra routines to efficiently evaluate the individual conditional distributions in the Vecchia algorithm. We rely on the KBLAS linear algebra library to perform batched linear algebra operations, reducing the time to solution compared to the state-of-the-art parallel implementation of the likelihood estimation operation in the ExaGeoStat software by up to 700X, 833X, and 1380X on 32GB GV100, 80GB A100, and 80GB H100 GPUs, respectively, with the largest matrix dimension that can fully fit into the GPU memory in the dense Maximum Likelihood Estimation (MLE) case. We also successfully manage larger problem sizes on a single NVIDIA GPU, accommodating up to 1 million locations with 80GB A100 and H100 GPUs while maintaining the necessary application accuracy. We further assess the accuracy of the implemented algorithm, identifying the optimal settings for the Vecchia approximation algorithm to preserve accuracy on two real geospatial datasets: soil moisture data in the Mississippi Basin area and wind speed data in the Middle East.
AB - Gaussian processes (GPs) are commonly used for geospatial analysis, but they suffer from high computational complexity when dealing with massive data. For instance, evaluating the log-likelihood function required to estimate the statistical model parameters for geospatial data is computationally intensive because it involves computing the inverse of a covariance matrix of size n × n, where n represents the number of geographical locations in the simplest case. As a result, studies in the literature have shifted towards approximation methods that handle larger values of n effectively while maintaining high accuracy. These methods encompass a range of techniques, including low-rank and sparse approximations. Among these techniques, the Vecchia approximation is one of the most promising methods for speeding up the evaluation of the log-likelihood function. This study presents a parallel implementation of the Vecchia approximation technique, utilizing batched matrix computations on contemporary GPUs. The proposed implementation relies on batched linear algebra routines to efficiently evaluate the individual conditional distributions in the Vecchia algorithm. We rely on the KBLAS linear algebra library to perform batched linear algebra operations, reducing the time to solution compared to the state-of-the-art parallel implementation of the likelihood estimation operation in the ExaGeoStat software by up to 700X, 833X, and 1380X on 32GB GV100, 80GB A100, and 80GB H100 GPUs, respectively, with the largest matrix dimension that can fully fit into the GPU memory in the dense Maximum Likelihood Estimation (MLE) case. We also successfully manage larger problem sizes on a single NVIDIA GPU, accommodating up to 1 million locations with 80GB A100 and H100 GPUs while maintaining the necessary application accuracy. We further assess the accuracy of the implemented algorithm, identifying the optimal settings for the Vecchia approximation algorithm to preserve accuracy on two real geospatial datasets: soil moisture data in the Mississippi Basin area and wind speed data in the Middle East.
KW - batched solvers
KW - Gaussian processes (GPs)
KW - GPU computing
KW - linear algebra
KW - Vecchia approximation
UR - http://www.scopus.com/inward/record.url?scp=85195156274&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85195156274
T3 - Research Paper Proceedings of the ISC High Performance 2024
BT - Research Paper Proceedings of the ISC High Performance 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 39th International Conference on High Performance Computing, ISC High Performance 2024
Y2 - 12 May 2024 through 16 May 2024
ER -