TY - GEN
T1 - Parallel Approximations for High-Dimensional Multivariate Normal Probability Computation in Confidence Region Detection Applications
AU - Zhang, Xiran
AU - Abdulah, Sameh
AU - Cao, Jian
AU - Ltaief, Hatem
AU - Sun, Ying
AU - Genton, Marc G.
AU - Keyes, David E.
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Addressing the statistical challenge of computing the multivariate normal (MVN) probability in high dimensions holds significant potential for enhancing various applications. For example, the critical task of detecting confidence regions where a process probability surpasses a specific threshold is essential in diverse applications, such as pinpointing tumor locations in magnetic resonance imaging (MRI) scan images, determining hydraulic parameters in groundwater flow problems, and forecasting regional wind power to optimize wind turbine placement, among numerous others. One common way to compute high-dimensional MVN probabilities is the Separation-of-Variables (SOV) algorithm. This algorithm is known for its high computational complexity of O(n^3) and space complexity of O(n^2), mainly due to a Cholesky factorization operation for an n×n covariance matrix, where n represents the dimensionality of the MVN problem. This work proposes a high-performance computing framework that allows scaling the SOV algorithm and, subsequently, the confidence region detection algorithm. The framework leverages parallel linear algebra algorithms with a task-based programming model to achieve performance scalability in computing process probabilities, especially on large-scale systems. In addition, we enhance our implementation by incorporating Tile Low-Rank (TLR) approximation techniques to reduce algorithmic complexity without compromising the necessary accuracy. To evaluate the performance and accuracy of our framework, we conduct assessments using simulated data and a wind speed dataset. Our proposed implementation effectively handles high-dimensional MVN probability computations on shared and distributed-memory systems using finite-precision arithmetic and TLR approximation. Performance results show a significant speedup of up to 20X in solving the MVN problem using TLR approximation compared to the reference dense solution without sacrificing the application's accuracy. The qualitative results on synthetic and real datasets demonstrate how we maintain high accuracy in detecting confidence regions even when relying on TLR approximation to perform the underlying linear algebra operations.
AB - Addressing the statistical challenge of computing the multivariate normal (MVN) probability in high dimensions holds significant potential for enhancing various applications. For example, the critical task of detecting confidence regions where a process probability surpasses a specific threshold is essential in diverse applications, such as pinpointing tumor locations in magnetic resonance imaging (MRI) scan images, determining hydraulic parameters in groundwater flow problems, and forecasting regional wind power to optimize wind turbine placement, among numerous others. One common way to compute high-dimensional MVN probabilities is the Separation-of-Variables (SOV) algorithm. This algorithm is known for its high computational complexity of O(n^3) and space complexity of O(n^2), mainly due to a Cholesky factorization operation for an n×n covariance matrix, where n represents the dimensionality of the MVN problem. This work proposes a high-performance computing framework that allows scaling the SOV algorithm and, subsequently, the confidence region detection algorithm. The framework leverages parallel linear algebra algorithms with a task-based programming model to achieve performance scalability in computing process probabilities, especially on large-scale systems. In addition, we enhance our implementation by incorporating Tile Low-Rank (TLR) approximation techniques to reduce algorithmic complexity without compromising the necessary accuracy. To evaluate the performance and accuracy of our framework, we conduct assessments using simulated data and a wind speed dataset. Our proposed implementation effectively handles high-dimensional MVN probability computations on shared and distributed-memory systems using finite-precision arithmetic and TLR approximation. Performance results show a significant speedup of up to 20X in solving the MVN problem using TLR approximation compared to the reference dense solution without sacrificing the application's accuracy. The qualitative results on synthetic and real datasets demonstrate how we maintain high accuracy in detecting confidence regions even when relying on TLR approximation to perform the underlying linear algebra operations.
KW - Cholesky factorization
KW - Confidence region detection
KW - Excursion Set
KW - Multivariate normal probability
KW - Separation-of-Variables algorithm
KW - Tile low-rank
UR - http://www.scopus.com/inward/record.url?scp=85198901580&partnerID=8YFLogxK
U2 - 10.1109/IPDPS57955.2024.00031
DO - 10.1109/IPDPS57955.2024.00031
M3 - Conference contribution
AN - SCOPUS:85198901580
T3 - Proceedings - 2024 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024
SP - 265
EP - 276
BT - Proceedings - 2024 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 38th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024
Y2 - 27 May 2024 through 31 May 2024
ER -