Abstract
The vast increase in the size of modern spatio-temporal data sets has prompted statisticians working in environmental applications to develop new, efficient methodologies that can still deliver inference for nontrivial models within an affordable time. Climate model outputs push the limits of inference for Gaussian processes, as their size can easily exceed 10 billion data points. Drawing on our experience from a body of previous work, we provide three principles for the statistical analysis of such large data sets that leverage recent methodological and computational advances. These principles emphasize the need to embed distributed and parallel computing in the inferential process.
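To make the idea of embedding parallel computing in the inferential process concrete, here is a minimal, hypothetical sketch, not the authors' methodology: it evaluates an independent-blocks Gaussian log-likelihood for a spatial data set by partitioning the observations and assigning each block to a worker process. The exponential covariance, the block partition, and all function names (`exp_cov`, `block_loglik`, `parallel_loglik`) are assumptions made purely for illustration.

```python
# A minimal sketch of parallel block-wise Gaussian likelihood evaluation.
# Not the authors' method; covariance model, block scheme, and names are illustrative.
import numpy as np
from multiprocessing import Pool
from scipy.spatial.distance import cdist


def exp_cov(locs, variance, range_):
    """Exponential covariance matrix for a set of 2-D locations (assumed model)."""
    d = cdist(locs, locs)
    return variance * np.exp(-d / range_)


def block_loglik(args):
    """Gaussian log-likelihood of one block, treated as independent of the others."""
    y, locs, variance, range_, nugget = args
    cov = exp_cov(locs, variance, range_) + nugget * np.eye(len(y))
    chol = np.linalg.cholesky(cov)
    alpha = np.linalg.solve(chol, y)            # alpha = L^{-1} y
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + alpha @ alpha)


def parallel_loglik(y, locs, variance, range_, nugget, n_blocks, n_workers=4):
    """Sum of per-block log-likelihoods, computed over a pool of worker processes."""
    idx = np.array_split(np.arange(len(y)), n_blocks)
    tasks = [(y[i], locs[i], variance, range_, nugget) for i in idx]
    with Pool(n_workers) as pool:
        return sum(pool.map(block_loglik, tasks))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4000
    locs = rng.uniform(size=(n, 2))             # spatial locations on the unit square
    y = rng.standard_normal(n)                  # placeholder observations
    ll = parallel_loglik(y, locs, variance=1.0, range_=0.2,
                         nugget=0.1, n_blocks=8)
    print(f"independent-blocks log-likelihood: {ll:.2f}")
```

The same pattern scales from a multi-core workstation to a distributed cluster by replacing the process pool with a distributed task scheduler; the statistical approximation (treating blocks as independent) is what makes the computation embarrassingly parallel.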
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 92-96 |
| Number of pages | 5 |
| Journal | Statistics and Probability Letters |
| Volume | 136 |
| DOIs | |
| State | Published - May 2018 |
Keywords
- Big Data
- Climate model
- Computational statistics
- Spatio-temporal model
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty