Data partition methodology for validation of predictive models

Rebecca E. Morrison*, Corey M. Bryant, Gabriel Terejanu, Serge Prudhomme, Kenji Miki

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

19 Scopus citations

Abstract

In many cases, model validation requires that legacy data be partitioned into calibration and validation sets, but how to do so is a nontrivial and open area of research. We present a systematic procedure to partition the data, adapted from cross-validation and in the context of predictive modeling. By considering all possible partitions, we proceed with post-processing steps to find the optimal partition of the data subject to given constraints. We are concerned here with mathematical models of physical systems whose predictions of a given unobservable quantity of interest are the basis for critical decisions. Thus, the proposed approach addresses two critical issues: (1) that the model be evaluated with respect to its ability to reproduce the data and (2) that the model be highly challenged by the validation set with respect to predictions of the quantity of interest. This framework also relies on the interaction between the experimentalist and/or modeler, who understand the physical system and the limitations of the model; the decision-maker, who understands and can quantify the cost of model failure; and the computational scientists, who strive to determine if the model satisfies both the modeler's and decision-maker's requirements. The framework is general and may be applied to a wide range of problems. It is illustrated here through an example using generated experiments of a nonlinear one degree-of-freedom oscillator.
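The enumerate-then-select idea in the abstract can be sketched in a few lines. This is only an illustrative reading, not the authors' implementation. The linear toy model, the function names (`fit_slope`, `enumerate_partitions`, `score_partition`, `best_partition`), the misfit tolerance, and the choice of "shift in the predicted quantity of interest" as the measure of how strongly the validation set challenges the model are all assumptions introduced here.

```python
from itertools import combinations


def fit_slope(points):
    """Toy calibration (assumed): least-squares slope of y = a*x through the origin."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return sxy / sxx


def enumerate_partitions(data, n_cal):
    """Yield every (calibration, validation) split with n_cal calibration points."""
    indices = range(len(data))
    for cal_idx in combinations(indices, n_cal):
        chosen = set(cal_idx)
        cal = [data[i] for i in cal_idx]
        val = [data[i] for i in indices if i not in chosen]
        yield cal, val


def score_partition(cal, val, x_qoi):
    """Return (data misfit on the validation set, shift in the predicted QoI).

    Both metrics are hypothetical stand-ins for the two issues named in the
    abstract: reproducing the data, and being challenged on the QoI.
    """
    a_cal = fit_slope(cal)          # calibrated on the calibration set only
    a_all = fit_slope(cal + val)    # recalibrated once validation data is added
    data_misfit = max(abs(y - a_cal * x) for x, y in val)
    qoi_shift = abs(a_all * x_qoi - a_cal * x_qoi)
    return data_misfit, qoi_shift


def best_partition(data, n_cal, x_qoi, misfit_tol):
    """Among partitions whose misfit is within tolerance, pick the largest QoI shift."""
    best = None
    for cal, val in enumerate_partitions(data, n_cal):
        misfit, shift = score_partition(cal, val, x_qoi)
        if misfit > misfit_tol:
            continue  # constraint violated: the model fails to reproduce the data
        if best is None or shift > best[0]:
            best = (shift, cal, val)
    return best
```

Calling `best_partition(data, n_cal=2, x_qoi=10.0, misfit_tol=1.0)` on a list of `(x, y)` pairs returns the admissible split with the largest QoI shift, or `None` if no split meets the tolerance. Exhaustive enumeration grows combinatorially with the number of data points, so a sketch like this is only practical for small data sets.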

Original language: English (US)
Pages (from-to): 2114-2125
Number of pages: 12
Journal: Computers and Mathematics with Applications
Volume: 66
Issue number: 10
State: Published - Dec 2013

Keywords

  • Calibration
  • Cross-validation
  • Inverse problems
  • Parameter estimation
  • Quantity of interest

ASJC Scopus subject areas

  • Modeling and Simulation
  • Computational Theory and Mathematics
  • Computational Mathematics
