TY - JOUR

T1 - The dangers of sparse sampling for the quantification of margin and uncertainty

AU - Hemez, François M.

AU - Atamturktur, Sezer

N1 - Funding Information:
This work is performed under the auspices of the Verification and Validation (V&V) program for Advanced Scientific Computing (ASC) at Los Alamos National Laboratory. The first author is grateful to Mark Anderson, V&V program manager at LANL, for his continuing support. The authors also express their gratitude to Professor Derek Bingham, Simon Fraser University, Vancouver, Canada, for his kind willingness to share his insight with them. Los Alamos National Laboratory is operated by the Los Alamos National Security, LLC for the National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396.

PY - 2011/9

Y1 - 2011/9

N2 - Activities such as global sensitivity analysis, statistical effect screening, uncertainty propagation, or model calibration have become integral to the Verification and Validation (V&V) of numerical models and computer simulations. One of the goals of V&V is to assess prediction accuracy and uncertainty, which feeds directly into reliability analysis or the Quantification of Margin and Uncertainty (QMU) of engineered systems. Because these analyses involve multiple runs of a computer code, they can rapidly become computationally expensive. An alternative to Monte Carlo-like sampling is to combine a design of computer experiments with meta-modeling, and replace the potentially expensive computer simulation with a fast-running emulator. The surrogate can then be used to estimate sensitivities, propagate uncertainty, and calibrate model parameters at a fraction of the cost it would take to wrap a sampling algorithm or optimization solver around the physics-based code. Doing so, however, carries the risk of developing an incorrect emulator that erroneously approximates the true-but-unknown sensitivities of the physics-based code. We demonstrate the extent to which this occurs when Gaussian Process Modeling (GPM) emulators are trained in high-dimensional spaces using too-sparsely populated designs-of-experiments. Our illustration analyzes a variant of the Rosenbrock function in which several effects are made statistically insignificant while others are strongly coupled, thereby mimicking a situation that is often encountered in practice. In this example, using a combination of GPM emulator and design-of-experiments leads to an incorrect approximation of the function. A mathematical proof of the origin of the problem is proposed. The adverse effects that too-sparsely populated designs may produce are discussed for the coverage of the design space, estimation of sensitivities, and calibration of parameters. This work attempts to raise awareness of the potential dangers of not allocating enough resources when exploring a design space to develop fast-running emulators.

UR - http://www.scopus.com/inward/record.url?scp=79959594169&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959594169&partnerID=8YFLogxK

U2 - 10.1016/j.ress.2011.02.015

DO - 10.1016/j.ress.2011.02.015

M3 - Article

AN - SCOPUS:79959594169

VL - 96

SP - 1220

EP - 1231

JO - Reliability Engineering and System Safety

JF - Reliability Engineering and System Safety

SN - 0951-8320

IS - 9

ER -