mbd / psimpy · Commit 94386e4d
Authored Dec 6, 2022 by Hu Zhao
Parent: fcbe4371

docs: update documentation for emulator

Showing 2 changed files with 169 additions and 11 deletions:

* docs/source/emulator/index.rst (+14, −4)
* docs/source/emulator/robustgasp.rst (+155, −7)
docs/source/emulator/index.rst (+14, −4)
********
Emulator
********

This module hosts functionality for emulation methods.

Emulation, also known as surrogate modeling or meta-modeling, is a type of
method which is used to build cheap-to-evaluate emulators (surrogate models) to
approximate expensive-to-evaluate simulators. It can greatly reduce the
computational cost of tasks in which a computationally intensive simulator
needs to be run multiple times, such as global sensitivity analysis,
uncertainty quantification, or parameter calibration.

Currently implemented classes in this module are:

* :class:`.ScalarGaSP`: Gaussian process emulation for simulators which return a
  scalar value as output.
* :class:`.PPGaSP`: Gaussian process emulation for simulators which return
  multiple values as output.

.. toctree::
   :maxdepth: 1

   ...
docs/source/emulator/robustgasp.rst (+155, −7)
RobustGaSP
==========

`RobustGaSP`, standing for `Robust Gaussian Stochastic Process Emulation`, is an
`R` package developed by :cite:t:`Gu2019`. In this module, we have implemented
the classes :class:`.ScalarGaSP` and :class:`.PPGaSP` based on `RobustGaSP` and
`rpy2`, which allows us to use `RobustGaSP` directly from within `Python`.
Theory of Gaussian process emulation
------------------------------------
A simulator represents a mapping from an input space to an output space.
Gaussian process emulation treats a computationally expensive simulator as an
unknown function from a Bayesian perspective: the prior belief of the simulator,
namely a Gaussian process, is updated based on a modest number of simulation runs,
leading to a posterior which can be evaluated much faster than the simulator and
can then be used for computationally demanding analyses :cite:p:`Zhao2021a`. A recap
of Gaussian process emulation is provided below, which is adapted from `Chapter 3`
of :cite:t:`Zhao2021b`.
Gaussian process prior
^^^^^^^^^^^^^^^^^^^^^^
Let :math:`f(\cdot)` denote a simulator that one wants to approximate. It
defines a mapping from a :math:`p`-dimensional input
:math:`\mathbf{x}=(x_1,\ldots,x_p)^T \in \mathcal{X} \subset{\mathbb{R}^p}`
to a scalar output :math:`y \in \mathbb{R}`.
From the Bayesian perspective, the prior belief of the simulator is modeled by
a Gaussian process, namely
.. math:: f(\cdot) \sim \mathcal{GP}(m(\cdot), C(\cdot,\cdot)).
   :label: GP
The Gaussian process is fully specified by its mean function :math:`m(\cdot)`
and covariance function :math:`C(\cdot,\cdot)`. The mean function is usually
modeled by parametric regression
.. math:: m(\mathbf{x})=\mathbf{h}^T(\mathbf{x}) \boldsymbol{\beta}=\sum_{t=1}^{q} h_{t}(\mathbf{x}) \beta_{t},
   :label: GPmean
with :math:`q`-dimensional vectors
:math:`\mathbf{h}(\mathbf{x})=\left(h_{1}(\mathbf{x}), h_{2}(\mathbf{x}), \ldots, h_{q}(\mathbf{x})\right)^T`
and :math:`\boldsymbol{\beta}=\left(\beta_{1}, \beta_{2}, \ldots, \beta_{q}\right)^T`
denoting the chosen basis functions and unknown regression parameters respectively.
The basis functions are commonly chosen as constant :math:`\mathbf{h}(\mathbf{x})=1`
or some prescribed polynomials like :math:`\mathbf{h}(\mathbf{x})=(1,x_1,\ldots,x_p)^T`.
The covariance function is often assumed to be stationary and typically has a
separable form of
.. math:: C\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)=\sigma^2 c\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)=\sigma^2 \prod_{l=1}^{p} c_{l}\left(x_{i l}, x_{j l}\right),
   :label: GPcovariance
where :math:`c\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)` is the correlation
function, and :math:`c_{l}\left(x_{i l}, x_{j l}\right)` corresponds to the
correlation between the :math:`l`-th component of :math:`\mathbf{x}_i` and
the counterpart of :math:`\mathbf{x}_j`. Various correlation functions can be
chosen for :math:`c_{l}\left(x_{i l}, x_{j l}\right)`. In what follows, a specific
correlation function from the Matérn family is chosen to illustrate the theory.
It should be noted that it can be replaced by any other correlation function
without loss of generality:

.. math:: c_{l}\left(x_{i l}, x_{j l}\right) = \left(1+\frac{\sqrt{5}||x_{il}-x_{jl}||}{\psi_l}+\frac{5||x_{il}-x_{jl}||^2}{3\psi_{l}^{2}} \right) \exp{\left(-\frac{\sqrt{5}||x_{il}-x_{jl}||}{\psi_l}\right)},
   :label: GPcorrelation
where :math:`\psi_l` is the correlation length parameter in the :math:`l`-th
dimension. The unknowns in the covariance function
:math:`C\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)` are the variance
:math:`\sigma^2` and the :math:`p` correlation length parameters
:math:`\boldsymbol{\psi}=(\psi_1,\ldots,\psi_p)^T`.
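
To make equations :eq:`GPcovariance` and :eq:`GPcorrelation` concrete, below is
a minimal `NumPy` sketch, illustrative only and not part of `psimpy` (the helper
names `matern_52` and `correlation_matrix` are invented here), which evaluates
the Matérn correlation and assembles a correlation matrix from a set of inputs::

    import numpy as np

    def matern_52(x1, x2, psi):
        """Matern 5/2 correlation between two p-dimensional inputs x1 and x2,
        given the p correlation length parameters psi (separable product
        over the input dimensions, cf. the equations above)."""
        d = np.abs(x1 - x2) / psi
        return np.prod((1 + np.sqrt(5) * d + 5 * d**2 / 3)
                       * np.exp(-np.sqrt(5) * d))

    def correlation_matrix(x, psi):
        """n x n correlation matrix with (i, j) element c(x_i, x_j)."""
        n = x.shape[0]
        return np.array([[matern_52(x[i], x[j], psi) for j in range(n)]
                         for i in range(n)])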
Gaussian process posterior
^^^^^^^^^^^^^^^^^^^^^^^^^^
Equations :eq:`GP` to :eq:`GPcorrelation` encode prior knowledge of the simulator
:math:`f(\cdot)`. From the Bayesian viewpoint, this prior knowledge can be updated
to a posterior based on a limited number of training data. Training data here refer
to :math:`n^{tr}` pairs of input-output data of :math:`f(\cdot)`.
Let :math:`\mathbf{x}^{tr}:=\{\mathbf{x}_i\}_{i=1}^{n^{tr}}` and
:math:`\mathbf{y}^{tr}:=\{f(\mathbf{x}_i)\}_{i=1}^{n^{tr}}` denote the training
inputs and outputs respectively. Since any finite number of random variables
from a Gaussian process are jointly Gaussian distributed, the joint distribution
of the :math:`n^{tr}` outputs :math:`\mathbf{y}^{tr}` follows an
:math:`n^{tr}`-dimensional Gaussian distribution
.. math:: p(\mathbf{y}^{tr} | \boldsymbol{\beta}, \sigma^{2}, \boldsymbol{\psi}) \sim \mathcal{N}_{n^{tr}}\left(\mathbf{H}\boldsymbol{\beta}, \sigma^{2} \mathbf{R}\right),
   :label: GPlikelihood1
where :math:`\mathbf{H}=\left[\mathbf{h}(\mathbf{x}_{1}), \ldots, \mathbf{h}(\mathbf{x}_{n^{tr}}) \right]^T`
is the :math:`n^{tr} \times q` basis design matrix and :math:`\mathbf{R}` is the
:math:`n^{tr} \times n^{tr}` correlation matrix with :math:`(i,j)` element
:math:`c(\mathbf{x}_{i}, \mathbf{x}_{j})`. Equation :eq:`GPlikelihood1` is known
as the likelihood function.
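
For fixed parameter values, the likelihood :eq:`GPlikelihood1` can be evaluated
directly. Below is a minimal sketch, reusing the helpers from the previous code
block and assuming a user-supplied basis function `h` (e.g. a constant basis)::

    import numpy as np
    from scipy.stats import multivariate_normal

    def log_likelihood(beta, sigma2, psi, x_tr, y_tr, h):
        """log p(y_tr | beta, sigma2, psi) for fixed parameter values.
        h maps an input x to its q basis values, e.g.
        h = lambda x: np.ones(1) for a constant mean."""
        H = np.array([h(xi) for xi in x_tr])   # n_tr x q basis design matrix
        R = correlation_matrix(x_tr, psi)      # n_tr x n_tr correlation matrix
        return multivariate_normal.logpdf(y_tr, mean=H @ beta, cov=sigma2 * R)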
Similarly, the joint distribution of the function output :math:`y^*` at any
untried input :math:`\mathbf{x}^*` and :math:`\mathbf{y}^{tr}` follows an
:math:`(n^{tr}+1)`-dimensional Gaussian distribution

.. math:: p(y^*, \mathbf{y}^{tr} \mid \boldsymbol{\beta}, \sigma^{2}, \boldsymbol{\psi}) \sim \mathcal{N}_{n^{tr}+1}\left(\left[\begin{array}{c}
   \mathbf{h}^T\left(\mathbf{x}^*\right) \\
   \mathbf{H}
   \end{array}\right] \boldsymbol{\beta}, \sigma^{2}\left[\begin{array}{cc}
   c\left(\mathbf{x}^*, \mathbf{x}^*\right) & \mathbf{r}^T\left(\mathbf{x}^*\right) \\
   \mathbf{r}\left(\mathbf{x}^*\right) & \mathbf{R}
   \end{array}\right]\right),
   :label: jointystarytr
where :math:`\mathbf{r}(\mathbf{x}^*)=\left(c(\mathbf{x}^*, \mathbf{x}_{1}), \ldots, c(\mathbf{x}^*, \mathbf{x}_{n^{tr}}) \right)^T`.
According to the conditioning property of the multivariate Gaussian
distribution, the conditional distribution of :math:`y^*` conditioned on
:math:`\mathbf{y}^{tr}` is again a Gaussian distribution, namely
.. math:: p(y^* | \mathbf{y}^{tr}, \boldsymbol{\beta}, \sigma^{2}, \boldsymbol{\psi}) \sim \mathcal{N} \left(m', \sigma^2 c' \right),
   :label: conditionalGaussian

with mean and correlation

.. math:: m'=\mathbf{h}^T(\mathbf{x}^*) \boldsymbol{\beta} + \mathbf{r}^T(\mathbf{x}^*) \mathbf{R}^{-1} (\mathbf{y}^{tr}-\mathbf{H}\boldsymbol{\beta}),
   :label: conditionalGaussianM

.. math:: c'=c(\mathbf{x}^*, \mathbf{x}^*) - \mathbf{r}^T(\mathbf{x}^*) \mathbf{R}^{-1} \mathbf{r}(\mathbf{x}^*).
   :label: conditionalGaussianC
Equations :eq:`conditionalGaussian` to :eq:`conditionalGaussianC` can be easily
extended to any dimensionality since the joint distribution of the function
outputs at any number of untried inputs and :math:`\mathbf{y}^{tr}` also follows
a multivariate Gaussian distribution. Equations :eq:`conditionalGaussian` to
:eq:`conditionalGaussianC` therefore essentially define the posterior which is an
updated Gaussian process, given the unknown regression parameters
:math:`\boldsymbol{\beta}`, variance :math:`\sigma^2`, and correlation length
parameters :math:`\boldsymbol{\psi}`.
For any new input :math:`\mathbf{x}^*`, the output :math:`y^*` can be almost
instantaneously predicted by the mean :math:`m'`, namely equation :eq:`conditionalGaussianM`.
Moreover, the uncertainty associated with the prediction can be estimated by,
for example, the variance :math:`\sigma^2 c'`. Depending on how the unknown
parameters (:math:`\boldsymbol{\beta}`, :math:`\sigma^2`, and
:math:`\boldsymbol{\psi}`) are learned from the training data, the posterior
can take various forms; see `Section 3.3` of :cite:t:`Zhao2021b`.
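
As an illustration of equations :eq:`conditionalGaussianM` and
:eq:`conditionalGaussianC`, the following sketch computes the posterior mean
and variance at an untried input for fixed parameter values, reusing the
helpers from the code blocks above. Note that this is not how `RobustGaSP`
estimates the parameters; see :cite:t:`Gu2018` for the actual procedure::

    import numpy as np

    def posterior_mean_var(x_star, beta, sigma2, psi, x_tr, y_tr, h):
        """Posterior mean m' and variance sigma2 * c' at an untried input
        x_star, for fixed beta, sigma2 and psi."""
        H = np.array([h(xi) for xi in x_tr])
        R = correlation_matrix(x_tr, psi)
        r = np.array([matern_52(x_star, xi, psi) for xi in x_tr])
        # solve linear systems with R instead of forming R^{-1} explicitly
        m = h(x_star) @ beta + r @ np.linalg.solve(R, y_tr - H @ beta)
        c = 1.0 - r @ np.linalg.solve(R, r)   # c(x*, x*) = 1 for a correlation
        return m, sigma2 * c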
ScalarGaSP Class
----------------

Class :class:`.ScalarGaSP` provides Gaussian process emulation for simulators
which return a scalar value as output, namely :math:`y \in \mathbb{R}`. Please
refer to :cite:t:`Gu2018` for the detailed theory.

It is imported by::

    from psimpy.emulator.robustgasp import ScalarGaSP

Methods
^^^^^^^

.. autoclass:: psimpy.emulator.robustgasp.ScalarGaSP
    :members: train, predict, sample, loo_validate
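
A minimal usage sketch follows. The method names match the class documentation
above, but the constructor argument (`ndim`) and the exact call signatures
shown here are assumptions; consult the API reference for the authoritative
interface::

    import numpy as np
    from psimpy.emulator.robustgasp import ScalarGaSP

    def f(x):
        """Toy simulator with a 1-dimensional input and a scalar output."""
        return x * np.sin(x)

    x_train = np.linspace(0, 10, 8).reshape(-1, 1)   # 8 training inputs, p = 1
    y_train = f(x_train).ravel()                     # 8 scalar training outputs

    emulator = ScalarGaSP(ndim=1)   # constructor arguments are an assumption
    emulator.train(x_train, y_train)

    x_new = np.linspace(0, 10, 100).reshape(-1, 1)
    predictions = emulator.predict(x_new)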
PPGaSP Class
------------

Class :class:`.PPGaSP` provides Gaussian process emulation for simulators
which return multiple values as output, namely :math:`y \in \mathbb{R}^k, k>1`.
Please refer to :cite:t:`Gu2016` for the detailed theory.

It is imported by::

    from psimpy.emulator.robustgasp import PPGaSP

Methods
^^^^^^^

.. autoclass:: psimpy.emulator.robustgasp.PPGaSP
    :members: train, predict, sample
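
Analogously to :class:`.ScalarGaSP`, a minimal usage sketch (again, the
constructor argument and call signatures are assumptions)::

    import numpy as np
    from psimpy.emulator.robustgasp import PPGaSP

    def f(x):
        """Toy simulator with a 1-dimensional input and k = 2 outputs."""
        return np.column_stack((np.sin(x), np.cos(x)))

    x_train = np.linspace(0, 10, 8).reshape(-1, 1)   # 8 training inputs, p = 1
    y_train = f(x_train[:, 0])                       # 8 x 2 training outputs

    emulator = PPGaSP(ndim=1)       # constructor arguments are an assumption
    emulator.train(x_train, y_train)
    predictions = emulator.predict(np.linspace(0, 10, 50).reshape(-1, 1))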