Skip to content
Snippets Groups Projects
Commit b7f67bd9 authored by Hu Zhao's avatar Hu Zhao
Browse files

docs: remove outdated files

parent 3e038d48
Branches
Tags
No related merge requests found
```{include} ../CONTRIBUTING.md
```
\ No newline at end of file
%% Cell type:markdown id:3bae4a70 tags:
# Gaussian process emulation
This example demonstrates how to conduct Gaussian process emulation using our wrapper of the R package RobustGaSP.
%% Cell type:markdown id:894be984 tags:
## Single-output case
Let's look at an example with a single output model given by the equation below:
$y = x + 3\sin(x/2)$
%% Cell type:markdown id:ea09e63c tags:
### Import
First, we need to import the class `ScalarGaSP` from the module `psimpy.emulator.robustgasp`.
%% Cell type:code id:17e42357 tags:
``` python
from psimpy.emulator.robustgasp import ScalarGaSP
```
%% Cell type:markdown id:6e0c72a7 tags:
### Instantiation
Then we need to create a new instance of `ScalarGaSP`. In order to initialize `ScalarGaSP`, training input points `design` and corresponding training outputs `response` need to be defined. One can also specify optional arguments such as `trend`, `nugget_est`, `method`, `kernel_type`, etc.
Below we create a new instance of `ScalarGaSP` using six selected points from $y = x + 3\sin(x/2)$.
%% Cell type:code id:093b140d tags:
``` python
import math
import numpy as np
def f(x):
return x + 3*np.sin(x/2)
x = np.arange(0,16,3)
y = f(x)
emulator = ScalarGaSP(design=x, response=y)
```
%% Cell type:markdown id:4832524f tags:
### Training
Next, we need to train the `ScalarGaSP` emulator.
%% Cell type:code id:f70b21ab tags:
``` python
emulator.train()
```
%% Output
The upper bounds of the range parameters are 3839.746
The initial values of range parameters are 76.79491
Start of the optimization 1 :
The number of iterations is 7
The value of the marginal posterior function is -11.13375
Optimized range parameters are 11.02656
Optimized nugget parameter is 0
Convergence: TRUE
The initial values of range parameters are 0.8333333
Start of the optimization 2 :
The number of iterations is 9
The value of the marginal posterior function is -11.13375
Optimized range parameters are 11.02656
Optimized nugget parameter is 0
Convergence: TRUE
%% Cell type:markdown id:27c8db21 tags:
### Validation
We can validate the performance of the trained emulator using additional validation data. In case that
there is no extra computational budget to generate validation data, the leave-one-out cross validation
can be used. It is implemented by the `loo_validate()` method.
%% Cell type:code id:f602b061 tags:
``` python
predictions = emulator.loo_validate()
```
%% Cell type:markdown id:2df7e4d9 tags:
Let's ilustraste emulator predictions vs actual outputs:
%% Cell type:code id:ee605292 tags:
``` python
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(4,4))
ax.plot([np.min(y)-1,np.max(y)+1], [np.min(y)-1,np.max(y)+1])
ax.errorbar(y,predictions[:,0],predictions[:,1],fmt='.', linestyle='')
ax.set_xlabel('Actual output')
ax.set_ylabel('Emulator prediction')
```
%% Output
Text(0, 0.5, 'Emulator prediction')
%% Cell type:markdown id:ebcd4774 tags:
### Prediction
With the trained emulator at our deposit, we can use the `predict()` method to make predictions at
any arbitrary set of input points `testing_input`. The `testing_trend` should be set according to
the `trend` used for emulator training.
The `predict()` method returns a matrix consisting of four columns:
- predictions[:,0] - mean
- predictions[:,1] - lower bound of the 95% confidence interval
- predictions[:,2] - upper bound of the 95% confidence interval
- predictions[:,3] - standard deviation
%% Cell type:code id:a4fd6bbc tags:
``` python
testing_input = np.arange(0,15,0.1)
predictions = emulator.predict(testing_input)
plt.plot(testing_input, predictions[:, 0], 'r-', label= "mean")
plt.fill_between(testing_input, predictions[:, 1], predictions[:, 2], alpha=0.3)
plt.xlabel('x')
plt.ylabel('emulator prediction')
_ = plt.legend()
```
%% Output
%% Cell type:markdown id:20c9346b tags:
### Sampling
We can also draw any number of samples at the desirable input points `testing_input` using the `sample()` method.
%% Cell type:code id:baf358cc tags:
``` python
num_sample = 1
samples = emulator.sample(testing_input, num_sample=num_sample)
for i in range(num_sample):
plt.plot(testing_input, samples[:,i], label=f'sample{i+1}')
plt.xlabel('x')
plt.ylabel('y')
_ = plt.legend()
```
%% Output
%% Cell type:markdown id:6e078033 tags:
## Multi-output case
For the multi-output case, let's have a look at the data set from the `DIAMOND` simulator included in the `RobustGaSP` package.
%% Cell type:code id:daa3cefc tags:
``` python
humanityX = np.genfromtxt('../../tests/data/humanityX.csv', delimiter=',')
humanityXt = np.genfromtxt('../../tests/data/humanityXt.csv', delimiter=',')
humanityY = np.genfromtxt('../../tests/data/humanityY.csv', delimiter=',')
```
%% Cell type:markdown id:d76e05c9 tags:
### Import
First, we need to import the class `PPGaSP` from the module `psimpy.emulator.robustgasp`.
%% Cell type:code id:3267da6c tags:
``` python
from psimpy.emulator.robustgasp import PPGaSP
```
%% Cell type:markdown id:26d1a75f tags:
### Instantiation
Then we need to create a new instance of `PPGaSP` similar as what we did in the single output case.
%% Cell type:code id:ca25ee2c tags:
``` python
emulator = PPGaSP(design=humanityX, response=humanityY)
```
%% Cell type:markdown id:5b6211b5 tags:
### Training
Next, we need to train the `PPGaSP` emulator.
%% Cell type:code id:859f5799 tags:
``` python
emulator.train()
```
%% Output
The upper bounds of the range parameters are 296.974 297.1814 294.7672 295.9345 296.563 295.935 297.3198 296.4444 296.9323 298.2373 298.0619 298.7249 298.7249
The initial values of range parameters are 5.93948 5.943628 5.895344 5.918691 5.93126 5.918701 5.946395 5.928888 5.938645 5.964745 5.961238 5.974498 5.974498
Start of the optimization 1 :
The number of iterations is 44
The value of the marginal posterior function is -5279.534
Optimized range parameters are 25.46132 2.891428 5.30812 26.95519 291.1828 48.17274 84.56133 2.874385 39.12121 51.81304 0.5407651 1.728696 1.020605
Optimized nugget parameter is 0
Convergence: TRUE
The initial values of range parameters are 12.37503 12.38367 12.28307 12.33172 12.3579 12.33174 12.38944 12.35296 12.37329 12.42767 12.42037 12.44799 12.44799
Start of the optimization 2 :
The number of iterations is 45
The value of the marginal posterior function is -5279.534
Optimized range parameters are 25.46132 2.891428 5.308121 26.95519 291.183 48.17274 84.56133 2.874385 39.12121 51.81304 0.5407651 1.728696 1.020605
Optimized nugget parameter is 0
Convergence: TRUE
%% Cell type:markdown id:5ffec175 tags:
### Prediction
Now using the `predict()` method, we can make predictions for any arbitrary set of input points `testing_input`.
%% Cell type:code id:2cfd8d13 tags:
``` python
predictions = emulator.predict(humanityXt)
```
%% Cell type:markdown id:1ea6b470 tags:
### Validation
We can validate the performance of the trained emulator based on the validation data.
%% Cell type:code id:0fc7a66a tags:
``` python
humanityYt = np.genfromtxt('../../tests/data/humanityYt.csv', delimiter=',')
fig, ax = plt.subplots(figsize=(4,4))
ax.plot(humanityYt, predictions[:, :, 0], 'o')
ax.set_xlabel('Actual output')
_ = ax.set_ylabel('Emulator prediction')
```
%% Output
%% Cell type:markdown id:b9c199fa tags:
### Sampling
Using the `sample()` method, we can draw any number of samples at the desirable input points `testing_input`.
%% Cell type:code id:c1c51f8a tags:
``` python
samples = emulator.sample(humanityXt, num_sample=2)
```
%% Cell type:markdown id:74f2899d tags:
# Run Simulator
This example demonstrates how to execute a simulator (in serial or parallel) at a set of samples.
%% Cell type:markdown id:9a8d41a8 tags:
### Import
First, we need to import the class `RunSimulator` from the module `psimpy.simulator.run_simulator`.
%% Cell type:code id:727743d3 tags:
``` python
from psimpy.simulator.run_simulator import RunSimulator
```
%% Cell type:markdown id:5b18c1d6 tags:
### Instantiation
Then we need to create a new class that inherets from the class `RunSimulator` and override the abstract method `simulator()`. The `simulator()` method cannot have any positional-only arguments, has to contain the `**kwargs` argument and it has to return the results in form of a numpy array.
Here we're going to use the method `run` of the class `MPM` from the module `psimpy.simulator.mass_point_model` as our simulator. This method solves the mass point model using the ode solver `scipy.integrate.ode` given following inputs: topography, friction parameters, and initial condition of the mass point (location and velocity).
To do this, we first define the child class `MpmSolver`, override the `__init__` method to create a new instance of `MPM`, then we override the `simulator()` method in order to return the outputs of the method `run()` from the `MPM` class.
%% Cell type:code id:fb475422 tags:
``` python
from psimpy.simulator.mass_point_model import MPM
class MpmSolver(RunSimulator):
def __init__(self, *args, **kwargs):
self.mpm = MPM()
super(MpmSolver, self).__init__(*args, **kwargs)
def simulator(self, elevation, coulomb_friction, turbulent_friction, x0, y0,
ux0=0, uy0=0, dt=1, tend=300, t0=0, g=9.8, atol=1e-4, rtol=1e-4,
curvature=False, **kwargs):
return self.mpm.run(elevation=elevation, coulomb_friction=coulomb_friction,
turbulent_friction=turbulent_friction, x0=x0, y0=y0, ux0=ux0,
uy0=uy0, dt=dt, tend=tend, t0=t0, g=g, atol=atol, rtol=rtol,
curvature=curvature)
```
%% Cell type:markdown id:3546d6b4 tags:
Now we can create a new instance of `MpmSolver`. In order to initialize `MpmSolver`, which has the same `__init__()` method as `RunSimulator`, the user needs to define the parent folder to save simulation outputs `sim_dir`, a list consists of all variable input names `var_inp_name` and a dictionary consists of all fixed input name-value pairs `fix_inp`.
%% Cell type:code id:bd2936f9 tags:
``` python
import os
sim_dir = os.path.join(os.path.abspath(''), "../../tests/sim_log")
fix_inp = {
"elevation" : os.path.join(os.path.abspath(''), "../../tests/data/mpm_topo.asc"),
"x0" : 800,
"y0" : 2000,
"ux0" : 1e-10,
"uy0" : 1e-10,
"dt" : 2,
"tend" : 600
}
var_inp_name=["coulomb_friction", "turbulent_friction"]
mpmsolver = MpmSolver(sim_dir=sim_dir, var_inp_name=var_inp_name, fix_inp=fix_inp)
```
%% Cell type:markdown id:b1535843 tags:
`coulomb_friction` and `turbulent_friction` are variable inputs and the other inputs are fixed. We may vary `coulomb_friction` between 0.1 to 0.3 and `turbulent_friction` between 500 to 2000.
%% Cell type:code id:6e2270ee tags:
``` python
import numpy as np
import itertools
cf = np.arange(0.1, 0.3, 0.05)
tf = np.arange(500, 2000, 150)
var_samples = np.array([x for x in itertools.product(cf, tf)])
```
%% Cell type:markdown id:8f7da917 tags:
## Serial Execution
In order to perform a serial execution of the simulator, we need to pass `var_samples` to the `serial_run()` method. The user can also specify optional arguments like `prefixes`, `append` or `save_out`. Here we like to save the results with the "serial" prefixes.
%% Cell type:code id:31f7328b tags:
``` python
serial_prefixes = ["serial"+str(i) for i in range(len(var_samples))]
mpmsolver.serial_run(var_samples=var_samples, prefixes=serial_prefixes, save_out=True)
```
%% Cell type:markdown id:8149f7b0 tags:
To check the results one could either check `mpmsolver.outputs` or load the files available in `sim_dir`.
## Parallel Execution
Now let's run the same simulator in parallel. Once again we prepare samples of the variable inputs.
%% Cell type:code id:d5a94bbf tags:
``` python
cf = np.arange(0.1, 0.30, 0.04)
tf = np.arange(500, 2000, 100)
var_samples = np.array([x for x in itertools.product(cf, tf)])
```
%% Cell type:markdown id:c87aeaf8 tags:
This time we will pass the first 40 samples and then append the rest. We can also specify the maximum number of tasks running in parallel `max_workers` as well as `prefixes`, `append` and `save_out`.
%% Cell type:code id:9e772a13 tags:
``` python
mpmsolver.parallel_run(var_samples=var_samples[:40], max_workers=2)
```
%% Cell type:markdown id:f05e6619 tags:
Now, let's append the rest of the samples.
%% Cell type:code id:127faf15 tags:
``` python
mpmsolver.parallel_run(var_samples=var_samples[40:], append=True, max_workers=2)
```
%% Cell type:markdown id:6e83cd2f tags:
Same as serial execution, we can check the results either by using `mpmsolver.outputs` or loading the files available in `sim_dir`.
%% Cell type:markdown id:1a9ca136 tags:
# Latin Hypercube Sampling
This example demonstrates how to draw samples using the Latin hypercube sampler.
%% Cell type:markdown id:deb749b4 tags:
## Import
First, we need to import the class `LHS` from the module `psimpy.sampler.latin`.
%% Cell type:code id:4c78bba4-eb05-4ddb-a425-ba2ebe43048a tags:
``` python
from psimpy.sampler.latin import LHS
```
%% Cell type:markdown id:23866440 tags:
## Instantiation
Then we need to create a new instance of `LHS`. In order to initialize `LHS`,
the user needs to define the dimension of parameters `d`. The user can also
specify optional arguments including `bounds`, `seed`, and `criterion`.
Below we create a new instance of `LHS` for a two-dimensional problem, with the
other three arguments taking default values.
%% Cell type:code id:120945ce tags:
``` python
sampler = LHS(2)
```
%% Cell type:markdown id:63617382 tags:
## Draw samples
Now that we created our sampler, we can draw any number of samples using
the method `sample()`. Below 15 samples from the two-dimensional parameter space
are drawn.
%% Cell type:code id:f63fea15 tags:
``` python
samples = sampler.sample(15)
samples
```
%% Output
array([[0.77079997, 0.26794708],
[0.55721183, 0.52528333],
[0.9769387 , 0.25009117],
[0.34294648, 0.34372567],
[0.153813 , 0.58300717],
[0.61467445, 0.99535443],
[0.41090769, 0.03218058],
[0.09325676, 0.68595842],
[0.05398797, 0.89904516],
[0.90967252, 0.16312322],
[0.69818303, 0.10100304],
[0.47076742, 0.62137445],
[0.27380325, 0.43684061],
[0.25991701, 0.84513203],
[0.85445131, 0.77589623]])
%% Cell type:markdown id:30e22288 tags:
## Plot
Let's illustrate the samples using matplotlib library.
%% Cell type:code id:faabd74b tags:
``` python
import matplotlib.pyplot as plt
plt.scatter(samples[:, 0], samples[:, 1])
plt.title("Latin hypercube samples")
plt.show()
```
%% Output
%% Cell type:markdown id:9d3726e8 tags:
## Optional arguments
Now let's look at a second example.<br>
Here we like to pass specific `seed`, `bounds`, and `criterion` when we create
the sampler. If we choose 'maximin' as the criterion, we can specify the number
of itrations when we call the `sample()` method.
%% Cell type:code id:ad6fbc68 tags:
``` python
from psimpy.sampler.latin import LHS
import matplotlib.pyplot as plt
sampler = LHS(2, bounds=[[1, 20], [0, 2]], seed=123, criterion='maximin')
samples = sampler.sample(40, iteration=1000)
plt.scatter(samples[:, 0], samples[:, 1])
plt.title('Latin hypercube samples')
plt.show()
```
%% Output
%% Cell type:markdown id:3bae4a70 tags:
# Saltelli's extension of the Sobol sequence
This example demonstrates how to draw samples using Saltelli's extension of
the Sobol sequence.
%% Cell type:markdown id:894be984 tags:
## Import
First, we need to import the class `Saltelli` from the module `psimpy.sampler.saaltelli`.
%% Cell type:code id:42cb1b83 tags:
``` python
from psimpy.sampler.saltelli import Saltelli
```
%% Cell type:markdown id:6e0c72a7 tags:
## Instantiation
Then we need to create a new instance of `Saltelli`. In order to initialize `Saltelli`,
the user needs to define the dimension of parameters `d`. The user can also
specify the optional argument `bounds`.
Below we create a new instance of `Saltelli` for a two-dimensional problem, with
`bounds` set to default.
%% Cell type:code id:7ad79500 tags:
``` python
sampler = Saltelli(2)
```
%% Cell type:markdown id:4832524f tags:
## Draw samples
Now that we created our sampler, we can draw samples by calling the method `sample()`.
`sample()` has one mandatory argument `nsamples` and two optional arguments `calc_second_order`
and `skip_values`.
We draw samples by setting `nsamples=4` and keep the two optional arguments as default values
as follows.
%% Cell type:code id:f70b21ab tags:
``` python
samples = sampler.sample(4)
samples
```
%% Output
array([[0.09375, 0.46875],
[0.46875, 0.46875],
[0.09375, 0.65625],
[0.09375, 0.65625],
[0.46875, 0.46875],
[0.46875, 0.65625],
[0.59375, 0.96875],
[0.96875, 0.96875],
[0.59375, 0.15625],
[0.59375, 0.15625],
[0.96875, 0.96875],
[0.96875, 0.15625],
[0.84375, 0.21875],
[0.21875, 0.21875],
[0.84375, 0.90625],
[0.84375, 0.90625],
[0.21875, 0.21875],
[0.21875, 0.90625],
[0.34375, 0.71875],
[0.71875, 0.71875],
[0.34375, 0.40625],
[0.34375, 0.40625],
[0.71875, 0.71875],
[0.71875, 0.40625]])
%% Cell type:markdown id:27c8db21 tags:
## Plot
Let's illustrate the samples using matplotlib library.
%% Cell type:code id:f602b061 tags:
``` python
import matplotlib.pyplot as plt
plt.scatter(samples[:, 0], samples[:, 1])
plt.title("Saltelli samples")
plt.show()
```
%% Output
%% Cell type:markdown id:b418875d tags:
## Optional arguments
Now let's look at a second example.<br>
Here we have specific bounds for the two parameters and want to specify the `bounds`
when create the sampler. And we don't want to calculate the second order sensitivity
indices, so we set `calc_second_order` to False. We also specify the number of points
to skip `skit_values`in the Sobol' sequence. This number should be ideally a value of base 2.
%% Cell type:code id:a4fd6bbc tags:
``` python
from psimpy.sampler.saltelli import Saltelli
import matplotlib.pyplot as plt
sampler = Saltelli(2, bounds=[[0, 8], [0, 5]])
samples = sampler.sample(64, calc_second_order=False, skip_values=128)
plt.scatter(samples[:, 0], samples[:, 1])
plt.title('Saltelli samples')
plt.show()
```
%% Output
......@@ -6,7 +6,6 @@
:hidden:
changelog.md
contributing.md
conduct.md
autoapi/index
```
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment