Note
Click here to download the full example code
PPGaSP: GP emulation for multi-output functions
This example shows how to apply Gaussian process emulation to a multi-output
function using class PPGaSP
.
The multi-output function that we are going to look at is the DIAMOND (diplomatic and military operations in a non-warfighting domain) computer model. It is used as a testbed to illustrate the PP GaSP emulator in the R package RobustGaSP, see Gu et al. [2019] for more detail.
The simulator has \(13\) input parameters and \(5\) outputs. Namely, \(\mathbf{y}=f(\mathbf{x})\) where \(\mathbf{x}=(x_1,\ldots,x_{13})^T\) and \(\mathbf{y}=(y_1,\ldots,y_5)^T\).
The training and testing data are provided in the folder ‘…/tests/data/’. We first load the training and testing data.
import numpy as np
import os
dir_data = os.path.abspath('../../../tests/data/')
humanityX = np.genfromtxt(os.path.join(dir_data, 'humanityX.csv'), delimiter=',')
humanityY = np.genfromtxt(os.path.join(dir_data, 'humanityY.csv'), delimiter=',')
print(f"Number of training data points: ", humanityX.shape[0])
print(f"Input dimension: ", humanityX.shape[1])
print(f"Output dimension: ", humanityY.shape[1])
humanityXt = np.genfromtxt(os.path.join(dir_data, 'humanityXt.csv'), delimiter=',')
humanityYt = np.genfromtxt(os.path.join(dir_data, 'humanityYt.csv'), delimiter=',')
print(f"Number of testing data points: ", humanityXt.shape[0])
Number of training data points: 120
Input dimension: 13
Output dimension: 5
Number of testing data points: 120
Note
You may need to modify dir_data
according to where you save them
on your local machine.
humanityX
and humanitY
are the training data, corresponding to design
and response
respectively. humanityXt
are testing input data, at which
we are going to make predictions once the emulator is trained. humanityYt
are the true outputs at humanityXt
, which is then used to validate the
performance of the trained emulator.
To build a PP GaSP emulator for the above simulator, first import class
PPGaSP
by
from psimpy.emulator import PPGaSP
Then, create an instance of PPGaSP
. The parameter ndim
(dimension of function input x
) must be specified. Optional parameters, such
as method
, kernel_type
, etc., can be set up if desired. Here, we leave
all the optional parameters to their default values.
emulator = PPGaSP(ndim=humanityX.shape[1])
Next, we train the PP GaSP emulator based on the training data using
PPGaSP.train()
method.
emulator.train(design=humanityX, response=humanityY)
The upper bounds of the range parameters are 296,974 297,1814 294,7672 295,9345 296,563 295,935 297,3198 296,4444 296,9323 298,2373 298,0619 298,7249 298,7249
The initial values of range parameters are 5,93948 5,943628 5,895344 5,918691 5,93126 5,918701 5,946395 5,928888 5,938645 5,964745 5,961238 5,974498 5,974498
Start of the optimization 1 :
The number of iterations is 44
The value of the marginal posterior function is -5279,534
Optimized range parameters are 25,46132 2,891428 5,30812 26,95519 291,1828 48,17274 84,56133 2,874385 39,12121 51,81304 0,5407651 1,728696 1,020605
Optimized nugget parameter is 0
Convergence: TRUE
The initial values of range parameters are 12,37503 12,38367 12,28307 12,33172 12,3579 12,33174 12,38944 12,35296 12,37329 12,42767 12,42037 12,44799 12,44799
Start of the optimization 2 :
The number of iterations is 45
The value of the marginal posterior function is -5279,534
Optimized range parameters are 25,46132 2,891428 5,308121 26,95519 291,183 48,17274 84,56133 2,874385 39,12121 51,81304 0,5407651 1,728696 1,020605
Optimized nugget parameter is 0
Convergence: TRUE
With the trained emulator, we can make predictions for any arbitrary set of
input points using PPGaSP.predict()
method.
Here, we make predictions at testing input points humanityXt
.
predictions = emulator.predict(humanityXt)
We can validate the performance of the trained emulator based on the true outputs
humanityYt
.
import matplotlib.pyplot as plt
fig, ax = plt.subplots(5, 1, figsize=(6,15))
for i in range(humanityY.shape[1]):
ax[i].set_xlabel(f'Actual $y_{i+1}$')
ax[i].set_ylabel(f'Emulator-predicted $y_{i+1}$')
ax[i].set_xlim(np.min(humanityYt[:,i]),np.max(humanityYt[:,i]))
ax[i].set_ylim(np.min(humanityYt[:,i]),np.max(humanityYt[:,i]))
_ = ax[i].plot([np.min(humanityYt[:,i]),np.max(humanityYt[:,i])], [np.min(humanityYt[:,i]),np.max(humanityYt[:,i])])
_ = ax[i].errorbar(humanityYt[:,i], predictions[:,i,0], predictions[:,i,3], fmt='.', linestyle='', label='prediction and std')
_ = ax[i].legend()
plt.tight_layout()

We can also draw any number of samples at testing input humanityXt
using
PPGaSPsample()
method.
samples = emulator.sample(humanityXt, nsamples=10)
print("Shape of samples: ", samples.shape)
Shape of samples: (120, 5, 10)
Total running time of the script: ( 0 minutes 5.333 seconds)