diff --git a/docs/source/emulator/index.rst b/docs/source/emulator/index.rst
index 7d6d34d0e851f5d921e4ef74fe70b8a0699c359f..1ffa6d7964ca6912ba27df50e5ad4d4c3e946898 100644
--- a/docs/source/emulator/index.rst
+++ b/docs/source/emulator/index.rst
@@ -20,7 +20,7 @@ Currently implemented classes in this module are:
   multiple values as output. 
 
 .. toctree::
-   :maxdepth: 1
+   :maxdepth: 2
    :hidden:
    :caption: Emulator
 
diff --git a/docs/source/emulator/robustgasp.rst b/docs/source/emulator/robustgasp.rst
index 007587ac1d9a02da6c7fb02b12acec28b5148b3b..d5c587f6053101966beb1ca7b94205e259b9c268 100644
--- a/docs/source/emulator/robustgasp.rst
+++ b/docs/source/emulator/robustgasp.rst
@@ -2,13 +2,18 @@ RobustGaSP
 ==========
 
 `RobustGaSP`, standing for `Robust Gaussian Stochastic Process Emulation`, is a
-`R` package developed by :cite:t:`Gu2019`. In this module,
-we have implemented class :class:`.ScalarGaSP` and  class :class:`.PPGaSP` 
-based on `RobustGaSP` and `rpy2`, which allows us to use `RobustGaSP` directly
-from within `Python`.
+`R` package developed by :cite:t:`Gu2019`. It implements robust methods to
+estimate the unknown parameters, which can greatly improve the predictive
+performance of the built emulator. In addition, it implements the `PP GaSP`
+(parallel partial Gaussian stochastic process) emulator for simulators that
+return multiple values as output; see :cite:t:`Gu2016` for more details.
 
-Theory of Gaussian process emulation
-------------------------------------
+In :py:mod:`PSimPy.emulator.robustgasp`, we have implemented the classes
+:class:`.ScalarGaSP` and :class:`.PPGaSP` based on `RobustGaSP` and
+`rpy2`, which allow us to use `RobustGaSP` directly from within `Python`.
+
+Theory recap of GP emulation
+----------------------------
 
 A simulator represents a mapping from an input space to an output space.
 Gaussian process emulation treats a computationally expensive simulator as an
@@ -19,8 +24,8 @@ can then be used for computationally demanding analyses :cite:p:`Zhao2021a`. A r
 of Gaussian process emulation is provided below, which is adapted from `Chapter 3`
 of :cite:t:`Zhao2021b`.
 
-Gaussian process prior
-^^^^^^^^^^^^^^^^^^^^^^
+GP prior
+^^^^^^^^
 
 Let :math:`f(\cdot)` denote a simulator that one wants to approximate. It
 defines a mapping from a :math:`p`-dimensional input
@@ -71,8 +76,8 @@ dimension. The unknowns in the covariance function
 :math:`\sigma^2` and the :math:`p` correlation length parameters
 :math:`\boldsymbol{\psi}=(\psi_1,\ldots,\psi_p)^T`.
 
-Gaussian process posterior
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+GP posterior
+^^^^^^^^^^^^
 
 Equations :eq:`GP` to :eq:`GPcorrelation` encode prior knowledge of the simulator
 :math:`f(\cdot)`. From the Bayesian viewpoint, this prior knowledge can be updated
@@ -133,13 +138,44 @@ updated Gaussian process, given the unknown regression parameters
 parameters :math:`\boldsymbol{\psi}`. 
 
 For any new input :math:`\mathbf{x}^*`, the output :math:`y^*` can be almost
-instantaneously predicted by the mean :math:`m'`, namely equation :eq:`conditionalGaussianM`.
-Moreover, the uncertainty associated with the prediction can be estimated by
-for example the variance :math:`\sigma^2c'`. Depending on how the unknown parameters
-(:math:`\boldsymbol{\beta}`, :math:`\sigma^2`, and :math:`\boldsymbol{\psi}`)
+instantaneously predicted by the mean :math:`m'`, namely equation
+:eq:`conditionalGaussianM`. Moreover, the uncertainty associated with the
+prediction can be estimated by, for example, the variance :math:`\sigma^2c'`.
+Depending on how the unknown parameters (:math:`\boldsymbol{\beta}`, :math:`\sigma^2`, and :math:`\boldsymbol{\psi}`)
 are learned based on the training data, the posterior can have various forms, see
 `Section 3.3` of :cite:t:`Zhao2021b`.
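The prediction step described above can be sketched in plain `NumPy`. The constant trend, the squared-exponential correlation function, and the fixed values of :math:`\boldsymbol{\beta}`, :math:`\sigma^2`, and :math:`\boldsymbol{\psi}` below are illustrative assumptions for a minimal example; they are not the parameter-estimation procedure of `RobustGaSP`.

```python
import numpy as np

def correlation(x1, x2, psi):
    """Correlation between two inputs (squared exponential, assumed form)."""
    d = (x1 - x2) / psi
    return np.exp(-np.sum(d * d))

# toy training data: n points in a p = 1 dimensional input space
X = np.linspace(0.0, 1.0, 6).reshape(-1, 1)        # training inputs
y = np.sin(2 * np.pi * X[:, 0])                    # training outputs
H = np.ones((len(X), 1))                           # constant trend h(x) = 1
beta = np.zeros(1)                                 # trend coefficients (assumed known)
sigma2, psi = 1.0, np.array([0.3])                 # variance and correlation length

# n x n correlation matrix R of the training inputs, with a small jitter
# on the diagonal for numerical stability
R = np.array([[correlation(a, b, psi) for b in X] for a in X])
R_inv = np.linalg.inv(R + 1e-10 * np.eye(len(X)))

def predict(x_new):
    """Posterior mean m' and variance sigma^2 * c' at a new input x_new."""
    r = np.array([correlation(x_new, b, psi) for b in X])
    h = np.ones(1)
    m_prime = h @ beta + r @ R_inv @ (y - H @ beta)
    c_prime = 1.0 - r @ R_inv @ r
    return m_prime, sigma2 * c_prime

mean, var = predict(np.array([0.5]))
```

Note that at a training input the posterior mean reproduces the observed output and the predictive variance collapses to (numerically) zero, which is the interpolating behaviour of a noise-free Gaussian process emulator.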
 
+Sample from GP
+^^^^^^^^^^^^^^
+A Gaussian process, either the prior or the posterior, defines a distribution over
+functions. Once its mean function :math:`m(\cdot)` and covariance function
+:math:`C(\cdot,\cdot)` are specified, the Gaussian process can be evaluated at a
+finite number :math:`s` of input points :math:`\{\mathbf{x}_i\}_{i=1}^{s}`.
+The joint distribution of :math:`\{y_i=f(\mathbf{x}_i)\}_{i=1}^{s}` follows an
+:math:`s`-variate Gaussian distribution. A sample from this :math:`s`-variate Gaussian
+distribution, denoted as :math:`\tilde{\mathbf{y}}:=(\tilde{y}_1,\ldots,\tilde{y}_s)^T`,
+can be viewed as a sample of the Gaussian process (evaluated at a finite number of points).
+To generate :math:`\tilde{\mathbf{y}}`, the :math:`s \times s` covariance matrix
+:math:`\boldsymbol{\Sigma}` needs to be decomposed into the product of a lower triangular
+matrix :math:`\mathbf{L}` and its transpose :math:`\mathbf{L}^T` using the Cholesky
+decomposition
+
+.. math:: \boldsymbol{\Sigma}=\left(\begin{array}{cccc}
+    C\left(\mathbf{x}_{1}, \mathbf{x}_{1}\right) & C\left(\mathbf{x}_{1}, \mathbf{x}_{2}\right) & \ldots & C\left(\mathbf{x}_{1}, \mathbf{x}_{s}\right) \\
+    C\left(\mathbf{x}_{2}, \mathbf{x}_{1}\right) & C\left(\mathbf{x}_{2}, \mathbf{x}_{2}\right) & \ldots & C\left(\mathbf{x}_{2}, \mathbf{x}_{s}\right) \\
+    \vdots & \vdots & \ddots & \vdots \\
+    C\left(\mathbf{x}_{s}, \mathbf{x}_{1}\right) & \ldots & \ldots & C\left(\mathbf{x}_{s}, \mathbf{x}_{s}\right)
+    \end{array}\right)=\mathbf{L}\mathbf{L}^T.
+    :label: Choleskydecomposition
+
+Then a sample :math:`\tilde{\mathbf{y}}` can be obtained as follows:
+
+.. math:: \tilde{\mathbf{y}} = (m(\mathbf{x}_1),\ldots,m(\mathbf{x}_s))^T + \mathbf{L}\mathbf{w},
+    :label: GPsample
+
+where :math:`\mathbf{w}:=(w_1,\dots,w_s)^T` denotes an :math:`s`-dimensional random vector
+consisting of :math:`s` independent standard normal random variables :math:`w_i, i \in \{1,\ldots,s\}`.
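The two equations above translate almost directly into `NumPy`. In this sketch the zero mean function and the squared-exponential covariance are illustrative assumptions; a small jitter is added to the diagonal so that the Cholesky factorization stays numerically stable.

```python
import numpy as np

rng = np.random.default_rng(42)

s = 50
x = np.linspace(0.0, 1.0, s)              # evaluation points x_1, ..., x_s
m = np.zeros(s)                           # mean function evaluated at x (assumed zero)
sigma2, psi = 1.0, 0.2                    # assumed variance and correlation length

# s x s covariance matrix Sigma with entries C(x_i, x_j)
# (squared-exponential covariance, chosen for illustration)
Sigma = sigma2 * np.exp(-((x[:, None] - x[None, :]) / psi) ** 2)

# Cholesky decomposition Sigma = L L^T; the jitter keeps Sigma positive definite
jitter = 1e-10 * np.eye(s)
L = np.linalg.cholesky(Sigma + jitter)

# y_tilde = m + L w, with w a vector of s independent standard normals
w = rng.standard_normal(s)
y_tilde = m + L @ w
```

Each call with a fresh `w` yields a new sample path of the Gaussian process evaluated at the :math:`s` points.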
+
 
 ScalarGaSP Class
 ----------------
@@ -172,4 +208,4 @@ It is imported by::
 Methods
 ^^^^^^^
 .. autoclass:: psimpy.emulator.robustgasp.PPGaSP
-    :members: train, predict, sample
\ No newline at end of file
+    :members: train, predict, sample