GP with Basis Functions

Learn a function by specifying an explicit set of basis functions and modelling the residuals with a kernel.

class semigp.GP(theta: numpy.ndarray, y: numpy.ndarray, var: float = 1e-05, order: int = 2, x_trans: bool = False, y_trans: bool = False, jitter: float = 1e-10, use_mean: bool = False)[source]

Bases: object

Module to learn a function which maps the inputs to the outputs. There are several important aspects to a semi-parametric Gaussian Process model. The parametric part here is a polynomial function; only order = 1 and order = 2 are currently supported. In addition, a pre-whitening step can be applied at the input level, and the code also supports a log_10 transformation of the targets.

Param

theta (np.ndarray) : matrix of size ntrain x ndim

Param

y (np.ndarray) : output/target

Param

var (float or np.ndarray) : noise variance (float) or noise covariance matrix of size ntrain x ntrain

Param

order (int) : order of the polynomial, 1 or 2 (default: 2)

Param

x_trans (bool) : if True, pre-whitening is applied

Param

y_trans (bool) : if True, the log_10 of the output is used

Param

jitter (float) : a small jitter term added to ensure the matrices are numerically stable

Param

use_mean (bool) : if True, the outputs are centred on zero
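
The transformations controlled by x_trans, y_trans and use_mean can be sketched as follows. This is a minimal illustration using numpy only, not semigp's actual implementation; the class applies these steps internally via do_transformation().

```python
import numpy as np

# Illustrative sketch of the documented transformations: pre-whitening of the
# inputs (x_trans), log_10 of the targets (y_trans) and centring (use_mean).
rng = np.random.default_rng(0)
theta = rng.normal(size=(50, 3)) @ np.diag([1.0, 5.0, 0.2])  # ntrain x ndim
y = np.exp(rng.normal(size=50))                              # positive targets

# Pre-whitening: rotate/scale the inputs so their sample covariance is the identity.
cov = np.cov(theta, rowvar=False)
chol = np.linalg.cholesky(cov)
theta_white = np.linalg.solve(chol, (theta - theta.mean(axis=0)).T).T

# log_10 transformation of the targets, then centring on zero.
y_log = np.log10(y)
y_centred = y_log - y_log.mean()

print(np.allclose(np.cov(theta_white, rowvar=False), np.eye(3)))  # True
```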

compute_basis(test_point: numpy.ndarray = None) → numpy.ndarray[source]

Compute the input basis functions

Param

test_point (np.ndarray) : if a test point is provided, phi_star is calculated

Returns

phi or phi_star (np.ndarray) : the basis functions
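
A second-order polynomial basis of the documented form can be built as below. This is a hypothetical sketch of what compute_basis might produce (constant, linear and quadratic/cross terms); the exact column ordering in semigp may differ.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_basis(x: np.ndarray, order: int = 2) -> np.ndarray:
    """Hypothetical polynomial basis: [1, x_i, x_i * x_j] up to the given order."""
    cols = [np.ones(x.shape[0])]
    cols += [x[:, i] for i in range(x.shape[1])]
    if order == 2:
        for i, j in combinations_with_replacement(range(x.shape[1]), 2):
            cols.append(x[:, i] * x[:, j])
    return np.column_stack(cols)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
phi = poly_basis(x, order=2)
print(phi.shape)  # (2, 6): columns 1, x1, x2, x1^2, x1*x2, x2^2
```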

delete_kernel() → None[source]

Deletes the kernel matrix from the GP module

derivatives(test_point: numpy.ndarray, order: int = 1) → Tuple[numpy.ndarray, numpy.ndarray][source]

If a transformation was applied to the outputs, this function is needed to calculate the ‘exact’ gradient

Param

test_point (np.ndarray) : array of the test point

Param

order (int) : 1 or 2, referring to the first and second derivatives respectively

Returns

grad (np.ndarray) : first derivative with respect to the input parameters

Returns

gradient_sec (np.ndarray) : second derivatives with respect to the input parameters, if specified
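
The reason an ‘exact’ gradient is needed can be seen from the chain rule: if the GP models g = log_10(y), the derivative of y itself picks up a factor ln(10) · y. A minimal check (with g(t) = 2t standing in for the GP mean; not semigp's code):

```python
import numpy as np

ln10 = np.log(10.0)

def g(t):  return 2.0 * t        # stand-in for the GP prediction in log_10 space
def dg(t): return 2.0            # its first derivative
def y(t):  return 10.0 ** g(t)   # prediction mapped back to the original space

def dy(t):
    # chain rule: dy/dt = ln(10) * y(t) * dg/dt
    return ln10 * y(t) * dg(t)

# Verify against a central finite difference.
t, h = 0.3, 1e-6
fd = (y(t + h) - y(t - h)) / (2 * h)
print(np.isclose(dy(t), fd, rtol=1e-4))  # True
```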

do_transformation() → None[source]

Perform all transformations

evidence(params: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray][source]

Calculate the log-evidence of the GP

Param

params (np.ndarray) : kernel hyperparameters

Returns

neg_log_evidence (np.ndarray) : the negative log-marginal likelihood

Returns

-gradient (np.ndarray) : the gradient of the negative log-marginal likelihood with respect to the kernel hyperparameters
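
For reference, a negative log-marginal likelihood of this shape can be sketched as below. The kernel and the parameterisation of params are assumptions (an RBF kernel with log-amplitude and log-lengthscales); semigp's actual kernel may differ, and the gradient is omitted here.

```python
import numpy as np

def neg_log_evidence(params, theta, y, var=1e-5):
    # Assumed parameterisation: params = [log amplitude, log lengthscales...]
    amp, ls = np.exp(params[0]), np.exp(params[1:])
    d = (theta[:, None, :] - theta[None, :, :]) / ls
    kernel = amp * np.exp(-0.5 * np.sum(d**2, axis=-1))   # RBF kernel (assumed)
    ky = kernel + var * np.eye(len(y))                    # add noise variance
    chol = np.linalg.cholesky(ky)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    # 0.5 y^T Ky^{-1} y + 0.5 log|Ky| + (n/2) log(2 pi)
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(chol))) \
        + 0.5 * len(y) * np.log(2 * np.pi)

rng = np.random.default_rng(1)
theta = rng.normal(size=(20, 2))
y = np.sin(theta[:, 0])
print(np.isfinite(neg_log_evidence(np.zeros(3), theta, y)))  # True
```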

fit(method: str = 'CG', bounds: numpy.ndarray = None, options: dict = {'ftol': 1e-05}, n_restart: int = 2) → numpy.ndarray[source]

The kernel hyperparameters are learnt in this function.

Param

method (str) : the choice of optimizer (see
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html);
the L-BFGS-B algorithm is recommended

Param

bounds (np.ndarray) : prior bounds on the kernel hyperparameters

Param

options (dictionary) : options for the L-BFGS-B optimizer; the defaults are:

options={'disp': None,
        'maxcor': 10,
        'ftol': 2.220446049250313e-09,
        'gtol': 1e-05,
        'eps': 1e-08,
        'maxfun': 15000,
        'maxiter': 15000,
        'iprint': - 1,
        'maxls': 20,
        'finite_diff_rel_step': None}

Param

n_restart (int) : number of times we want to restart the optimizer

Returns

opt_params (np.ndarray) : array of the optimised kernel hyperparameters
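
The restart logic can be sketched as: run scipy.optimize.minimize from n_restart random starting points and keep the best result. The toy objective below stands in for the negative log-evidence; it is not semigp's code.

```python
import numpy as np
from scipy.optimize import minimize

def objective(p):
    return np.sum((p - 1.5) ** 2)  # toy stand-in, minimum at p = 1.5

def fit(n_restart=2, method='L-BFGS-B', options={'ftol': 1e-5}):
    rng = np.random.default_rng(42)
    best = None
    for _ in range(n_restart):
        p0 = rng.normal(size=2)                 # random restart point
        res = minimize(objective, p0, method=method, options=options)
        if best is None or res.fun < best.fun:  # keep the best optimum found
            best = res
    return best.x

opt_params = fit(n_restart=2)
print(np.allclose(opt_params, 1.5, atol=1e-2))  # True
```

Restarting guards against the optimizer getting stuck in a poor local optimum of the (generally non-convex) marginal likelihood.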

grad_pre_computations(test_point: numpy.ndarray, order: int = 1) → Tuple[numpy.ndarray, numpy.ndarray][source]

Pre-compute some quantities prior to calculating the gradients

Param

test_point (np.ndarray) : test point in parameter space

Param

order (int) : order of differentiation (default: 1) - not to be confused with order of the polynomial

Returns

gradients (tuple) : first and second derivatives (if order = 2)

inv_noise_cov() → numpy.ndarray[source]

Calculate the inverse of the noise covariance matrix

Returns

mat_inv (np.ndarray) : inverse of the noise covariance

inv_prior_cov() → numpy.ndarray[source]

Calculate the inverse of the prior covariance matrix

Returns

mat_inv (np.ndarray) : inverse of the prior covariance matrix (parametric part)

noise_covariance() → numpy.ndarray[source]

Build the noise covariance matrix

Returns

the initial pre-defined noise variance (either float or matrix)

posterior() → Tuple[numpy.ndarray, numpy.ndarray][source]

Computes the posterior distribution of beta and f (the latent variables)

Note: the kernel hyperparameters should be optimised first

Returns

post_mean (np.ndarray) : posterior mean of the regression coefficients and the residuals

Returns

a_inv_matrix (np.ndarray) : the full covariance matrix of the estimated parameters
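
One standard formulation of the parametric-part posterior in a semi-parametric GP is sketched below (see Rasmussen & Williams, Sec. 2.7); semigp's exact parameterisation may differ. With basis matrix phi (ntrain x nbasis), kernel-plus-noise matrix ky, and prior N(0, lambda_cap * I) on beta:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 30, 3
phi = rng.normal(size=(n, m))                 # basis matrix
beta_true = np.array([0.5, -1.0, 2.0])
y = phi @ beta_true + 0.01 * rng.normal(size=n)

lambda_cap = 1.0
ky = 0.01**2 * np.eye(n)                      # kernel + noise (residuals ~ 0 here)
ky_inv = np.linalg.inv(ky)

# A = Phi^T Ky^{-1} Phi + Lambda^{-1}; posterior of beta is N(A^{-1} Phi^T Ky^{-1} y, A^{-1})
a_matrix = phi.T @ ky_inv @ phi + np.eye(m) / lambda_cap
post_mean_beta = np.linalg.solve(a_matrix, phi.T @ ky_inv @ y)
post_cov_beta = np.linalg.inv(a_matrix)

print(np.allclose(post_mean_beta, beta_true, atol=0.05))  # True
```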

pred_original_function(test_point: numpy.ndarray, n_samples: int = None) → numpy.ndarray[source]

Calculates the original function if the log_10 transformation is used on the target.

Param

test_point (np.ndarray) : the test point in parameter space

Param

n_samples (int) : we can also generate samples of the function (assuming the Cholesky factor has been stored)

Returns

y_samples (np.ndarray) : if n_samples is specified, samples are returned

Returns

y_original (np.ndarray) : the predicted function on the linear scale (original space)
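
The mapping back to the original scale can be sketched as below: exponentiate the log_10 prediction, and draw samples using a stored Cholesky factor of the predictive covariance. The numbers are illustrative, not semigp output.

```python
import numpy as np

mean_pred = np.array([0.5])          # GP mean in log_10 space (illustrative)
chol = np.array([[0.1]])             # Cholesky factor of the predictive covariance

y_original = 10.0 ** mean_pred       # point prediction on the linear scale

# Samples: draw in log_10 space via the Cholesky factor, then transform.
n_samples = 1000
rng = np.random.default_rng(3)
samples_log = mean_pred + (chol @ rng.normal(size=(1, n_samples))).T
y_samples = 10.0 ** samples_log      # samples in the original space

print(np.allclose(y_original, 10**0.5))  # True
```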

prediction(test_point: numpy.ndarray, return_var: bool = False) → Tuple[numpy.ndarray, numpy.ndarray][source]

Predicts the function at a test point in parameter space

Param

test_point (np.ndarray) : test point in parameter space

Param

return_var (bool) : if True, the predicted variance will be computed

Returns

mean_pred (np.ndarray) : the mean of the GP

Returns

var_pred (np.ndarray) : the variance of the GP (optional)
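
For context, the standard GP predictive equations have the following form; note that semigp's prediction additionally includes the parametric (basis-function) contribution, which is omitted in this numpy-only sketch.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    # RBF kernel (assumed for illustration)
    d = (a[:, None, :] - b[None, :, :]) / ls
    return np.exp(-0.5 * np.sum(d**2, axis=-1))

rng = np.random.default_rng(4)
theta = rng.uniform(-2, 2, size=(25, 1))
y = np.sin(theta[:, 0])
test_point = np.array([[0.5]])

ky = rbf(theta, theta) + 1e-6 * np.eye(25)      # kernel + jitter
k_star = rbf(test_point, theta)                 # 1 x ntrain

# mean: k_*^T Ky^{-1} y ; variance: k_** - k_*^T Ky^{-1} k_*
alpha = np.linalg.solve(ky, y)
mean_pred = k_star @ alpha
var_pred = rbf(test_point, test_point) - k_star @ np.linalg.solve(ky, k_star.T)

print(np.allclose(mean_pred, np.sin(0.5), atol=1e-2))  # True
```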

regression_prior(mean: numpy.ndarray = None, cov: numpy.ndarray = None, lambda_cap: float = 1) → None[source]

Specify the regression prior (mean and covariance)

Param

mean (np.ndarray) : prior mean on the regression coefficients (default: zeros)

Param

cov (np.ndarray) : prior covariance matrix (default: identity)

Param

lambda_cap (float) : width of the prior covariance matrix (default 1)
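
A minimal sketch of the documented defaults, assuming lambda_cap scales the prior covariance (nbasis and the helper below are illustrative, not semigp's internals):

```python
import numpy as np

def regression_prior(nbasis, mean=None, cov=None, lambda_cap=1.0):
    # Defaults per the documentation: zero mean, identity covariance,
    # with lambda_cap assumed to set the width of the prior covariance.
    if mean is None:
        mean = np.zeros(nbasis)
    if cov is None:
        cov = np.eye(nbasis)
    return mean, lambda_cap * cov

mu, cov = regression_prior(nbasis=3, lambda_cap=2.0)
print(cov[0, 0])  # 2.0
```

A larger lambda_cap corresponds to a weaker (wider) prior on the regression coefficients.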