GP with Basis Functions

Learn a function by specifying an explicit set of basis functions and modelling the residuals with a kernel.

class semigp.GP(theta: numpy.ndarray, y: numpy.ndarray, var: float = 1e-05, order: int = 2, x_trans: bool = False, y_trans: bool = False, jitter: float = 1e-10, use_mean: bool = False)[source]

Bases: object

Module to learn a function which maps the inputs to the outputs. There are several important aspects to a semi-parametric Gaussian Process model. The parametric part here is a polynomial function; only order = 1 and order = 2 are currently supported. In addition, a pre-whitening step can be applied at the input level, and the code also supports a log_10 transformation of the targets.

Param

theta (np.ndarray) : matrix of size ntrain x ndim

Param

y (np.ndarray) : output/target

Param

var (float or np.ndarray) : noise variance (float) or noise covariance matrix of size ntrain x ntrain

Param

order (int) : order of the polynomial, 1 or 2 (default: 2)

Param

x_trans (bool) : if True, pre-whitening is applied

Param

y_trans (bool) : if True, the log_10 of the output is used

Param

jitter (float) : a small jitter term added to ensure the matrices are numerically stable

Param

use_mean (bool) : if True, the outputs are centred on zero
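
The transformations controlled by x_trans, y_trans and use_mean can be sketched as follows. This is a minimal illustration using numpy only, not semigp's actual implementation; the class applies these steps internally via do_transformation().

```python
import numpy as np

# Illustrative sketch of the documented transformations: pre-whitening of the
# inputs (x_trans), log_10 of the targets (y_trans) and centring (use_mean).
rng = np.random.default_rng(0)
theta = rng.normal(size=(50, 3)) @ np.diag([1.0, 5.0, 0.2])  # ntrain x ndim
y = np.exp(rng.normal(size=50))                              # positive targets

# Pre-whitening: rotate/scale the inputs so their sample covariance is the identity.
cov = np.cov(theta, rowvar=False)
chol = np.linalg.cholesky(cov)
theta_white = np.linalg.solve(chol, (theta - theta.mean(axis=0)).T).T

# log_10 transformation of the targets, then centring on zero.
y_log = np.log10(y)
y_centred = y_log - y_log.mean()

print(np.allclose(np.cov(theta_white, rowvar=False), np.eye(3)))  # True
```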

compute_basis(test_point: numpy.ndarray = None) → numpy.ndarray[source]

Compute the input basis functions

Param

test_point (np.ndarray) : if a test point is provided, phi_star is calculated

Returns

phi or phi_star (np.ndarray) : the basis functions
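
A second-order polynomial basis of the documented form can be built as below. This is a hypothetical sketch of what compute_basis might produce (constant, linear and quadratic/cross terms); the exact column ordering in semigp may differ.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_basis(x: np.ndarray, order: int = 2) -> np.ndarray:
    """Hypothetical polynomial basis: [1, x_i, x_i * x_j] up to the given order."""
    cols = [np.ones(x.shape[0])]
    cols += [x[:, i] for i in range(x.shape[1])]
    if order == 2:
        for i, j in combinations_with_replacement(range(x.shape[1]), 2):
            cols.append(x[:, i] * x[:, j])
    return np.column_stack(cols)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
phi = poly_basis(x, order=2)
print(phi.shape)  # (2, 6): columns 1, x1, x2, x1^2, x1*x2, x2^2
```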

delete_kernel() → None[source]

Deletes the kernel matrix from the GP module

derivatives(test_point: numpy.ndarray, order: int = 1) → Tuple[numpy.ndarray, numpy.ndarray][source]

If a transformation was applied to the outputs, this function is needed to calculate the ‘exact’ gradient

Param

test_point (np.ndarray) : array of the test point

Param

order (int) : 1 or 2, referring to the first and second derivatives respectively

Returns

grad (np.ndarray) : first derivative with respect to the input parameters

Returns

gradient_sec (np.ndarray) : second derivatives with respect to the input parameters, if specified
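
The reason an ‘exact’ gradient is needed can be seen from the chain rule: if the GP models g = log_10(y), the derivative of y itself picks up a factor ln(10) · y. A minimal check (with g(t) = 2t standing in for the GP mean; not semigp's code):

```python
import numpy as np

ln10 = np.log(10.0)

def g(t):  return 2.0 * t        # stand-in for the GP prediction in log_10 space
def dg(t): return 2.0            # its first derivative
def y(t):  return 10.0 ** g(t)   # prediction mapped back to the original space

def dy(t):
    # chain rule: dy/dt = ln(10) * y(t) * dg/dt
    return ln10 * y(t) * dg(t)

# Verify against a central finite difference.
t, h = 0.3, 1e-6
fd = (y(t + h) - y(t - h)) / (2 * h)
print(np.isclose(dy(t), fd, rtol=1e-4))  # True
```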

do_transformation() → None[source]

Perform all transformations

evidence(params: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray][source]

Calculate the log-evidence of the GP

Param

params (np.ndarray) : kernel hyperparameters

Returns

neg_log_evidence (np.ndarray) : the negative log-marginal likelihood

Returns

-gradient (np.ndarray) : the gradient of the negative log-marginal likelihood with respect to the kernel hyperparameters
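
For reference, a negative log-marginal likelihood of this shape can be sketched as below. The kernel and the parameterisation of params are assumptions (an RBF kernel with log-amplitude and log-lengthscales); semigp's actual kernel may differ, and the gradient is omitted here.

```python
import numpy as np

def neg_log_evidence(params, theta, y, var=1e-5):
    # Assumed parameterisation: params = [log amplitude, log lengthscales...]
    amp, ls = np.exp(params[0]), np.exp(params[1:])
    d = (theta[:, None, :] - theta[None, :, :]) / ls
    kernel = amp * np.exp(-0.5 * np.sum(d**2, axis=-1))   # RBF kernel (assumed)
    ky = kernel + var * np.eye(len(y))                    # add noise variance
    chol = np.linalg.cholesky(ky)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    # 0.5 y^T Ky^{-1} y + 0.5 log|Ky| + (n/2) log(2 pi)
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(chol))) \
        + 0.5 * len(y) * np.log(2 * np.pi)

rng = np.random.default_rng(1)
theta = rng.normal(size=(20, 2))
y = np.sin(theta[:, 0])
print(np.isfinite(neg_log_evidence(np.zeros(3), theta, y)))  # True
```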

fit(method: str = 'CG', bounds: numpy.ndarray = None, options: dict = {'ftol': 1e-05}, n_restart: int = 2) → numpy.ndarray[source]

The kernel hyperparameters are learnt in this function.

Param

method (str) : the choice of optimizer (see
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html);
the L-BFGS-B algorithm is recommended

Param

bounds (np.ndarray) : prior bounds on the kernel hyperparameters

Param

options (dictionary) : options for the L-BFGS-B optimizer; the defaults are:

options={'disp': None,
        'maxcor': 10,
        'ftol': 2.220446049250313e-09,
        'gtol': 1e-05,
        'eps': 1e-08,
        'maxfun': 15000,
        'maxiter': 15000,
        'iprint': - 1,
        'maxls': 20,
        'finite_diff_rel_step': None}

Param

n_restart (int) : number of times we want to restart the optimizer

Returns

opt_params (np.ndarray) : array of the optimised kernel hyperparameters
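
The restart logic can be sketched as: run scipy.optimize.minimize from n_restart random starting points and keep the best result. The toy objective below stands in for the negative log-evidence; it is not semigp's code.

```python
import numpy as np
from scipy.optimize import minimize

def objective(p):
    return np.sum((p - 1.5) ** 2)  # toy stand-in, minimum at p = 1.5

def fit(n_restart=2, method='L-BFGS-B', options={'ftol': 1e-5}):
    rng = np.random.default_rng(42)
    best = None
    for _ in range(n_restart):
        p0 = rng.normal(size=2)                 # random restart point
        res = minimize(objective, p0, method=method, options=options)
        if best is None or res.fun < best.fun:  # keep the best optimum found
            best = res
    return best.x

opt_params = fit(n_restart=2)
print(np.allclose(opt_params, 1.5, atol=1e-2))  # True
```

Restarting guards against the optimizer getting stuck in a poor local optimum of the (generally non-convex) marginal likelihood.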

grad_pre_computations(test_point: numpy.ndarray, order: int = 1) → Tuple[numpy.ndarray, numpy.ndarray][source]

Pre-compute some quantities prior to calculating the gradients

Param

test_point (np.ndarray) : test point in parameter space

Param

order (int) : order of differentiation (default: 1) - not to be confused with order of the polynomial

Returns

gradients (tuple) : first and second derivatives (if order = 2)

inv_noise_cov() → numpy.ndarray[source]

Calculate the inverse of the noise covariance matrix

Returns

mat_inv (np.ndarray) : inverse of the noise covariance

inv_prior_cov() → numpy.ndarray[source]

Calculate the inverse of the prior covariance matrix

Returns

mat_inv (np.ndarray) : inverse of the prior covariance matrix (parametric part)

noise_covariance() → numpy.ndarray[source]

Build the noise covariance matrix

Returns

the initial pre-defined noise variance (either float or matrix)

posterior() → Tuple[numpy.ndarray, numpy.ndarray][source]

Computes the posterior distribution of beta and f (the latent variables)

Note: the kernel hyperparameters should be optimised first

Returns

post_mean (np.ndarray) : posterior mean of the regression coefficients and the residuals

Returns

a_inv_matrix (np.ndarray) : the full covariance matrix of the estimated parameters
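
One standard formulation of the parametric-part posterior in a semi-parametric GP is sketched below (see Rasmussen & Williams, Sec. 2.7); semigp's exact parameterisation may differ. With basis matrix phi (ntrain x nbasis), kernel-plus-noise matrix ky, and prior N(0, lambda_cap * I) on beta:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 30, 3
phi = rng.normal(size=(n, m))                 # basis matrix
beta_true = np.array([0.5, -1.0, 2.0])
y = phi @ beta_true + 0.01 * rng.normal(size=n)

lambda_cap = 1.0
ky = 0.01**2 * np.eye(n)                      # kernel + noise (residuals ~ 0 here)
ky_inv = np.linalg.inv(ky)

# A = Phi^T Ky^{-1} Phi + Lambda^{-1}; posterior of beta is N(A^{-1} Phi^T Ky^{-1} y, A^{-1})
a_matrix = phi.T @ ky_inv @ phi + np.eye(m) / lambda_cap
post_mean_beta = np.linalg.solve(a_matrix, phi.T @ ky_inv @ y)
post_cov_beta = np.linalg.inv(a_matrix)

print(np.allclose(post_mean_beta, beta_true, atol=0.05))  # True
```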

pred_original_function(test_point: numpy.ndarray, n_samples: int = None) → numpy.ndarray[source]

Calculates the original function if the log_10 transformation is used on the target.

Param

test_point (np.ndarray) : the test point in parameter space

Param

n_samples (int) : we can also generate samples of the function (assuming the Cholesky factor has been stored)

Returns

y_samples (np.ndarray) : if n_samples is specified, samples are returned

Returns

y_original (np.ndarray) : the predicted function on the linear scale (original space)
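
The mapping back to the original scale can be sketched as below: exponentiate the log_10 prediction, and draw samples using a stored Cholesky factor of the predictive covariance. The numbers are illustrative, not semigp output.

```python
import numpy as np

mean_pred = np.array([0.5])          # GP mean in log_10 space (illustrative)
chol = np.array([[0.1]])             # Cholesky factor of the predictive covariance

y_original = 10.0 ** mean_pred       # point prediction on the linear scale

# Samples: draw in log_10 space via the Cholesky factor, then transform.
n_samples = 1000
rng = np.random.default_rng(3)
samples_log = mean_pred + (chol @ rng.normal(size=(1, n_samples))).T
y_samples = 10.0 ** samples_log      # samples in the original space

print(np.allclose(y_original, 10**0.5))  # True
```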

prediction(test_point: numpy.ndarray, return_var: bool = False) → Tuple[numpy.ndarray, numpy.ndarray][source]

Predicts the function at a test point in parameter space

Param

test_point (np.ndarray) : test point in parameter space

Param

return_var (bool) : if True, the predicted variance will be computed

Returns

mean_pred (np.ndarray) : the mean of the GP

Returns

var_pred (np.ndarray) : the variance of the GP (optional)
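
For context, the standard GP predictive equations have the following form; note that semigp's prediction additionally includes the parametric (basis-function) contribution, which is omitted in this numpy-only sketch.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    # RBF kernel (assumed for illustration)
    d = (a[:, None, :] - b[None, :, :]) / ls
    return np.exp(-0.5 * np.sum(d**2, axis=-1))

rng = np.random.default_rng(4)
theta = rng.uniform(-2, 2, size=(25, 1))
y = np.sin(theta[:, 0])
test_point = np.array([[0.5]])

ky = rbf(theta, theta) + 1e-6 * np.eye(25)      # kernel + jitter
k_star = rbf(test_point, theta)                 # 1 x ntrain

# mean: k_*^T Ky^{-1} y ; variance: k_** - k_*^T Ky^{-1} k_*
alpha = np.linalg.solve(ky, y)
mean_pred = k_star @ alpha
var_pred = rbf(test_point, test_point) - k_star @ np.linalg.solve(ky, k_star.T)

print(np.allclose(mean_pred, np.sin(0.5), atol=1e-2))  # True
```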

regression_prior(mean: numpy.ndarray = None, cov: numpy.ndarray = None, lambda_cap: float = 1) → None[source]

Specify the regression prior (mean and covariance)

Param

mean (np.ndarray) : prior mean on the regression coefficients (default: zeros)

Param

cov (np.ndarray) : prior covariance matrix (default: identity)

Param

lambda_cap (float) : width of the prior covariance matrix (default 1)
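
A minimal sketch of the documented defaults, assuming lambda_cap scales the prior covariance (nbasis and the helper below are illustrative, not semigp's internals):

```python
import numpy as np

def regression_prior(nbasis, mean=None, cov=None, lambda_cap=1.0):
    # Defaults per the documentation: zero mean, identity covariance,
    # with lambda_cap assumed to set the width of the prior covariance.
    if mean is None:
        mean = np.zeros(nbasis)
    if cov is None:
        cov = np.eye(nbasis)
    return mean, lambda_cap * cov

mu, cov = regression_prior(nbasis=3, lambda_cap=2.0)
print(cov[0, 0])  # 2.0
```

A larger lambda_cap corresponds to a weaker (wider) prior on the regression coefficients.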