Gaussian Process Acquisition#
This module implements tools for active sampling with Gaussian Process surrogate models.
BatchOptimizer acquisition engine#
NORA acquisition engine#
[TODO]
The GPAcquisition classes take care of proposing new locations at which to evaluate the true function.
- gp_acquisition.builtin_names()[source]#
Lists the names of all built-in acquisition function criteria.
- class gp_acquisition.GenericGPAcquisition(bounds, preprocessing_X=None, verbose=1, acq_func='LogExp')[source]#
Bases: object
Generic class for acquisition objects.
- multi_add(gpr, n_points=1, bounds=None, rng=None)[source]#
Method to query multiple points where the objective function shall be evaluated.
The strategy differs depending on the acquisition class.
When run in parallel (MPI), it must return the same values for all processes.
- Parameters:
gpr (GaussianProcessRegressor) – The GP Regressor which is used as surrogate model.
n_points (int, optional (default=1)) – Number of points to be returned. A value larger than 1 is useful if you can evaluate your objective in parallel, and thus obtain more objective function evaluations per unit of time.
bounds (np.array, optional) – Bounds inside which to look for the next proposals, e.g. the GPR trust region. If not defined, the prior bounds are used.
rng (int or numpy.random.Generator, optional) – The generator used to perform the acquisition process. If an integer is given, it is used as a seed for the default global numpy random number generator.
- Returns:
X (numpy.ndarray, shape = (X_dim, n_points)) – The X values of the found optima
y_lies (numpy.ndarray, shape = (n_points,)) – The predicted values of the GP at the proposed sampling locations
fval (numpy.ndarray, shape = (n_points,)) – The values of the acquisition function at X_opt
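The rng argument accepts either an integer or a numpy.random.Generator. The following is a minimal sketch of how such an argument could be normalized. The helper as_generator is hypothetical (it is not part of gp_acquisition), and it builds a fresh generator with numpy.random.default_rng rather than seeding the global generator mentioned above.

```python
import numpy as np


def as_generator(rng=None):
    """Hypothetical helper: turn an int seed, a Generator or None into a Generator."""
    if isinstance(rng, np.random.Generator):
        return rng
    # default_rng accepts None (fresh entropy) or an integer seed.
    return np.random.default_rng(rng)


# Two processes using the same integer seed draw identical starting points,
# which is one way to obtain the same proposals on every MPI rank.
bounds = np.array([[0.0, 1.0], [-2.0, 2.0]])  # shape (d, 2)
draw_a = as_generator(42).uniform(bounds[:, 0], bounds[:, 1], size=(3, 2))
draw_b = as_generator(42).uniform(bounds[:, 0], bounds[:, 1], size=(3, 2))
```

Passing an existing Generator returns it unchanged, so a caller can share one stream of random numbers across several acquisition steps.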
- class gp_acquisition.BatchOptimizer(bounds, preprocessing_X=None, verbose=1, acq_func='LogExp', proposer=None, acq_optimizer='fmin_l_bfgs_b', n_restarts_optimizer='5d', n_repeats_propose=10)[source]#
Bases: GenericGPAcquisition
Run Gaussian Process acquisition.
Works similarly to a GPRegressor, but instead of optimizing the kernel’s hyperparameters it optimizes the acquisition function in order to find one or multiple points at which the likelihood/posterior should be evaluated next.
It also contains a framework for different lying strategies, which improve performance when multiple processors are available.
Use this class directly if you want to control the iterations of your Bayesian quadrature loop.
- Parameters:
bounds (array) – Bounds in which to optimize the acquisition function, assumed to be of shape (d,2) for a d-dimensional prior.
proposer (Proposer object, optional (default: “PartialProposer”, producing a mixture of points drawn from a “UniformProposer” and from a “CentroidsProposer”)) – Proposes points from which the acquisition function should be optimized.
acq_func (GPry Acquisition Function, dict, optional (default: "LogExp")) – Acquisition function to maximize/minimize. If none is given the LogExp acquisition function will be used. Can also be a dictionary with the name of the acquisition function as the single key, and as value a dict of its arguments.
acq_optimizer (string or callable, optional (default: "auto")) –
Can either be one of the internally supported optimizers for optimizing the acquisition function, specified by a string, or an externally defined optimizer passed as a callable. If a callable is passed, it must have the signature:
    def optimizer(obj_func, initial_guess, bounds):
        # * 'obj_func' is the objective function to be maximized, which
        #   takes the hyperparameters theta as parameter and an
        #   optional flag eval_gradient, which determines if the
        #   gradient is returned additionally to the function value
        # * 'initial_guess': the initial value for X, which can be
        #   used by local optimizers
        # * 'bounds': the bounds on the values of X
        ....
        # Returned are the best found X and
        # the corresponding value of the target function.
        return X_opt, func_min
If set to ‘auto’, either the ‘fmin_l_bfgs_b’ algorithm from scipy.optimize or the ‘sampling’ algorithm is used, depending on whether gradient information is available.
Note
The default optimizers are designed to maximize the acquisition function.
preprocessing_X (X-preprocessor, Pipeline_X, optional (default: None)) – Single preprocessor or pipeline of preprocessors for X. Preprocessing makes sense if the scales along the different dimensions are vastly different, in which case the optimizer struggles to find the maximum of the acquisition function. If None is passed, the data is not preprocessed.
n_restarts_optimizer (int, default=0) –
The number of restarts of the optimizer for finding the maximum of the acquisition function. The first run of the optimizer starts from the last X fit to the model if available; otherwise the starting point is drawn at random.
The remaining runs (if any) start from X values sampled uniformly at random from the space of allowed X values. Note that n_restarts_optimizer == 0 implies that one run is performed.
verbose (1, 2, 3, optional (default: 1)) – Level of verbosity. 3 prints Infos, Warnings and Errors, 2 Warnings and Errors, and 1 only Errors. Should be set to 2 or 3 if problems arise.
- Attributes:
gpr_ (GaussianProcessRegressor) – The GP Regressor which is currently used for optimization.
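The callable form of acq_optimizer described above can be sketched as follows. This is an illustrative gradient-free random search, not one of the built-in optimizers. The factory make_random_search_optimizer and the toy objective are hypothetical names, and the sketch assumes that obj_func(X) returns a scalar for which lower is better (as the returned name func_min suggests); negate the values inside the wrapper if your objective follows the opposite convention.

```python
import numpy as np


def make_random_search_optimizer(n_samples=2000, seed=0):
    """Build a callable with the (obj_func, initial_guess, bounds) signature.

    Assumption: obj_func(X) returns a scalar where lower is better.
    """
    rng = np.random.default_rng(seed)

    def optimizer(obj_func, initial_guess, bounds):
        bounds = np.asarray(bounds, dtype=float)  # shape (d, 2)
        # Candidates: the initial guess plus uniform draws inside the bounds.
        samples = rng.uniform(bounds[:, 0], bounds[:, 1],
                              size=(n_samples, bounds.shape[0]))
        candidates = np.vstack([np.atleast_2d(initial_guess), samples])
        values = np.array([obj_func(x) for x in candidates])
        best = int(np.argmin(values))
        return candidates[best], float(values[best])

    return optimizer


# Toy objective standing in for a (negated) acquisition function.
def toy_objective(x):
    return float(np.sum((x - 0.3) ** 2))


optimizer = make_random_search_optimizer()
X_opt, func_min = optimizer(toy_objective, np.array([0.9, 0.9]),
                            np.array([[0.0, 1.0], [0.0, 1.0]]))
```

The sketch ignores the optional eval_gradient flag, so it only applies to objectives that can be called with X alone.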
- optimize_acquisition_function(gpr, i, bounds=None, rng=None)[source]#
Exposes the optimization method for the acquisition function. When called, it proposes a single point at which to evaluate the true model next. It is called internally by the multi_add() method.
- Parameters:
gpr (GaussianProcessRegressor) – The GP Regressor which is used as surrogate model.
i (int) – Internal counter which is used to enable MPI support. If you want to optimize from a single location and rerun the optimizer from multiple starting locations, loop over this parameter.
rng (numpy.random.Generator, optional) – The generator used for the optimization process.
- Returns:
X_opt (numpy.ndarray, shape = (X_dim,)) – The X value of the found optimum
func (float) – The value of the acquisition function at X_opt
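The restart logic described for n_restarts_optimizer (one run from a given point, the remaining runs from uniformly sampled points) can be illustrated with a small multi-start sketch. It assumes a toy one-dimensional acquisition function and a hypothetical helper maximize_with_restarts; it calls scipy.optimize.fmin_l_bfgs_b directly with numerical gradients and is not the class's internal implementation.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b


def toy_acquisition(x):
    """Stand-in acquisition with two bumps; the taller one is centred at 0.8."""
    return np.exp(-80 * (x[0] - 0.2) ** 2) + 2.0 * np.exp(-80 * (x[0] - 0.8) ** 2)


def maximize_with_restarts(acq, bounds, x_first, n_restarts, rng):
    """One run from x_first plus n_restarts runs from uniform random starts."""
    bounds = np.asarray(bounds, dtype=float)
    starts = [np.asarray(x_first, dtype=float)]
    starts += list(rng.uniform(bounds[:, 0], bounds[:, 1],
                               size=(n_restarts, len(bounds))))
    best_x, best_val = None, -np.inf
    for x0 in starts:
        # L-BFGS-B minimizes, so the acquisition is negated.
        x, neg_val, _ = fmin_l_bfgs_b(lambda z: -acq(z), x0,
                                      approx_grad=True, bounds=bounds.tolist())
        if -neg_val > best_val:
            best_x, best_val = x, -neg_val
    return best_x, float(best_val)


rng = np.random.default_rng(1)
X_opt, func = maximize_with_restarts(toy_acquisition, [[0.0, 1.0]],
                                     x_first=[0.25], n_restarts=20, rng=rng)
```

The first run starts next to the lower bump at 0.2 and would stop there; the random restarts are what let the search reach the taller bump.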
- multi_add(gpr, n_points=1, bounds=None, rng=None, force_resample=False)[source]#
Method to query multiple points where the objective function shall be evaluated. Multiple points are queried using the \(f(x)\sim \mu(x)\) strategy, without changing the hyperparameters of the model.
This is done to increase speed, since the blockwise matrix inversion lemma can then be used to invert the K matrix. The optimization for a single point is done using the optimize_acquisition_function() method.
When run in parallel (MPI), returns the same values for all processes.
- Parameters:
gpr (GaussianProcessRegressor) – The GP Regressor which is used as surrogate model.
n_points (int, optional (default=1)) – Number of points to be returned. A value larger than 1 is useful if you can evaluate your objective in parallel, and thus obtain more objective function evaluations per unit of time.
bounds (np.array, optional) – Bounds inside which to look for the next proposals, e.g. the GPR trust region. If not defined, the prior bounds are used.
rng (int or numpy.random.Generator, optional) – The generator used to perform the acquisition process. If an integer is given, it is used as a seed for the default global numpy random number generator.
- Returns:
X (numpy.ndarray, shape = (X_dim, n_points)) – The X values of the found optima
y_lies (numpy.ndarray, shape = (n_points,)) – The predicted values of the GP at the proposed sampling locations
fval (numpy.ndarray, shape = (n_points,)) – The values of the acquisition function at X_opt
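The combination of the \(f(x)\sim \mu(x)\) lie with the blockwise matrix inversion lemma can be checked numerically. The sketch below uses a hypothetical squared-exponential kernel with unit variance and fixed length scale in plain NumPy; it is not GPry's regressor. It updates the inverse of K when one "lied" point is appended, and shows that the posterior mean is unchanged while the variance at the new point collapses.

```python
import numpy as np


def rbf(A, B, length=0.3):
    """Squared-exponential kernel between point sets A (n, d) and B (m, d)."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length**2)


def block_inverse_update(K_inv, k_new, k_nn):
    """Inverse of [[K, k_new], [k_new.T, k_nn]] given K_inv (Schur complement)."""
    v = K_inv @ k_new                      # shape (n,)
    s = k_nn - k_new @ v                   # scalar Schur complement
    top_left = K_inv + np.outer(v, v) / s
    return np.block([[top_left, -v[:, None] / s],
                     [-v[None, :] / s, np.array([[1.0 / s]])]])


noise = 1e-6
X = np.array([[0.1], [0.4], [0.9]])
y = np.sin(6 * X[:, 0])
K_inv = np.linalg.inv(rbf(X, X) + noise * np.eye(len(X)))

# "Lie": use the GP mean at x_new as if it were the true value,
# keeping the kernel hyperparameters fixed.
x_new = np.array([[0.65]])
k_new = rbf(X, x_new)[:, 0]
y_lie = k_new @ K_inv @ y
K_inv_aug = block_inverse_update(K_inv, k_new, 1.0 + noise)

X_aug = np.vstack([X, x_new])
y_aug = np.append(y, y_lie)
X_test = np.linspace(0, 1, 7)[:, None]
mean_before = rbf(X_test, X) @ K_inv @ y
mean_after = rbf(X_test, X_aug) @ K_inv_aug @ y_aug

# Posterior variance at x_new after conditioning on the lie.
k_aug = rbf(X_aug, x_new)[:, 0]
var_after = 1.0 - k_aug @ K_inv_aug @ k_aug
```

Because the mean does not move, only the predicted standard deviation changes between successive proposals, which is what makes the acquisition function drop near points that were already proposed.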
- class gp_acquisition.NORA(bounds, preprocessing_X=None, verbose=1, acq_func='LogExp', sampler=None, mc_every='1d', nlive_per_training=3, nlive_max='25d', nlive_per_dim_max=None, num_repeats='5d', num_repeats_per_dim=None, precision_criterion_target=0.01, nprior_per_nlive=10, max_ncalls=None, tmpdir=None)[source]#
Bases: GenericGPAcquisition
Run Gaussian Process acquisition with NORA (Nested sampling Optimization for Ranked Acquisition).
Uses kriging believer while it samples the acquisition function using nested sampling (with PolyChord or UltraNest).
- Parameters:
bounds (array) – Bounds in which to optimize the acquisition function, assumed to be of shape (d,2) for a d-dimensional prior.
acq_func (GPry Acquisition Function, dict, optional (default: "LogExp")) – Acquisition function to maximize/minimize. If none is given the LogExp acquisition function will be used. Can also be a dictionary with the name of the acquisition function as the single key, and as value a dict of its arguments.
mc_every (int) – If >1, only calls the MC sampler every mc_every steps, and otherwise reuses the previous X, recomputing y and sigma with the new GPR.
nlive_per_training (int) – Number of live points per sample in the current training set. Decreasing it is not recommended.
nlive_max (int) – Maximum number of live points (cap).
num_repeats (int) – Length of the slice chains.
precision_criterion_target (float) – Cap on the precision criterion of Nested Sampling.
nprior_per_nlive (int) – Number of initial samples times dimension.
preprocessing_X (X-preprocessor, Pipeline_X, optional (default: None)) – Single preprocessor or pipeline of preprocessors for X. Preprocessing makes sense if the scales along the different dimensions are vastly different, in which case the optimizer struggles to find the maximum of the acquisition function. If None is passed, the data is not preprocessed.
verbose (1, 2, 3, optional (default: 1)) – Level of verbosity. 3 prints Infos, Warnings and Errors, 2 Warnings and Errors, and 1 only Errors. Should be set to 2 or 3 if problems arise.
- Attributes:
gpr_ (GaussianProcessRegressor) – The GP Regressor which is currently used for optimization.
- property pool_size#
Size of the pool of points.
- update_NS_precision(gpr)[source]#
Updates NS precision parameters:
- num_repeats: constant for now.
- nlive: nlive_per_training times the size of the training set, capped at nlive_max (typically 25 * dimension).
- precision_criterion: constant for now.
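The rule for nlive can be written out as a short sketch. Both functions below are hypothetical helpers, not part of the NORA class, and the reading of strings such as '25d' as "25 times the dimension" is an assumption inferred from the default nlive_max='25d' and the note "typically 25 * dimension".

```python
def parse_dim_multiple(value, dim):
    """Hypothetical parser: '25d' -> 25 * dim; plain numbers pass through."""
    if isinstance(value, str) and value.endswith("d"):
        return int(float(value[:-1]) * dim)
    return int(value)


def compute_nlive(n_training, dim, nlive_per_training=3, nlive_max="25d"):
    """Live points: nlive_per_training * training-set size, capped at nlive_max."""
    cap = parse_dim_multiple(nlive_max, dim)
    return min(nlive_per_training * n_training, cap)


nlive_early = compute_nlive(n_training=10, dim=4)   # 3 * 10 = 30, below the cap
nlive_late = compute_nlive(n_training=200, dim=4)   # 600 exceeds 25 * 4 = 100
```

Early in a run the number of live points grows with the training set; once the cap is reached it stays constant.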
- log(msg, level=None)[source]#
Prints a message if its verbosity level is equal to or lower than the given one (or always if level=None).
- do_MC_sample(gpr, bounds, rng=None, sampler=None)[source]#
- Returns:
X, y, sigma_y, weights – Any of y, sigma_y and weights may be None.
- last_MC_sample(copy=False, warn_reweight=True)[source]#
Returns the last MC sample as (X, y, sigma_y, weights).
y and sigma_y may be None if they were not computed while sampling; they can be generated with the gpr. If weights is None, all samples should be assumed to have equal weights.
Prints a warning if it is a reweighted sample.
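Code that consumes this tuple has to handle weights being None. A minimal sketch of that convention follows, using a hypothetical helper weighted_mean_and_cov and a hand-written array in place of an actual call to last_MC_sample().

```python
import numpy as np


def weighted_mean_and_cov(X, weights=None):
    """Mean and covariance of samples X (n, d); weights=None means equal weights."""
    X = np.asarray(X, dtype=float)
    if weights is None:
        weights = np.ones(len(X))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    mean = weights @ X
    centred = X - mean
    cov = (centred * weights[:, None]).T @ centred
    return mean, cov


# Stand-in for the tuple returned by last_MC_sample(); here weights is None.
X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
y, sigma_y, weights = None, None, None
mean, cov = weighted_mean_and_cov(X, weights)
```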
- last_MC_sample_getdist(params, warn_reweight=True)[source]#
Returns the last MC sample as a getdist.MCSamples instance.
Prints a warning if it is a reweighted sample.
- multi_add(gpr, n_points=1, bounds=None, rng=None, force_resample=False)[source]#
Method to query multiple points where the objective function shall be evaluated.
Multiple points are queried using the \(f(x)\sim \mu(x)\) strategy, without changing the hyperparameters of the model.
It runs NS on the mean of the GP model, tracking the value of the acquisition function at every evaluation, and keeping a pool of candidates which is re-sorted whenever a new good candidate is found.
When run in parallel (MPI), returns the same values for all processes.
- Parameters:
gpr (GaussianProcessRegressor) – The GP Regressor which is used as surrogate model.
n_points (int, optional (default=1)) – Number of points to be returned. A value larger than 1 is useful if you can evaluate your objective in parallel, and thus obtain more objective function evaluations per unit of time.
bounds (np.array, optional) – Bounds inside which to look for the next proposals, e.g. the GPR trust region. If not defined, the prior bounds are used.
rng (int or numpy.random.Generator, optional) – The generator used to perform the acquisition process. If an integer is given, it is used as a seed for the default global numpy random number generator.
- Returns:
X (numpy.ndarray, shape = (X_dim, n_points)) – The X values of the found optima
y_lies (numpy.ndarray, shape = (n_points,)) – The predicted values of the GP at the proposed sampling locations
fval (numpy.ndarray, shape = (n_points,)) – The values of the acquisition function at X_opt
- class gp_acquisition.RankedPool(size, gpr, acq_func, verbose=1)[source]#
Bases: object
Keeps a ranked pool of sample proposals for Kriging believer, given a GP regressor and an acquisition function.
- Parameters:
size (int) – Target number of sample proposals in the pool.
gpr (GaussianProcessRegressor) – The GP Regressor which is used as surrogate model.
acq_func (callable) – Acquisition function used to rank the pool. Must be a function of (y, sigma) only, with its hyperparameters partially evaluated if necessary.
verbose (1, 2, 3, optional (default: 1)) – Level of verbosity. 3 prints Infos, Warnings and Errors, 2 Warnings and Errors, and 1 only Errors. Should be set to 2 or 3 if problems arise.
- property min_acq#
Minimum acquisition function value in order for a point to be considered, i.e. the conditioned acquisition function value of the last element in the pool.
NB: while the pool is not yet full, empty points are assigned minus infinity as acquisition function value, so one can still use the condition acq_value > RankedPool.min_acq in order to decide whether to add a point.
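The role of the minus-infinity placeholders can be seen in a stripped-down sketch. TinyRankedPool is a hypothetical stand-in that ranks plain acquisition values; unlike RankedPool it involves no GP and applies no Kriging-believer conditioning.

```python
import numpy as np


class TinyRankedPool:
    """Illustrative, unconditioned stand-in for a ranked pool (no GP involved)."""

    def __init__(self, size):
        # Empty slots hold -inf so any finite acquisition value beats them.
        self.acq = np.full(size, -np.inf)

    @property
    def min_acq(self):
        # Acquisition value of the last (worst) element in the pool.
        return self.acq[-1]

    def add(self, acq_value):
        """Insert acq_value if it beats the worst element; return True if added."""
        if not acq_value > self.min_acq:
            return False
        self.acq[-1] = acq_value
        self.acq = np.sort(self.acq)[::-1]  # keep descending order
        return True


pool = TinyRankedPool(size=3)
accepted = [pool.add(v) for v in (0.2, 0.5, 0.1, 0.05, 0.3)]
```

The same comparison works whether the pool is empty, partially filled or full, so the caller needs no special case for the filling phase.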
- log(level=None, msg='')[source]#
Prints a message if its verbosity level is equal to or lower than the given one (or always if level=None).
- str_point(X, y, sigma, acq, sigma_cond=None, acq_cond=None)[source]#
Returns a standardised string to log a point.
- str_pool(include_last=False, last_sorted=None, prefix=None, suffix_last=None)[source]#
Returns a string representation of the current pool.
- log_pool(level=4, include_last=False, last_sorted=None, prefix=None, suffix_last=None)[source]#
Prints the current pool.
- add(X, y=None, sigma=None, acq=None, method='single sort acq')[source]#
Adds points to the pool.
- Parameters:
X (np.ndarray (1- or 2-dimensional)) – Position of the proposed sample.
y (np.ndarray (1 dimension fewer than X) or float, optional) – Predicted value under the GPR.
sigma (np.ndarray (1 dimension fewer than X) or float, optional) – Predicted standard deviation under the GPR. Will be computed if not passed.
acq (np.ndarray (1 dimension fewer than X) or float, optional) – Acquisition function values (unconditioned). Will be computed if not passed.
method ({"single", "single sort acq", "single sort y", "bulk"}) – Uses the one-by-one algorithm (“single”, with pre-sorting according to acq or y if “single sort acq” or “single sort y”), or the bulk algorithm (“bulk”).
- add_bulk(X, y, sigma, acq, i_start=0)[source]#
Tries to fill the pool using a batch of points at once:
Compute their acquisition values conditioned on the positions above (if any).
Pick the best and delete infinities (acq cannot grow with more conditioning).
Place it in the current position, and do a recursive call for the next one.
The advantage of this method with respect to add_one is that it can use vectorization to compute the std’s, but on the other hand it needs to compute many more of them, so it will be better only up to some dimension and some amount of training, after which add_one will take over.
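The bulk strategy can be sketched with a toy GP in NumPy. Everything here is illustrative: the kernel, the helper names and the placeholder acquisition y + log(sigma) are assumptions made for the example (the placeholder is not GPry's LogExp), and the recursion is written as a loop. At each step the conditioned standard deviation is computed for all remaining candidates at once, the best one is picked, and the model is conditioned on it.

```python
import numpy as np


def rbf(A, B, length=0.3):
    """Squared-exponential kernel between point sets A (n, d) and B (m, d)."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length**2)


def conditioned_sigma(X_cond, X_cand, noise=1e-8):
    """Posterior std at X_cand given points X_cond (unit prior variance)."""
    K = rbf(X_cond, X_cond) + noise * np.eye(len(X_cond))
    k = rbf(X_cond, X_cand)
    var = 1.0 - np.sum(k * np.linalg.solve(K, k), axis=0)
    return np.sqrt(np.clip(var, 0.0, None))


def placeholder_acq(y, sigma):
    """Placeholder acquisition of (y, sigma) only; -inf where sigma is zero."""
    with np.errstate(divide="ignore"):
        return y + np.log(sigma)


def fill_pool_bulk(X_train, X_cand, y_cand, size):
    """Greedy batch selection using acquisition values conditioned on earlier picks."""
    remaining = np.arange(len(X_cand))
    X_cond = X_train.copy()
    chosen = []
    while len(chosen) < size and len(remaining) > 0:
        sigma = conditioned_sigma(X_cond, X_cand[remaining])  # vectorized
        acq = placeholder_acq(y_cand[remaining], sigma)
        finite = np.isfinite(acq)  # drop points whose acquisition is -inf
        remaining, acq = remaining[finite], acq[finite]
        if len(remaining) == 0:
            break
        best = int(np.argmax(acq))
        chosen.append(int(remaining[best]))
        # Kriging believer: condition on the pick; the predicted mean is unchanged.
        X_cond = np.vstack([X_cond, X_cand[remaining[best]][None, :]])
        remaining = np.delete(remaining, best)
    return chosen


X_train = np.array([[0.0], [1.0]])
X_cand = np.array([[0.5], [0.55], [0.25], [0.9]])
y_cand = np.zeros(len(X_cand))  # flat predicted mean: sigma alone drives the ranking
chosen = fill_pool_bulk(X_train, X_cand, y_cand, size=2)
```

In this toy setup the candidate at 0.55 has the second-highest unconditioned uncertainty, but it is skipped for the second slot because conditioning on the first pick at 0.5 removes most of its variance.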
- add_one(X, y=None, sigma=None, acq=None, acq_nan_is_null=False)[source]#
Tries to add one point to the pool:
Computes its acquisition function value. (If the pool is not full, just adds it and re-sorts.)
Finds the provisional position of the point in the list, without KB. Notice that once KB is taken into account, the KB-ranked position can only be lower. (If the acq. fun. value is lower than that of the last point, it is discarded.)
Updates its acquisition function value, KB-informed by the points above, finds the new provisional position, and repeats until the position stabilises.
Sorts the list of points below the new one, recursively applying KB.
- Parameters:
X (np.ndarray with 1 dimension) – Position of the proposed sample.
y (float, optional) – Predicted value under the GPR.
sigma (float, optional) – Predicted standard deviation under the GPR.
acq (float, optional) – Value of the acquisition function.
acq_nan_is_null (bool, optional (default: False)) – Whether NaN’s in the acquisition function should be interpreted as null value.
- Raises:
ValueError – If the acquisition function value is invalid, unless acq_nan_is_null=True.
- cache_model(i)[source]#
Cache the GP model that contains the training set plus the pool points up to position i (0-based), with predicted dummy y, keeping the GPR hyperparameters unchanged.
Stores and returns the conditioned gpr (or the original one if i=-1).
- copy(drop_empty=False)[source]#
Returns a copy of the pool, without references to external objects.
If drop_empty=True (default: False), the returned copy has its size reduced to contain just the final set of finite conditioned acquisition points.
- sort(i_start=0)[source]#
Sorts in descending order of acquisition function value, where the acq of the i-th element (0-based) is conditioned on the GPR model that includes the points with j<i with their predicted (mean) y.
If i_start!=0 is given, assumes the upper elements in the list are already sorted following this criterion.
This function assumes that the augmented model just above i_start has not already been cached, and starts by caching it and computing acquisition function values from i_start down.