Acquisition Functions
This module implements an “Acquisition Function” class whose structure is similar to that of the kernels module of sklearn. In addition to a set of built-in base AFs, the module overloads arithmetic operators for AFs so that composite AFs can be constructed.
Note
Bear in mind that you first need to create an instance of an acquisition function before calling it. Composite acquisition functions are possible too, e.g.:
from acquisition_functions import ConstantAcqFunc, Mu, Std
af = ConstantAcqFunc(2) * Mu() + (-3) * Std() ** 2.5
Warning
Currently only the +, * and ** operators are supported, but I am sure you can figure out how to
work around using - and / on your own ;)
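As a rough illustration of that workaround: a difference can be written as a sum with a factor of -1, and a quotient as a product with the power -1. The sketch below uses a hypothetical stand-in class `ToyAF` (not part of this module) that, like the AFs here, only defines +, * and **, so that it runs without the library installed.

```python
class ToyAF:
    """Hypothetical stand-in for an acquisition function (not from this module).

    It wraps a plain function of x and only defines +, * and **,
    mirroring the operators listed above.
    """

    def __init__(self, func):
        self.func = func

    def __call__(self, x):
        return self.func(x)

    @staticmethod
    def _wrap(other):
        # Promote plain numbers to constant functions.
        return other if isinstance(other, ToyAF) else ToyAF(lambda x: other)

    def __add__(self, other):
        other = self._wrap(other)
        return ToyAF(lambda x: self(x) + other(x))

    __radd__ = __add__

    def __mul__(self, other):
        other = self._wrap(other)
        return ToyAF(lambda x: self(x) * other(x))

    __rmul__ = __mul__

    def __pow__(self, exponent):
        return ToyAF(lambda x: self(x) ** exponent)


a = ToyAF(lambda x: x + 3.0)
b = ToyAF(lambda x: 2.0 * x)

difference = a + (-1) * b   # plays the role of a - b
quotient = a * b ** (-1)    # plays the role of a / b

print(difference(1.0), quotient(1.0))
```

The same two patterns should carry over to real AFs, assuming the module accepts plain numbers as operands in the way its Sum and Product descriptions suggest.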
Base Class
All acquisition functions are derived from this class. If you want to define
your own acquisition function, it needs to inherit from this class. A tutorial
on how to define such a class is given in AcquisitionFunction.
| AcquisitionFunction | Base class for all Acquisition Functions (AFs). |
Inbuilt Acquisition Functions
All inbuilt acquisition functions can evaluate the gradient in addition to the value of the acquisition function at point X.
The recommended acquisition function, which we derived in order to efficiently sample the parameter
space, is called LogExp:
| LogExp | Acquisition function which is designed to efficiently sample log-probability distributions. |
Furthermore, there are more inbuilt acquisition functions (building blocks) which should
offer a great deal of flexibility. If you want to define your own acquisition function,
it needs to inherit from the parent class AcquisitionFunction.
| ConstantAcqFunc | Constant acquisition function. |
| Mu | \(\mu(X)\) of the surrogate model. |
| Std | \(\sigma(X)\) of the surrogate model. |
| ExponentialMu | \(\exp[\mu(X)]\) of the surrogate model. |
| ExponentialStd | \(\exp[\sigma(X)]\) of the surrogate model. |
| ExpectedImprovement | Computes the (negative) expected improvement function. |
Additional things
The tools listed here should not be needed in normal operation.
| is_acquisition_function | Determines if a given object is an acquisition function or not. |
| Hyperparameter | An acquisition function hyperparameter's specification in form of a namedtuple. |
| AcquisitionFunctionOperator | Base class for all AF operators. |
| Sum | Overloads the + operator for two or more AFs. |
| Product | Overloads the * operator for two or more AFs. |
| Exponentiation | Defines the exponentiation of an AF with a real number. |
- acquisition_functions.builtin_names()
Lists the names of all built-in acquisition functions.
- class acquisition_functions.AcquisitionFunction
Base class for all Acquisition Functions (AFs). All acquisition functions are derived from this class.
Currently several acquisition functions are supported, which should be versatile enough for most tasks. If, however, you want to specify a custom acquisition function, it should be a class which inherits from this abstract class. This class needs to be of the format:
from acquisition_functions import AcquisitionFunction, Hyperparameter

class CustomAcqFunc(AcquisitionFunction):
    def __init__(self, param_1, ..., fixed=..., dimension=...):
        # * 'param_i': The hyperparameters of the custom
        #   acquisition function.
        # * 'fixed': Whether the hyperparameters of the acquisition
        #   function are to be kept fixed.
        # * 'dimension': The dimensionality of the target function,
        #   which can be used to automatically adapt hyperparameters.
        # * 'hasgradient': Whether the acquisition function can return
        #   a gradient. Set it to True if the acquisition function can
        #   return gradient(s) and to False otherwise.
        self.param_1 = param_1
        ...
        self.fixed = fixed
        self.hasgradient = True  # or False

    @property
    def hyperparameter_param_1(self):
        # Returns the type of the hyperparameter and whether it is
        # fixed or not. This method needs to exist for every
        # hyperparameter.
        return Hyperparameter("param_1", "numeric", fixed=self.fixed)

    def __call__(self, X, gp, eval_gradient=False):
        # * 'X': The value(s) at which the acquisition function is
        #   evaluated.
        # * 'gp': The surrogate GP model which shall be used.
        # * 'eval_gradient': Whether the gradient shall be returned.
        #   Only required if 'self.hasgradient' is True.
        ...
        # Return the value(s) of the acquisition function at
        # point(s) X and optionally their gradient(s).
Once the Acquisition function is defined in this way it can be used with the same operators as the inbuilt acquisition functions.
Note
If one of the operands of a composite acquisition function cannot return a gradient, the composite as a whole cannot return one either. Optimizers that require gradients therefore cannot be used in this case.
- get_params(deep=True)
Get hyperparameters of this Acquisition function.
- set_params(**params)
Set the parameters of this acquisition function. The method works on simple AFs as well as on nested AFs. The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Any number of parameters which shall be set. Should be of the form {"parameter_1_name" : parameter_1_value, "parameter_2_name" : parameter_2_value, ...}
- Return type:
self
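To make the `<component>__<parameter>` convention concrete, here is a small sketch of how such keys can be split into a component and the remaining parameter name. The helper `split_nested_params` and the keys in the example (`k1__a`, `k2__xi`, `exponent`) are illustrative assumptions, not functions or guaranteed parameter names of this module.

```python
def split_nested_params(params):
    """Group {'component__parameter': value} keys by component.

    Keys without '__' are treated as parameters of the top-level object
    and are collected under the key None.
    """
    grouped = {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault(None, {})[key] = value
    return grouped


grouped = split_nested_params({"k1__a": 2.0, "k2__xi": 0.05, "exponent": 2.5})
print(grouped)
```

Because only the first `__` is split off, deeper nesting can be handled by applying the same helper again to the remaining sub-key.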
- clone_with_theta(theta)
Returns a clone of self with given hyperparameters theta.
- Parameters:
theta (ndarray of shape (n_dims,)) – The hyperparameters
- check_X(X)
Internal method to check the dimensionality of any input X provided to an AF when called. Checks the correct type and turns X into a 2d array if a 1d array is provided.
Warning
This method only checks for the correct type of an input; inappropriate values might still cause problems.
- Parameters:
X (ndarray of shape (n_samples, n_dims) or (n_dims,)) – The input array for any X value passed to an acquisition function
- Returns:
X_new – The reshaped array of input data X
- Return type:
ndarray of shape (n_samples, n_dims)
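The reshaping behaviour described above can be sketched with NumPy as follows. The function name `as_2d` is hypothetical and this is not the module's own implementation of `check_X`; it only illustrates turning a single point of shape (n_dims,) into an array of shape (1, n_dims).

```python
import numpy as np


def as_2d(X):
    """Return X as an array of shape (n_samples, n_dims).

    A 1d input of shape (n_dims,) is interpreted as a single sample.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(1, -1)
    elif X.ndim != 2:
        raise ValueError(f"Expected a 1d or 2d array, got {X.ndim} dimensions.")
    return X


print(as_2d([0.1, 0.2, 0.3]).shape)    # (1, 3)
print(as_2d(np.zeros((5, 3))).shape)   # (5, 3)
```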
- property n_dims
Returns the number of non-fixed hyperparameters of the acquisition function.
- property hyperparameters
Returns a list of all hyperparameter specifications.
- property theta
Returns the (flattened, log-transformed) non-fixed hyperparameters.
Note that theta typically holds the log-transformed values of the acquisition function’s hyperparameters, as this representation of the search space is more amenable to hyperparameter search: hyperparameters like length-scales naturally live on a log-scale.
- Returns:
theta – The non-fixed, log-transformed hyperparameters of the acquisition function.
- Return type:
ndarray of shape (n_dims,)
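A minimal sketch of what "log-transformed" means here, using made-up hyperparameter values. How the module stores and updates theta internally is not shown in this reference, so the snippet only demonstrates the transformation itself.

```python
import numpy as np

values = np.array([0.5, 2.0, 10.0])   # hypothetical non-fixed hyperparameter values
theta = np.log(values)                # log-transformed representation
recovered = np.exp(theta)             # back to the original scale

print(theta)
print(np.allclose(recovered, values))
```

One consequence of this representation is that a fixed additive step in theta corresponds to a fixed multiplicative factor in the hyperparameter itself.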
- property hasgradient
Specifies whether a certain acquisition function can return gradients or not.
- abstract __call__(X, gp, eval_gradient=False)
Evaluate the acquisition function.
- class acquisition_functions.ConstantAcqFunc(constant_value=1.0, fixed=False, dimension=None)
Constant acquisition function.
Can be used as part of a product, where it scales the magnitude of the other factor, or as part of a sum.
\[A_f(X) = \mathrm{constant\_value} \;\forall\; X\]
- Parameters:
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
- class acquisition_functions.Mu(a=1.0, fixed=False, dimension=None)
\(\mu(X)\) of the surrogate model.
\[A_f(X) = a\cdot\mu(X)\]
- Parameters:
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
- class acquisition_functions.ExponentialMu(a=1.0, fixed=False, dimension=None)
\(\exp[\mu(X)]\) of the surrogate model.
\[A_f(X) = \exp(a\cdot\mu(X))\]
- Parameters:
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
- class acquisition_functions.Std(a=1.0, fixed=False, dimension=None)
\(\sigma(X)\) of the surrogate model.
\[A_f(X) = a\cdot\sigma(X)\]
- Parameters:
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
- class acquisition_functions.ExponentialStd(a=1.0, fixed=False, dimension=None)
\(\exp[\sigma(X)]\) of the surrogate model.
\[A_f(X) = \exp(a\cdot\sigma(X))\]
- Parameters:
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
- class acquisition_functions.ExpectedImprovement(xi=0.01, fixed=False, dimension=None)
Computes the (negative) expected improvement (EI) function.
The conditional probability P(y=f(x) | x) forms a Gaussian with a certain mean and standard deviation approximated by the model.
The EI criterion is derived by computing \(E[u(f(x))]\) where \(u(f(x)) = 0\) if \(f(x) > y_{\mathrm{opt}}\) and \(u(f(x)) = y_{\mathrm{opt}} - f(x)\) if \(f(x) < y_{\mathrm{opt}}\).
This solves one of the issues of the probability of improvement (PI) criterion by giving a reward proportional to the amount of improvement obtained.
- Parameters:
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
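For a Gaussian predictive distribution, the expectation \(E[u(f(x))]\) defined above has a well-known closed form, \((y_{\mathrm{opt}} - \mu)\,\Phi(z) + \sigma\,\phi(z)\) with \(z = (y_{\mathrm{opt}} - \mu)/\sigma\). The sketch below evaluates it for scalar inputs. It is not the class's own code: the function name is made up, the `xi` parameter is left out, and no sign flip is applied, since this reference does not spell out how the class handles either.

```python
import math


def expected_improvement(mu, sigma, y_opt):
    """Closed-form E[u(f(x))] for f(x) ~ N(mu, sigma^2), with
    u = max(y_opt - f(x), 0) as defined above (no xi offset)."""
    if sigma <= 0.0:
        # Without uncertainty the expectation is just the utility itself.
        return max(y_opt - mu, 0.0)
    z = (y_opt - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (y_opt - mu) * cdf + sigma * pdf


print(expected_improvement(mu=0.0, sigma=1.0, y_opt=0.0))
```

When the predicted mean equals the current optimum, the result reduces to \(\sigma/\sqrt{2\pi}\), so the reward then comes purely from the model's uncertainty.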
- class acquisition_functions.BaseLogExp(zeta=None, sigma_n=None, fixed=False, dimension=None, zeta_scaling=0.85, linear=True)
Acquisition function which is designed to efficiently sample log-probability distributions. This is achieved by transforming \(\tilde{\mu}\cdot\tilde{\sigma}\) (of the true, non-logarithmic probability distribution) to logarithmic space.
- Parameters:
zeta (float, default=1) – Controls the exploration-exploitation tradeoff. The value of \(\zeta\) should not exceed 1 under normal circumstances, as a value <1 accounts for the fact that the GP’s estimate for \(\mu\) is not correct at the beginning. A good suggestion for setting zeta, inspired by simulated annealing, is
\[\zeta = \exp(-N_0/N)\]
where \(N_0\geq 0\) is a “decay constant” and \(N\) is the number of training points in the GP.
sigma_n (float, default=None) – The (constant) noise level of the data. If set to None, the square root of alpha of the training data (or the square root of the mean of alpha if alpha is an array) will be used.
fixed (bool, default=False) – Whether zeta and sigma_n shall be fixed or not.
dimension (double, default=None) – The dimension of the parameter space, used for auto-scaling zeta.
zeta_scaling (double, default=0.85) – The scaling power of zeta with dimension, if auto-scaled.
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
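The annealing-style suggestion \(\zeta = \exp(-N_0/N)\) from the parameter description can be sketched as a small helper. The function name and the example value of the decay constant are assumptions for illustration; the module may or may not provide such a helper itself.

```python
import math


def zeta_schedule(n_train, n_0):
    """zeta = exp(-N_0 / N) for N training points and decay constant N_0 >= 0."""
    if n_train <= 0:
        raise ValueError("n_train must be positive.")
    if n_0 < 0:
        raise ValueError("n_0 must be non-negative.")
    return math.exp(-n_0 / n_train)


for n in (5, 20, 100):
    print(n, round(zeta_schedule(n, n_0=10.0), 3))
```

With few training points zeta stays well below 1, and it approaches 1 as the number of training points grows, in line with trusting the GP's mean more once it has seen more data.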
- class acquisition_functions.LogExp(zeta=None, sigma_n=None, fixed=False, dimension=None, zeta_scaling=0.85, linear=True)
Acquisition function which is designed to efficiently sample log-probability distributions. This is achieved by transforming \(\tilde{\mu}\cdot\tilde{\sigma}\) (of the true, non-logarithmic probability distribution) to logarithmic space, which yields
\[A_{\mathrm{LE}}(X) = \exp(2\zeta\cdot\mu(X))\cdot (\sigma(X)-\sigma_n)\]
For numerical convenience we take the log of this expression, which yields:
\[\log A_{\mathrm{LE}}(X) = 2\zeta\cdot\mu(X) + \log(\sigma(X)-\sigma_n)\]
Note
\(\mu(X)\) and \(\sigma(X)\) are the mean and standard deviation of the GP regressor, which follows the log-probability distribution.
- Parameters:
zeta (float, default=1) – Controls the exploration-exploitation tradeoff. The value of \(\zeta\) should not exceed 1 under normal circumstances, as a value <1 accounts for the fact that the GP’s estimate for \(\mu\) is not correct at the beginning. A good suggestion for setting zeta, inspired by simulated annealing, is
\[\zeta = \exp(-N_0/N)\]
where \(N_0\geq 0\) is a “decay constant” and \(N\) is the number of training points in the GP.
sigma_n (float, default=None) – The (constant) noise level of the data. If set to None, the square root of alpha of the training data (or the square root of the mean of alpha if alpha is an array) will be used.
fixed (bool, default=False) – Whether zeta and sigma_n shall be fixed or not.
dimension (double, default=None) – The dimension of the parameter space, used for auto-scaling zeta.
zeta_scaling (double, default=0.85) – The scaling power of zeta with dimension, if auto-scaled.
- static f(mu, std, baseline, noise_level, zeta)
Linearized exponentiated log-error bar.
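The logarithmic form \(2\zeta\cdot\mu(X) + \log(\sigma(X)-\sigma_n)\) given above can be evaluated directly with NumPy. The sketch below is not the static method `f` of this class (whose `baseline` argument is not described here); the function name and the choice to return -inf where \(\sigma \leq \sigma_n\) are assumptions made for illustration.

```python
import numpy as np


def log_exp_value(mu, sigma, sigma_n, zeta):
    """log(A_LE) = 2 * zeta * mu + log(sigma - sigma_n).

    Assumption: points where sigma <= sigma_n get -inf, i.e. they are
    never preferred over points with excess uncertainty.
    """
    mu = np.asarray(mu, dtype=float)
    excess = np.asarray(sigma, dtype=float) - sigma_n
    positive = excess > 0
    safe = np.where(positive, excess, 1.0)   # avoid log of non-positive numbers
    return 2.0 * zeta * mu + np.where(positive, np.log(safe), -np.inf)


mu = np.array([-1.0, -3.0, -1.0])      # GP mean of the log-probability
sigma = np.array([0.5, 2.0, 0.05])     # GP standard deviation
print(log_exp_value(mu, sigma, sigma_n=0.1, zeta=1.0))
```

In this toy input the first point wins over the second despite its smaller uncertainty, because the factor \(2\zeta\) on the mean weights the higher predicted log-probability strongly.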
- class acquisition_functions.NonlinearLogExp(zeta=None, sigma_n=None, fixed=False, dimension=None, zeta_scaling=0.85, linear=True)
Warning
The gradients for this acquisition function are not yet implemented correctly. Use with caution!
An alternative approach which keeps both scales exponentiated:
\[A_{\mathrm{LE}}(X) = \exp(2\zeta\cdot\mu(X))\cdot \exp(\sigma(X)-\sigma_n)\]
Again we take the log of this.
- Parameters:
zeta (float, default=1) – Controls the exploration-exploitation tradeoff. The value of \(\zeta\) should not exceed 1 under normal circumstances, as a value <1 accounts for the fact that the GP’s estimate for \(\mu\) is not correct at the beginning. A good suggestion for setting zeta, inspired by simulated annealing, is
\[\zeta = \exp(-N_0/N)\]
where \(N_0\geq 0\) is a “decay constant” and \(N\) is the number of training points in the GP.
sigma_n (float, default=None) – The (constant) noise level of the data. If set to None, the square root of alpha of the training data (or the square root of the mean of alpha if alpha is an array) will be used.
fixed (bool, default=False) – Whether zeta and sigma_n shall be fixed or not.
dimension (double, default=None) – The dimension of the parameter space, used for auto-scaling zeta.
zeta_scaling (double, default=0.85) – The scaling power of zeta with dimension, if auto-scaled.
- static f(mu, std, baseline, noise_level, zeta)
Exponentiated log-error bar.
- acquisition_functions.is_acquisition_function(acq_func)
Determines if a given object is an acquisition function or not.
- Parameters:
acq_func (Any) – The object which shall be examined
- Returns:
is_acquisition_function – whether the specified object is an acquisition function.
- Return type:
bool
- class acquisition_functions.Hyperparameter(name, value_type, n_elements=1, fixed=False)
An acquisition function hyperparameter’s specification in form of a namedtuple. This formalism is copied from the kernel module of Scikit-Learn.
Note
The current code does not support optimization of any hyperparameters of the acquisition functions. This might be added in the future (which is why the fixed parameter exists).
- Attributes:
name (str) – The name of the hyperparameter. Unlike in the kernels for the GPRegressor, the hyperparameters of the acquisition functions do not have bounds.
value_type (str) – The type of the hyperparameter. Currently, only 'numeric' hyperparameters are supported.
n_elements (int, default=1) – The number of elements of the hyperparameter value. Defaults to 1, which corresponds to a scalar hyperparameter. n_elements > 1 corresponds to a hyperparameter which is vector-valued, such as, e.g., anisotropic length-scales.
fixed (bool, default=False) – Whether the value of this hyperparameter is fixed, i.e., cannot be changed during hyperparameter tuning.
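To show what such a namedtuple-based specification looks like in practice, the following sketch builds a stand-in with the same four fields and defaults. `HyperparameterSketch` is a hypothetical name; the real `Hyperparameter` class may be implemented differently (for instance with extra validation).

```python
from collections import namedtuple

# Field names follow the attributes documented above; the defaults
# mirror the signature (n_elements=1, fixed=False).
HyperparameterSketch = namedtuple(
    "HyperparameterSketch",
    ["name", "value_type", "n_elements", "fixed"],
    defaults=[1, False],
)

spec = HyperparameterSketch("zeta", "numeric", fixed=True)
print(spec)
print(spec.n_elements, spec.fixed)
```

Being a tuple, such a specification is immutable and only describes the hyperparameter; the value itself lives on the acquisition function instance.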
- class acquisition_functions.AcquisitionFunctionOperator(k1, k2)
Base class for all AF operators.
- get_params(deep=True)
Get parameters of this acquisition function.
- property hyperparameters
Returns a list of all hyperparameters.
- property theta
Returns the (flattened, log-transformed) non-fixed hyperparameters. Note that theta are typically the log-transformed values of the AF’s hyperparameters as this representation of the search space is more amenable for hyperparameter search, as hyperparameters like length-scales naturally live on a log-scale.
- Returns:
theta – The non-fixed, log-transformed hyperparameters of the acquisition function
- Return type:
ndarray of shape (n_dims,)
- class acquisition_functions.Sum(k1, k2)
Overloads the + operator for two or more AFs. Gradients are combined according to the rules of differentiation.
A sum of an AF can be formed either with another (composite) AF or with a real number.
- __call__(X, gp, eval_gradient=False)
Evaluate the acquisition function.
- class acquisition_functions.Product(k1, k2)
Overloads the * operator for two or more AFs. Gradients are combined according to the rules of differentiation.
A product of an AF can be formed either with another (composite) AF or with a real number.
- __call__(X, gp, eval_gradient=False)
Evaluate the acquisition function.
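The "rules of differentiation" referred to for Sum and Product are the sum rule and the product rule. A minimal NumPy sketch, assuming values of shape (n_samples,) and gradients of shape (n_samples, n_dims) as in the return descriptions on this page; the function names and toy inputs are illustrative and not taken from the module.

```python
import numpy as np


def sum_with_gradient(f, df, g, dg):
    """Value and gradient of f + g (sum rule)."""
    return f + g, df + dg


def product_with_gradient(f, df, g, dg):
    """Value and gradient of f * g (product rule)."""
    # f and g have shape (n_samples,), df and dg have shape (n_samples, n_dims).
    return f * g, df * g[:, None] + f[:, None] * dg


# Two toy functions of X with analytic gradients (illustrative only).
X = np.array([[1.0, 2.0], [0.5, -1.0]])
f, df = X.sum(axis=1), np.ones_like(X)      # f(X) = x0 + x1
g, dg = (X ** 2).sum(axis=1), 2.0 * X       # g(X) = x0^2 + x1^2

value, grad = product_with_gradient(f, df, g, dg)
print(value)
print(grad)
```

The `[:, None]` indexing broadcasts each per-sample value across the n_dims gradient components of that sample.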
- class acquisition_functions.Exponentiation(acquisition_function, exponent)
Defines the exponentiation of an AF with a real number. Gradients are computed according to the rules of differentiation.
Warning
An AF can only be exponentiated with a number and not with another AF:
new_af = old_af ** number
- get_params(deep=True)
Get parameters of this Acquisition function.
- property hyperparameters
Returns a list of all hyperparameters.
- property theta
Returns the (flattened, log-transformed) non-fixed hyperparameters. Note that theta are typically the log-transformed values of the acquisition function’s hyperparameters as this representation of the search space is more amenable for hyperparameter search, as hyperparameters like length-scales naturally live on a log-scale.
- Returns:
theta – The non-fixed, log-transformed hyperparameters of the acquisition function
- Return type:
ndarray of shape (n_dims,)
- __call__(X, gp, eval_gradient=False)
Return the value of the AF at X (A_f(X, gp)) and optionally its gradient.
- Parameters:
X (array-like of shape (n_samples_X, n_features) or list of object) – X-Value at which the Acquisition function shall be evaluated
gp (SKLearn GaussianProcessRegressor) – The GPRegressor (surrogate model) from which to evaluate GP(X) and optionally X_train and Y_train.
eval_gradient (bool, default=False) – Determines whether the gradient with respect to X is calculated.
- Returns:
A_f (array of shape (n_samples_X)) – The value of the acquisition function at point(s) X
A_f_gradient (array of shape (n_samples_X, n_dim)) – The gradient of the Acquisition function with respect to X. Only returned when eval_gradient is True.
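For exponentiation with a real number p, the relevant rule is the power rule combined with the chain rule: the gradient of \(A^p\) is \(p\,A^{p-1}\,\nabla A\). The following sketch applies it to made-up values and gradients; the function name is hypothetical and the class's actual implementation may differ.

```python
import numpy as np


def power_with_gradient(f, df, exponent):
    """Value and gradient of f ** exponent for a real exponent."""
    value = f ** exponent
    grad = exponent * (f ** (exponent - 1.0))[:, None] * df
    return value, grad


f = np.array([4.0, 9.0])                   # AF values at two points
df = np.array([[1.0, 0.0], [2.0, -1.0]])   # their gradients, shape (2, 2)
value, grad = power_with_gradient(f, df, 0.5)
print(value)
print(grad)
```

Note that for non-integer exponents the base values need to be positive, otherwise NumPy returns NaN.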