Convergence#
Convergence policy#
You can define a policy for each convergence criterion:
- 'n': necessary (default if not specified)
- 's': sufficient
- 'ns': necessary and sufficient
- 'm': monitoring (tracked, but will not affect convergence)
If there are no criteria specified as necessary or sufficient, i.e. all criteria are set to monitor, the run will never converge (but it will stop at evaluation budget exhaustion).
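The way the policies combine can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `run_has_converged` and its input format are hypothetical, and it assumes that the run converges when any sufficient criterion is met, or when all necessary criteria are met.

```python
def run_has_converged(criteria):
    """Combine per-criterion results into a single convergence decision.

    `criteria` is a list of (policy, is_met) pairs, where policy is one of
    "n", "s", "ns" or "m" (hypothetical input format, for illustration only).
    """
    necessary = [met for policy, met in criteria if "n" in policy]
    sufficient = [met for policy, met in criteria if "s" in policy]
    # Assumption: a single satisfied sufficient criterion ends the run.
    if any(sufficient):
        return True
    # Assumption: otherwise every necessary criterion must be satisfied.
    # With no necessary criteria (e.g. monitoring only) this is always False,
    # so the run only stops when the evaluation budget is exhausted.
    return bool(necessary) and all(necessary)


print(run_has_converged([("n", True), ("m", False)]))  # True
print(run_has_converged([("n", True), ("n", False)]))  # False
print(run_has_converged([("m", True), ("m", True)]))   # False
```

The last call shows the monitoring-only case: both criteria are met, but neither is allowed to affect convergence.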
This module contains several classes and methods for calculating different convergence criteria, which can be used to determine whether the BO loop has converged.
- exception convergence.ConvergenceCheckError[source]#
Bases: Exception
Exception to be raised when the computation of the convergence criterion failed.
- class convergence.ConvergenceCriterion(prior_bounds, params)[source]#
Bases: object
Base class for all convergence criteria (CCs). A CC quantifies the convergence of the GP surrogate model. If its value falls below a user-set limit, we consider the GP to have converged to the true posterior distribution.
Several CCs are currently supported, which should be versatile enough for most tasks. To specify a custom CC, write a class which inherits from this abstract class. This class needs to be of the format:
    from gpry.convergence import ConvergenceCriterion

    class Custom_convergence_criterion(ConvergenceCriterion):

        def __init__(self, prior_bounds, params):
            # prior_bounds should be a list of prior bounds for all parameters.
            # As a minimal requirement this method should set a number
            # at which the algorithm is considered to have converged.
            # Furthermore this method should initialize empty lists in
            # which we can write the values of the convergence criterion
            # as well as the total number of posterior evaluations and the
            # number of accepted posterior evaluations. This allows for
            # easy tracking/plotting of the convergence.
            self.values = []
            self.n_posterior_evals = []
            self.n_accepted_evals = []
            self.limit = ...  # stores the limit for convergence

        def is_converged(self, gp, gp_2=None, new_X=None, new_y=None,
                         pred_y=None, acquisition=None):
            # Basically a wrapper for the 'criterion_value' method which
            # returns True if the convergence criterion is met and False
            # otherwise.
            ...

        def criterion_value(self, gp, gp_2=None):
            # Returns the value of the convergence criterion. Should also
            # append the current value and the number of posterior
            # evaluations to the corresponding variables.
            ...
- get_history()[source]#
Returns the lists containing the values of the convergence criterion at each step, as well as the total number of evaluations and the number of accepted evaluations.
- abstract is_converged(gp, gp_2=None, new_X=None, new_y=None, pred_y=None, acquisition=None)[source]#
Returns False if the algorithm hasn’t converged and True if it has.
If gp_2 is None the last GP is taken from the model instance.
- abstract criterion_value(gp, gp_2=None)[source]#
Returns the value of the convergence criterion for the current gp. If gp_2 is None the last GP is taken from the model instance.
- property last_value#
Last value of the convergence criterion.
- property is_MPI_aware#
Should return True if the convergence criterion should run in multiple processes using MPI communication.
- property convergence_policy#
Returns a string describing the convergence policy.
- property convergence_policy_MPI#
Returns a string describing the convergence policy (MPI-wrapped!)
- class convergence.DontConverge(prior_bounds=None, params=None)[source]#
Bases: ConvergenceCriterion
This convergence criterion is mainly for testing purposes and always returns False when is_converged is called. Use this criterion together with the max_points and max_accepted keys in the options dict to stop the BO loop at a set number of iterations.
- class convergence.GaussianKL(prior_bounds, params)[source]#
Bases: ConvergenceCriterion
This criterion estimates convergence as stability of the Gaussian-approximated, single-mode KL divergence of surrogate posterior samples between runs.
If a valid GPAcquisition instance is passed to is_converged, mean and covariance will be extracted from it. Otherwise, it estimates the mean and covariance by running an MCMC sampler on the GP (slow). In the second case, this convergence criterion is MPI-aware, such that it will run as many parallel MCMC chains as running processes to improve the estimation of the mean and covariance.
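For reference, the KL divergence between two multivariate Gaussians has a closed form. The symbols below are notation introduced here, not taken from the library: \(\mu_0, \Sigma_0\) and \(\mu_1, \Sigma_1\) are the means and covariance matrices of the two Gaussian approximations, and \(d\) is the dimensionality. Which of the two samples plays which role, and whether a symmetrized variant is used, is not specified in this section.

```latex
D_{\mathrm{KL}}\left(\mathcal{N}_0 \,\|\, \mathcal{N}_1\right)
  = \frac{1}{2}\left[
      \operatorname{tr}\left(\Sigma_1^{-1}\Sigma_0\right)
      + (\mu_1-\mu_0)^{\mathsf{T}}\,\Sigma_1^{-1}\,(\mu_1-\mu_0)
      - d
      + \ln\frac{\det\Sigma_1}{\det\Sigma_0}
    \right]
```

The expression vanishes when both Gaussians coincide, which is why a small and stable value is read as a sign that consecutive surrogate posteriors agree.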
- Parameters:
prior_bounds (list) – List of prior bounds.
params (dict) –
Dict with the following keys:
"limit": Value of the KL divergence for which we consider the algorithm converged (default 2e-2).
"limit_times": Number of consecutive times that the KL divergence must be lower than the limit parameter (default 2).
"n_draws": Number of steps of the MCMC chain (default: ignored in favour of "n_draws_per_dimsquared").
"n_draws_per_dimsquared": idem, as a factor of the dimensionality squared (default 10).
"max_reused": number of times a sample can be reweighted and reused (may miss new high-value regions) (default 4).
- property is_MPI_aware#
Should return True if the convergence criterion should run in multiple processes using MPI communication.
- _get_new_mean_and_cov_from_acquisition(acquisition)[source]#
Tries to extract the mean and covmat from the acquisition object.
Raises AttributeError if the acquisition object is null or does not have samples.
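The roles of the "limit", "limit_times", "n_draws" and "n_draws_per_dimsquared" keys of GaussianKL can be sketched in plain Python. The helper names below are hypothetical and the logic is an assumption based on the parameter descriptions, not the library's code: the criterion is taken as met once the last `limit_times` values all lie below `limit`, and an explicit `n_draws` is assumed to override the per-dimension-squared factor.

```python
def stable_below_limit(values, limit=2e-2, limit_times=2):
    """Return True if the last `limit_times` values are all below `limit`."""
    if limit_times < 1:
        raise ValueError("limit_times must be at least 1")
    if len(values) < limit_times:
        # Not enough history yet to judge stability.
        return False
    return all(v < limit for v in values[-limit_times:])


def n_mcmc_draws(dim, n_draws=None, n_draws_per_dimsquared=10):
    """Number of MCMC steps for a problem of dimensionality `dim`.

    Assumption: an explicit `n_draws` takes precedence; otherwise the
    number of steps scales with the dimensionality squared.
    """
    if n_draws is not None:
        return n_draws
    return n_draws_per_dimsquared * dim**2


kl_history = [0.5, 0.01, 0.005]
print(stable_below_limit(kl_history))        # True
print(stable_below_limit([0.01, 0.5, 0.005]))  # False
print(n_mcmc_draws(4))                       # 160
```

Requiring several consecutive values below the limit guards against a single accidentally small KL divergence ending the run early.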
- class convergence.GaussianKLTrain(prior_bounds, params)[source]#
Bases: GaussianKL
This criterion is not aimed at estimating convergence, but at discarding cases in which an MC sample from the GPR (the last one obtained by the acquisition step, if it exists, otherwise computed on the fly) would not sample the mode mapped by the training set, but instead some overshooting or large baseline plateau. It compares the Gaussian approximation of the last MC sample by the acquisition step with the mean and covariance matrix computed from the training set using probabilities as weights.
Since it is a check in the current iteration, by default it is enough for this criterion to be satisfied in the last step, and with a high tolerance, since it affects extreme cases only.
At the moment, it assumes that there is a single mode.
If a valid GPAcquisition instance is passed to is_converged, mean and covariance will be extracted from it. Otherwise, it estimates the mean and covariance by running an MCMC sampler on the GP (slow). In the second case, this convergence criterion is MPI-aware, such that it will run as many parallel MCMC chains as running processes to improve the estimation of the mean and covariance.
- Parameters:
prior_bounds (list) – List of prior bounds.
params (dict) –
Dict with the following keys:
"limit": Value of the KL divergence for which we consider the algorithm converged (default 2e-2).
"limit_times": Number of consecutive times that the KL divergence must be lower than the limit parameter (default 2).
"n_draws": Number of steps of the MCMC chain (default: ignored in favour of "n_draws_per_dimsquared").
"n_draws_per_dimsquared": idem, as a factor of the dimensionality squared (default 10).
"max_reused": number of times a sample can be reweighted and reused (may miss new high-value regions) (default 4).
- class convergence.TrainAlignment(prior_bounds, params)[source]#
Bases: GaussianKL
This criterion is not aimed at estimating convergence, but at discarding cases in which an MC sample from the GPR (the last one obtained by the acquisition step, if it exists, otherwise computed on the fly) would not sample the mode mapped by the training set, but instead some overshooting or large baseline plateau.
It computes the minimum central confidence level of the mean of the training set with respect to a Gaussian approximation of the surrogate posterior.
Its maximum value is 1, and for the kind of test that this criterion addresses a value below 0.5 should be enough. Its minimum value is clipped at 0.001, to avoid spoiling the convergence plots with numerical noise.
Since it is a check in the current iteration, by default it is enough for this criterion to be satisfied in the last step, and with a high tolerance, since it affects extreme cases only.
At the moment, it assumes that there is a single mode.
If a valid GPAcquisition instance is passed to is_converged, mean and covariance will be extracted from it. Otherwise, it estimates the mean and covariance by running an MCMC sampler on the GP (slow). In the second case, this convergence criterion is MPI-aware, such that it will run as many parallel MCMC chains as running processes to improve the estimation of the mean and covariance.
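The central confidence level described above can be illustrated in one dimension, where it reduces to an error function. This is only a sketch under stated assumptions: the function names are hypothetical, the surrogate posterior is replaced by a 1D Gaussian with known mean and standard deviation, and the criterion is assumed to be fulfilled when the value lies below the limit. The multi-dimensional computation used by the library is not reproduced here.

```python
import math


def central_confidence_level_1d(train_mean, post_mean, post_std, floor=0.001):
    """Probability mass of the smallest central interval of a 1D Gaussian
    (post_mean, post_std) that contains `train_mean`, clipped from below."""
    z = abs(train_mean - post_mean) / post_std
    # For a 1D Gaussian, P(|X - mean| <= z * std) = erf(z / sqrt(2)).
    level = math.erf(z / math.sqrt(2.0))
    # Clip small values, mirroring the 0.001 floor mentioned in the text.
    return max(level, floor)


def training_is_aligned(level, limit=0.5):
    """Assumption: the check passes when the level is below the limit."""
    return level < limit


close = central_confidence_level_1d(0.1, 0.0, 1.0)
far = central_confidence_level_1d(3.0, 0.0, 1.0)
print(training_is_aligned(close))  # True
print(training_is_aligned(far))    # False
```

A training mean three standard deviations away from the surrogate mean gives a level close to 1, which is the kind of extreme mismatch this criterion is meant to reject.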
- Parameters:
prior_bounds (list) – List of prior bounds.
params (dict) –
Dict with the following keys:
"frac_training": fraction, starting from the latest, of the training set to be used (default: 1)
"limit": Probability mass within the minimum CL enclosing the training mean (default 0.5).
"limit_times": Number of consecutive times that the criterion must be fulfilled (default 1).
"n_draws": Number of steps of the MCMC chain (default: ignored in favour of "n_draws_per_dimsquared").
"n_draws_per_dimsquared": idem, as a factor of the dimensionality squared (default 10).
"max_reused": number of times a sample can be reweighted and reused (may miss new high-value regions) (default 4).
- class convergence.CorrectCounter(prior_bounds, params)[source]#
Bases: ConvergenceCriterion
This convergence criterion determines convergence by requiring that the GP’s predictions of the posterior values in the last \(n\) steps are correct up to a certain threshold. This condition is fulfilled if
\[|f(x)-\overline{f}_{\mathrm{GP}}(x)| < (f_{\mathrm{max}}(x) - f(x)) \cdot r + a\]
where the parameters \(r\) and \(a\) are the relative and absolute tolerances controlled by the reltol and abstol parameters. We set the “value” of the criterion to be the maximum difference of the GP prediction and the true posterior in the last batch of accepted evaluations. Furthermore this class contains an internal list thres which contains the threshold values corresponding to this difference.
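The condition above can be checked numerically with a few lines of Python. This is a hedged sketch rather than the library's code: the function names are hypothetical, the tolerances are plain floats (the dimension-dependent scaling of string-valued tolerances is not reproduced), and \(f_{\mathrm{max}}\) is passed in as a single number.

```python
def prediction_is_correct(f_true, f_gp, f_max, reltol=0.01, abstol=0.01):
    """Check |f - f_GP| < (f_max - f) * r + a for one evaluation."""
    threshold = (f_max - f_true) * reltol + abstol
    return abs(f_true - f_gp) < threshold


def has_enough_correct(true_vals, gp_vals, f_max, n_correct=4,
                       reltol=0.01, abstol=0.01):
    """Return True if the last `n_correct` predictions were all correct."""
    streak = 0
    for f_true, f_gp in zip(true_vals, gp_vals):
        if prediction_is_correct(f_true, f_gp, f_max, reltol, abstol):
            streak += 1
        else:
            # A single bad prediction resets the count of consecutive hits.
            streak = 0
    return streak >= n_correct


true_vals = [-1.0, -1.0, -1.0, -1.0]
print(has_enough_correct(true_vals, [-1.005] * 4, f_max=0.0))                 # True
print(has_enough_correct(true_vals, [-1.005, -1.005, -1.005, -2.0], f_max=0.0))  # False
```

Because the threshold grows with the distance to the maximum, predictions far from the mode are allowed a larger error than predictions near it.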
- Parameters:
prior_bounds (list) – List of prior bounds.
params (dict) –
Dict with the following keys:
"n_correct": Number of consecutive samples which need to be under the threshold (default max(4, 0.5*N_d))
"reltol": Relative tolerance parameter (default 0.01)
"abstol": Absolute tolerance parameter (default "0.01s")
"verbose": Verbosity
Note
The "reltol" and "abstol" parameters can be passed as a string ending with either "l" or "s". In this case the value of the parameter is scaled with the number of dimensions as either linear ("l") or square ("s") of the depth of the \(\chi^2\) of the \(1\sigma\) contour, assuming a Gaussian distribution.
- is_converged(gp, gp_2=None, new_X=None, new_y=None, pred_y=None, acquisition=None)[source]#
Returns False if the algorithm hasn’t converged and True if it has.
If gp_2 is None the last GP is taken from the model instance.
- criterion_value(gp, gp_2=None, new_X=None, new_y=None, pred_y=None)[source]#
Returns the value of the convergence criterion for the current gp. If gp_2 is None the last GP is taken from the model instance.
- property limit#
Limit for the criterion value (changes along iterations for this criterion).