Reference (Public API)
This section documents the exported functions, types, and other public symbols of AbstractBayesOpt.jl.
Bayesian Optimisation loop
AbstractBayesOpt.BOStruct — Type
```julia
BOStruct{F,M<:AbstractSurrogate,A<:AbstractAcquisition,D<:AbstractDomain,X,Y,T}(
    func::F,
    acq::A,
    model::M,
    domain::D,
    xs::Vector{X},
    ys::Vector{Y},
    ys_non_std::Vector{Y},
    max_iter::Int,
    iter::Int,
    noise::T,
    flag::Bool,
)
```

A structure holding all components of the Bayesian Optimization problem.

Attributes:
- `func::F`: The target function to be optimized.
- `acq::A`: The acquisition function guiding the optimization.
- `model::M`: The surrogate model (e.g., a Gaussian process).
- `domain::D`: The domain over which to optimize.
- `xs::Vector{X}`: A vector of input training points.
- `ys::Vector{Y}`: A vector of corresponding output training values.
- `ys_non_std::Vector{Y}`: A vector of output training values before standardization.
- `max_iter::Int`: Maximum number of iterations for the optimization.
- `iter::Int`: Current iteration number.
- `noise::T`: Noise level in the observations.
- `flag::Bool`: A flag indicating that the optimization should stop due to issues such as ill-conditioning.
AbstractBayesOpt.optimize — Function
```julia
optimize(
    BO::BOStruct;
    standardize::Union{String,Nothing}="mean_scale",
    hyper_params::Union{String,Nothing}="all",
    num_restarts_HP::Int=1,
    ad_backend::Symbol=:forward,
)
```

This function implements the EGO framework: while some stopping criterion is not met, (1) optimize the acquisition function to obtain the new best candidate, (2) query the target function `f`, and (3) update the GP and the overall optimization state. Returns the best solution found.

Arguments:
- `BO::BOStruct`: The Bayesian Optimization structure.
- `standardize`: Specifies how to standardize the outputs.
  - If `"mean_scale"`, standardize by removing the mean and scaling by the standard deviation.
  - If `"scale_only"`, only scale the outputs without centering (for the case where a non-zero mean function with the empirical mean is set).
  - If `"mean_only"`, only remove the mean without scaling.
  - If `nothing`, do not standardize the outputs.
- `hyper_params`: Specifies how to handle hyperparameters.
  - If `"all"`, re-optimize all hyperparameters every 10 iterations.
  - If `"lengthscaleonly"`, only optimize the lengthscale.
  - If `nothing`, do not re-optimize hyperparameters.
- `num_restarts_HP::Int`: Number of random restarts for hyperparameter optimization.
- `ad_backend::Symbol`: The automatic differentiation backend used for hyperparameter optimization.

Returns:
- `BO::BOStruct`: The updated Bayesian Optimization problem after optimization.
- `acqf_list::Vector`: Acquisition function values at each iteration.
- `standard_params::Tuple`: The mean and standard deviation used for standardization.
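As an illustrative sketch (not verbatim package code), a typical loop can be assembled from the documented signatures. The `GP` prior comes from AbstractGPs.jl; the positional `BOStruct` call and the initial design below are assumptions based on the constructor documented above:

```julia
using AbstractBayesOpt, AbstractGPs

# Hypothetical target: minimize a 1D quadratic
f(x) = (x[1] - 0.3)^2

domain = ContinuousDomain([-1.0], [1.0])
prior  = GP(0.0, Matern52Kernel())       # zero-mean prior from AbstractGPs
model  = StandardGP(prior, 1e-6, nothing) # noise variance 1e-6, not conditioned yet

xs = [[-0.5], [0.0], [0.5]]              # small initial design
ys = f.(xs)

acq = ExpectedImprovement(0.01, minimum(ys))

# max_iter = 30, iter = 0, noise = 1e-6, flag = false
BO = BOStruct(f, acq, model, domain, xs, ys, copy(ys), 30, 0, 1e-6, false)
BO, acqf_list, standard_params = optimize(BO; standardize="mean_scale")
```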
Abstract Interface
AbstractBayesOpt.AbstractAcquisition — Type
```julia
AbstractAcquisition
```

Abstract type for acquisition functions used in Bayesian optimization.

Concrete implementations should subtype this and implement the following methods:
- `(acq::AbstractAcquisition)(surrogate::AbstractSurrogate, x::AbstractVector)`: Evaluate the acquisition function at point `x` using the surrogate model. This should also work for a single real input `x::Real` when working in 1D, in which case it is treated as a one-dimensional input vector.
- `update(acq::AbstractAcquisition, ys::AbstractVector, model::AbstractSurrogate)`: Update the acquisition function with new observations `ys` and the current surrogate model.
- `Base.copy(acq::AbstractAcquisition)`: Create a copy of the acquisition function.
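A minimal custom acquisition might look like the following sketch. The type `MyLCB` is hypothetical, and the surrogate calls assume the `posterior_mean`/`posterior_var` interface documented under `AbstractSurrogate`:

```julia
using AbstractBayesOpt

# Hypothetical lower-confidence-bound-style acquisition (for minimization)
struct MyLCB{T} <: AbstractAcquisition
    κ::T   # exploration weight
end

function (acq::MyLCB)(surrogate::AbstractSurrogate, x::AbstractVector)
    μ  = posterior_mean(surrogate, x)
    σ² = posterior_var(surrogate, x)
    # Negated LCB, so that maximizing the acquisition minimizes the bound
    return -(μ - acq.κ * sqrt(max(σ², 0.0)))
end

# No internal state depends on the observations, so updating is a no-op
AbstractBayesOpt.update(acq::MyLCB, ys::AbstractVector, model::AbstractSurrogate) = acq
Base.copy(acq::MyLCB) = MyLCB(acq.κ)
```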
AbstractBayesOpt.AbstractDomain — Type
```julia
AbstractDomain
```

An abstract type for defining the domain over which the optimization is performed.

Concrete implementations should subtype this, define the necessary properties:
- `lower`: The lower bounds of the domain.
- `upper`: The upper bounds of the domain.

and provide a constructor. Other methods can be added as needed depending on the use case.
AbstractBayesOpt.AbstractSurrogate — Type
```julia
AbstractSurrogate
```

Abstract type for surrogate models used in Bayesian optimization.

Concrete implementations should subtype this and implement the following methods:
- `update(model::AbstractSurrogate, xs::AbstractVector, ys::AbstractVector)`: Update the surrogate model with new data points `xs` and corresponding observations `ys`.
- `posterior_mean(surrogate::AbstractSurrogate, x::AbstractVector)`: Compute the posterior mean of the surrogate model at point `x`.
- `posterior_var(surrogate::AbstractSurrogate, x::AbstractVector)`: Compute the posterior variance of the surrogate model at point `x`.
- `nlml(surrogate::AbstractSurrogate, params::AbstractVector, xs::AbstractVector, ys::AbstractVector)`: Compute the negative log marginal likelihood of the surrogate model given hyperparameters `params`, input data `xs`, and observations `ys`.

If you wish to standardize the outputs, you can also implement:
- `std_y(model::AbstractSurrogate)`: Get the standard deviation used for standardizing the outputs in the surrogate model.
- `get_mean_std(model::AbstractSurrogate)`: Get the mean and standard deviation used for standardizing the outputs in the surrogate model.

Other methods can be added as needed depending on the use case; see the implementations of `StandardGP` and `GradientGP` for examples.
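A skeleton for a new surrogate could follow this sketch. All names other than the interface methods (`MySurrogate`, `refit`, `predict_mean`, `predict_var`) are hypothetical placeholders:

```julia
using AbstractBayesOpt

# Hypothetical surrogate wrapping some external regression model
struct MySurrogate{M} <: AbstractSurrogate
    inner::M   # the underlying fitted model
end

function AbstractBayesOpt.update(model::MySurrogate, xs::AbstractVector, ys::AbstractVector)
    # Refit the inner model on (xs, ys); `refit` is a placeholder
    return MySurrogate(refit(model.inner, xs, ys))
end

# `predict_mean` / `predict_var` are placeholders for the wrapped model's predictors
AbstractBayesOpt.posterior_mean(model::MySurrogate, x::AbstractVector) = predict_mean(model.inner, x)
AbstractBayesOpt.posterior_var(model::MySurrogate, x::AbstractVector)  = predict_var(model.inner, x)

function AbstractBayesOpt.nlml(model::MySurrogate, params::AbstractVector,
                               xs::AbstractVector, ys::AbstractVector)
    # Return the negative log marginal likelihood for the given hyperparameters
    return 0.0   # placeholder body
end
```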
Surrogates
AbstractBayesOpt.StandardGP — Type
```julia
StandardGP{T}(gp::AbstractGPs.GP, noise_var::T, gpx::Union{Nothing,AbstractGPs.PosteriorGP}) <: AbstractSurrogate
```

Implementation of the abstract structures for the standard GP.

Attributes:
- `gp::AbstractGPs.GP`: The underlying Gaussian process model.
- `noise_var::T`: The noise variance of the observations.
- `gpx::Union{Nothing,AbstractGPs.PosteriorGP}`: The posterior GP after conditioning on data; `nothing` if not conditioned yet.
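A typical construction, sketched with an AbstractGPs.jl prior (kernel choice and noise level are illustrative):

```julia
using AbstractBayesOpt, AbstractGPs

# Zero-mean GP prior with a Matern 5/2 kernel (re-exported by AbstractGPs)
prior = GP(Matern52Kernel())

# Wrap it as a surrogate: noise variance 1e-6, no data conditioned yet (gpx = nothing)
model = StandardGP(prior, 1e-6, nothing)
```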
AbstractBayesOpt.GradientGP — Type
```julia
struct GradientGP{T, G<:AbstractGPs.GP} <: AbstractSurrogate
```

Gradient-enhanced Gaussian process surrogate model.

Attributes:
- `gp::G`: The underlying Gaussian process model.
- `noise_var::T`: The noise variance of the observations.
- `p::Int`: The number of outputs (1 for the function value + d for the gradients).
- `gpx::Union{Nothing, AbstractGPs.PosteriorGP}`: The posterior GP after conditioning on data; `nothing` if not conditioned yet.

This relies on multi-output GPs (MOGP) from AbstractGPs.jl and KernelFunctions.jl.
AbstractBayesOpt.posterior_mean — Function
```julia
posterior_mean(model::StandardGP, x::X) where {X}
```

Compute the posterior mean of the GP at a new input point.

Arguments:
- `model::StandardGP`: The GP model.
- `x::X`: A new input point where the prediction is to be made.

Returns:
- `mean`: The posterior mean prediction at the input point.

```julia
posterior_mean(model::StandardGP, x::AbstractVector)
```

Compute the posterior mean of the GP at a set of new input points.

Arguments:
- `model::StandardGP`: The GP model.
- `x::AbstractVector`: A vector of new input points where predictions are to be made.

Returns:
- `mean`: The posterior mean predictions at the input points.

```julia
posterior_mean(model::GradientGP, x)
```

Compute the function-value mean predictions of the GP model at new input points.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `mean::Vector`: The mean predictions (function value only).
AbstractBayesOpt.posterior_var — Function
```julia
posterior_var(model::StandardGP, x::X) where {X}
```

Compute the posterior variance of the GP at a new input point.

Arguments:
- `model::StandardGP`: The GP model.
- `x::X`: A new input point where the prediction is to be made.

Returns:
- `var`: The posterior variance prediction at the input point.

```julia
posterior_var(model::StandardGP, x::AbstractVector)
```

Compute the posterior variance of the GP at a set of new input points.

Arguments:
- `model::StandardGP`: The GP model.
- `x::AbstractVector`: A vector of new input points where predictions are to be made.

Returns:
- `var`: The posterior variance predictions at the input points.

```julia
posterior_var(model::GradientGP, x)
```

Compute the function-value variance predictions of the GP model at new input points.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `var::Vector`: The variance predictions (function value only).
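After conditioning a surrogate on data via `update`, the two predictors are queried in the same way. A hypothetical sketch with made-up data:

```julia
using AbstractBayesOpt, AbstractGPs

model = StandardGP(GP(Matern52Kernel()), 1e-6, nothing)

xs = [[-0.5], [0.0], [0.5]]          # illustrative inputs
ys = [0.64, 0.09, 0.04]              # illustrative observations
model = update(model, xs, ys)        # condition the GP on the data

x_new = [0.25]
μ  = posterior_mean(model, x_new)    # posterior mean at x_new
σ² = posterior_var(model, x_new)     # posterior variance at x_new
```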
AbstractBayesOpt.nlml — Function
```julia
nlml(model::StandardGP, params, xs::AbstractVector, ys::AbstractVector)
```

Compute the negative log marginal likelihood (NLML) of the GP model given hyperparameters.

Arguments:
- `model::StandardGP`: The GP model.
- `params`: A vector containing the log lengthscale and log scale parameters.
- `xs::AbstractVector`: The input data points.
- `ys::AbstractVector`: The observed function values.

Returns:
- `nlml::Float64`: The negative log marginal likelihood of the model.

```julia
nlml(model::GradientGP, params, xs::AbstractVector, ys::AbstractVector)
```

Compute the negative log marginal likelihood (NLML) of the GP model given hyperparameters.

Arguments:
- `model::GradientGP`: The GP model.
- `params`: Parameters containing the log lengthscale and log scale.
- `xs::AbstractVector`: The input data points.
- `ys::AbstractVector`: The observed function values and gradients.

Returns:
- `nlml::Float64`: The negative log marginal likelihood of the model.
Kernels
AbstractBayesOpt.ApproxMatern52Kernel — Type
```julia
ApproxMatern52Kernel{M}(metric::M) <: KernelFunctions.SimpleKernel
```

Approximate Matern 5/2 kernel using a second-order Taylor expansion around d = 0.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
AbstractBayesOpt.ApproxMatern72Kernel — Type
```julia
ApproxMatern72Kernel{M}(metric::M) <: KernelFunctions.SimpleKernel
```

Approximate Matern 7/2 kernel using a second-order Taylor expansion around d = 0.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
AbstractBayesOpt.ADMatern52Kernel — Type
```julia
ADMatern52Kernel{M} <: KernelFunctions.SimpleKernel
```

Matern 5/2 kernel with custom differentiation rules for gradient computations.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
AbstractBayesOpt.ADMatern72Kernel — Type
```julia
ADMatern72Kernel{M} <: KernelFunctions.SimpleKernel
```

Matern 7/2 kernel with custom differentiation rules for gradient computations.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
GradientGP-related functions
AbstractBayesOpt.gradConstMean — Type
```julia
gradConstMean{V}(c::V)
```

Custom mean function for the GradientGP model. Returns a constant per-output mean across multi-output inputs (function value + gradients). The first element corresponds to the function value, the following ones to the gradient outputs.

Use `gradConstMean([μ; zeros(d)])` to set a constant prior mean `μ` for the function value and zero for the gradients.

Attributes:
- `c::V`: A vector of constants for each output (function value + gradients).
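For example, following the `[μ; zeros(d)]` pattern above (the values are illustrative):

```julia
using AbstractBayesOpt

d = 2        # input dimension, so outputs are 1 function value + 2 gradients
μ = 1.5      # constant prior mean for the function value

# Constant mean μ for the function value, zero mean for each gradient output
mean_fn = gradConstMean([μ; zeros(d)])
```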
AbstractBayesOpt.gradKernel — Type
```julia
gradKernel{K}(base_kernel::K) <: MOKernel
```

Custom kernel function for the GradientGP model that handles both function values and gradients.

Arguments:
- `base_kernel::KernelFunctions.Kernel`: The base kernel function to be used.

Returns:
- `gradKernel`: An instance of the custom gradient kernel function.
AbstractBayesOpt.posterior_grad_mean — Function
```julia
posterior_grad_mean(model::GradientGP, x)
```

Compute the mean predictions of the GP model at new input points, including gradients.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `mean::Vector`: The mean predictions.
AbstractBayesOpt.posterior_grad_var — Function
```julia
posterior_grad_var(model::GradientGP, x)
```

Compute the variance predictions of the GP model at new input points, including gradients.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `var::Vector`: The variance predictions.
AbstractBayesOpt.posterior_grad_cov — Function
```julia
posterior_grad_cov(model::GradientGP, x)
```

Compute the covariance matrix of the GP model at new input points, including gradients.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `cov::Matrix`: The covariance matrix of the predictions.
Acquisition Functions
AbstractBayesOpt.EnsembleAcquisition — Type
```julia
EnsembleAcquisition(weights::Vector{Float64}, acqs::Vector{AbstractAcquisition}) <: AbstractAcquisition
```

An ensemble acquisition function that combines multiple acquisition functions, each weighted by a specified factor.

Attributes:
- `weights::Vector{Float64}`: A vector of non-negative weights for each acquisition function. The weights are normalized to sum to 1.
- `acquisitions::Vector{AbstractAcquisition}`: A vector of acquisition functions to be combined.

Remark: All weights must be non-negative.
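For instance, a 70/30 blend of Expected Improvement and UCB could be sketched as follows (the parameter values are illustrative):

```julia
using AbstractBayesOpt

ei  = ExpectedImprovement(0.01, 0.42)   # ξ = 0.01, best observed value 0.42 (made up)
ucb = UpperConfidenceBound(2.0)         # β = 2.0

# Weights are normalized to sum to 1 internally
acq = EnsembleAcquisition([0.7, 0.3], AbstractAcquisition[ei, ucb])
```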
AbstractBayesOpt.ExpectedImprovement — Type
```julia
ExpectedImprovement{Y}(ξ::Y, best_y::Y) <: AbstractAcquisition
```

Expected Improvement acquisition function.

Attributes:
- `ξ::Y`: Exploration parameter.
- `best_y::Y`: Best observed objective value.

References: Jones et al., 1998.
AbstractBayesOpt.GradientNormUCB — Type
```julia
GradientNormUCB{Y}(β::Y) <: AbstractAcquisition
```

Acquisition function implementing the squared 2-norm of the gradient with an Upper Confidence Bound (UCB) exploration strategy.

Attributes:
- `β::Y`: Exploration-exploitation balance parameter.

References: Derived by Van Dieren, E. (open to earlier references if any exist). Originally proposed by Makrygiorgos et al., 2023, adapted here to the squared 2-norm of the gradient.
AbstractBayesOpt.ProbabilityImprovement — Type
```julia
ProbabilityImprovement{Y}(ξ::Y, best_y::Y) <: AbstractAcquisition
```

Probability of Improvement acquisition function.

Attributes:
- `ξ::Y`: Exploration parameter.
- `best_y::Y`: Best observed objective value.

References: Kushner, 1964.
AbstractBayesOpt.UpperConfidenceBound — Type
```julia
UpperConfidenceBound{Y}(β::Y) <: AbstractAcquisition
```

Upper Confidence Bound (UCB) acquisition function.

Attributes:
- `β::Y`: Exploration-exploitation balance parameter.

References: Srinivas et al., 2012.
Domains
Continuous domain
AbstractBayesOpt.ContinuousDomain — Type
```julia
ContinuousDomain(lower::Vector{Float64}, upper::Vector{Float64}, bounds::Vector{Tuple{Float64,Float64}}) <: AbstractDomain
```

A concrete implementation of AbstractDomain for continuous domains.

Attributes:
- `lower::Vector{Float64}`: The lower bounds of the domain.
- `upper::Vector{Float64}`: The upper bounds of the domain.
- `bounds::Vector{Tuple{Float64,Float64}}`: A vector of (lower, upper) tuples, one per dimension.

Constructor:
- `ContinuousDomain(lower::Vector{Float64}, upper::Vector{Float64})`: Creates a `ContinuousDomain` instance given lower and upper bounds. Performs sanity checks to ensure the bounds are valid.
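Using the two-argument constructor, a 2D box domain [-1, 1] × [0, 5] would be built as:

```julia
using AbstractBayesOpt

# `bounds` is derived from `lower` and `upper` by the constructor
domain = ContinuousDomain([-1.0, 0.0], [1.0, 5.0])
```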