# Reference (Public API)

This section documents the functions and types exported by AbstractBayesOpt.jl.
## Bayesian Optimisation loop
### AbstractBayesOpt.BOStruct — Type

```julia
BOStruct{F,M<:AbstractSurrogate,A<:AbstractAcquisition,D<:AbstractDomain,X,Y,T}(
    func::F,
    acq::A,
    model::M,
    domain::D,
    xs::Vector{X},
    ys::Vector{Y},
    ys_non_std::Vector{Y},
    max_iter::Int,
    iter::Int,
    noise::T,
    flag::Bool,
)
```
A structure to hold all components of the Bayesian optimization problem.

Attributes:

- `func::F`: The target function to be optimized.
- `acq::A`: The acquisition function guiding the optimization.
- `model::M`: The surrogate model (e.g., a Gaussian process).
- `domain::D`: The domain over which to optimize.
- `xs::Vector{X}`: A vector of input training points.
- `ys::Vector{Y}`: A vector of corresponding output training values.
- `ys_non_std::Vector{Y}`: A vector of output training values before standardization.
- `max_iter::Int`: Maximum number of iterations for the optimization.
- `iter::Int`: Current iteration number.
- `noise::T`: Noise level in the observations.
- `flag::Bool`: A flag indicating that the optimization should stop due to issues such as ill-conditioning.
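As a rough sketch of how these fields fit together, the snippet below assembles a `BOStruct` for a 1-D toy problem. The kernel, the acquisition settings, and the use of the positional constructor in the field order above are illustrative assumptions, not prescriptions:

```julia
using AbstractBayesOpt, AbstractGPs

f(x) = sum(abs2, x)                          # toy target to minimize
domain = ContinuousDomain([-2.0], [2.0])
model = StandardGP(GP(Matern52Kernel()), 1e-6, nothing)  # prior only, not conditioned yet

xs = [[-1.0], [0.5], [1.5]]                  # initial design points
ys = f.(xs)
acq = ExpectedImprovement(0.01, minimum(ys))

# Positional constructor following the field order above: iter starts at 0,
# ys_non_std mirrors ys before any standardization, flag starts false.
bo = BOStruct(f, acq, model, domain, xs, ys, copy(ys), 50, 0, 1e-6, false)
```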
### AbstractBayesOpt.optimize — Function

```julia
optimize(
    BO::BOStruct;
    standardize::Union{String,Nothing}="mean_scale",
    hyper_params::Union{String,Nothing}="all",
    num_restarts_HP::Int=1,
)
```
This function implements the EGO (Efficient Global Optimization) framework: while the stopping criterion is not met, (1) optimize the acquisition function to obtain the next candidate, (2) query the target function `f` at that candidate, and (3) update the GP and the overall optimization state. Returns the best solution found.
Arguments:

- `BO::BOStruct`: The Bayesian optimization structure.
- `standardize`: Specifies how to standardize the outputs.
    - If `"mean_scale"`, standardize by removing the mean and scaling by the standard deviation.
    - If `"scale_only"`, only scale the outputs without centering (useful when a non-zero mean function set to the empirical mean is used).
    - If `"mean_only"`, only remove the mean without scaling.
    - If `nothing`, do not standardize the outputs.
- `hyper_params`: Specifies how to handle hyperparameters.
    - If `"all"`, re-optimize all hyperparameters every 10 iterations.
    - If `"lengthscaleonly"`, only optimize the lengthscale.
    - If `nothing`, do not re-optimize hyperparameters.
- `num_restarts_HP::Int`: Number of random restarts for hyperparameter optimization.

Returns:

- `BO::BOStruct`: The updated Bayesian optimization problem after optimization.
- `acqf_list::Vector`: Acquisition function values at each iteration.
- `standard_params::Tuple`: The mean and standard deviation used for standardization.
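A minimal sketch of running the loop on the `bo` object from the previous example; the keyword values are simply the documented options:

```julia
bo_out, acqf_list, (μ, σ) = optimize(bo;
    standardize="mean_scale",   # remove mean, scale by std
    hyper_params="all",         # re-optimize hyperparameters every 10 iterations
    num_restarts_HP=5,
)

# Recover the best (un-standardized) observation found so far.
best = argmin(bo_out.ys_non_std)
bo_out.xs[best], bo_out.ys_non_std[best]
```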
## Abstract Interface
### AbstractBayesOpt.AbstractAcquisition — Type

```julia
AbstractAcquisition
```

Abstract type for acquisition functions used in Bayesian optimization.

Concrete implementations should subtype this and implement the following methods:

- `(acq::AbstractAcquisition)(surrogate::AbstractSurrogate, x::AbstractVector)`: Evaluate the acquisition function at point `x` using the surrogate model. This should also work for a single real input `x::Real` when working in 1D, in which case it is treated as a one-dimensional input vector via the generic fallback method.
- `update(acq::AbstractAcquisition, ys::AbstractVector, model::AbstractSurrogate)`: Update the acquisition function with new observations `ys` and the current surrogate model.
- `Base.copy(acq::AbstractAcquisition)`: Create a copy of the acquisition function.
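To illustrate the contract, here is a hypothetical pure-exploration acquisition that simply returns the posterior variance; `PureExploration` is not part of the package:

```julia
using AbstractBayesOpt

# Hypothetical acquisition: prefer points with the largest posterior variance.
struct PureExploration <: AbstractAcquisition end

(acq::PureExploration)(surrogate::AbstractSurrogate, x::AbstractVector) =
    posterior_var(surrogate, x)

# Nothing to refresh between iterations for this criterion.
AbstractBayesOpt.update(acq::PureExploration, ys::AbstractVector, model::AbstractSurrogate) = acq
Base.copy(acq::PureExploration) = PureExploration()
```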
### AbstractBayesOpt.AbstractDomain — Type

```julia
AbstractDomain
```

An abstract type for defining the domain over which the optimization is performed.

Concrete implementations should subtype this, provide a constructor, and define the following properties:

- `lower`: The lower bounds of the domain.
- `upper`: The upper bounds of the domain.

Other methods can be added as needed depending on the use case.
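As a sketch, a custom box domain only needs the two bound properties and a validating constructor; `IntegerBoxDomain` is a hypothetical example, not part of the package:

```julia
using AbstractBayesOpt

# Hypothetical integer box domain exposing the required lower/upper fields.
struct IntegerBoxDomain <: AbstractDomain
    lower::Vector{Int}
    upper::Vector{Int}
    function IntegerBoxDomain(lower::Vector{Int}, upper::Vector{Int})
        length(lower) == length(upper) || throw(ArgumentError("dimension mismatch"))
        all(lower .<= upper) || throw(ArgumentError("lower must not exceed upper"))
        return new(lower, upper)
    end
end
```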
### AbstractBayesOpt.AbstractSurrogate — Type

```julia
AbstractSurrogate
```

Abstract type for surrogate models used in Bayesian optimization.

Concrete implementations should subtype this and implement the following methods:

- `update(model::AbstractSurrogate, xs::AbstractVector, ys::AbstractVector)`: Update the surrogate model with new data points `xs` and corresponding observations `ys`.
- `posterior_mean(surrogate::AbstractSurrogate, x::AbstractVector)`: Compute the posterior mean of the surrogate model at point `x`.
- `posterior_var(surrogate::AbstractSurrogate, x::AbstractVector)`: Compute the posterior variance of the surrogate model at point `x`.
- `nlml(surrogate::AbstractSurrogate, params::AbstractVector, xs::AbstractVector, ys::AbstractVector)`: Compute the negative log marginal likelihood of the surrogate model given hyperparameters `params`, input data `xs`, and observations `ys`.

If you wish to standardize the outputs, you can also implement:

- `std_y(model::AbstractSurrogate)`: Get the standard deviation used for standardizing the outputs in the surrogate model.
- `get_mean_std(model::AbstractSurrogate)`: Get the mean and standard deviation used for standardizing the outputs in the surrogate model.

Other methods can be added as needed depending on the use case; see the implementations of `StandardGP` and `GradientGP` for examples.
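The deliberately naive sketch below shows where each required method slots in; `ToySurrogate` (nearest-neighbour mean, constant variance, no hyperparameters) is hypothetical and not a useful model:

```julia
using AbstractBayesOpt

struct ToySurrogate{X,Y} <: AbstractSurrogate
    xs::Vector{X}
    ys::Vector{Y}
end

# Append new data; a real surrogate would also recondition itself here.
AbstractBayesOpt.update(m::ToySurrogate, xs::AbstractVector, ys::AbstractVector) =
    ToySurrogate(vcat(m.xs, xs), vcat(m.ys, ys))

# Nearest-neighbour "posterior mean".
AbstractBayesOpt.posterior_mean(m::ToySurrogate, x::AbstractVector) =
    m.ys[argmin([sum(abs2, x .- xi) for xi in m.xs])]

# Constant "uncertainty" and no hyperparameters to fit.
AbstractBayesOpt.posterior_var(m::ToySurrogate, x::AbstractVector) = 1.0
AbstractBayesOpt.nlml(m::ToySurrogate, params, xs, ys) = 0.0
```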
## Surrogates
### AbstractBayesOpt.StandardGP — Type

```julia
StandardGP{T}(gp::AbstractGPs.GP, noise_var::T, gpx::Union{Nothing,AbstractGPs.PosteriorGP}) <: AbstractSurrogate
```

Implementation of the abstract surrogate interface for the standard GP.

Attributes:

- `gp::AbstractGPs.GP`: The underlying Gaussian process model.
- `noise_var::T`: The noise variance of the observations.
- `gpx::Union{Nothing,AbstractGPs.PosteriorGP}`: The posterior GP after conditioning on data; `nothing` if not conditioned yet.
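A sketch of wrapping an AbstractGPs.jl prior, assuming the three-argument constructor shown above; pass `nothing` for the posterior until the model is conditioned via `update`:

```julia
using AbstractBayesOpt, AbstractGPs, KernelFunctions

k = with_lengthscale(Matern52Kernel(), 0.5)
model = StandardGP(GP(k), 1e-6, nothing)   # prior GP, noise variance, no posterior yet
```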
### AbstractBayesOpt.GradientGP — Type

```julia
GradientGP{T}(gp::AbstractGPs.GP, noise_var::T, p::Int, gpx::Union{Nothing,AbstractGPs.PosteriorGP}) <: AbstractSurrogate
```

Implementation of the abstract surrogate interface for the gradient-enhanced GP. This relies on the multi-output GP (MOGP) machinery from AbstractGPs.jl and KernelFunctions.jl.

Attributes:

- `gp::AbstractGPs.GP`: The underlying Gaussian process model.
- `noise_var::T`: The noise variance of the observations.
- `p::Int`: The number of outputs (1 for the function value + d for the gradients).
- `gpx::Union{Nothing,AbstractGPs.PosteriorGP}`: The posterior GP after conditioning on data; `nothing` if not conditioned yet.
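A sketch for a d-dimensional problem, assuming `gradKernel` and `gradConstMean` (documented below) plug into `AbstractGPs.GP` as kernel and mean function; the smooth `SqExponentialKernel` is an illustrative base kernel choice:

```julia
using AbstractBayesOpt, AbstractGPs, KernelFunctions

d = 2                                        # input dimension
k = gradKernel(SqExponentialKernel())        # MO kernel over values + gradients
m = gradConstMean(zeros(d + 1))              # zero prior mean for value and gradients
model = GradientGP(GP(m, k), 1e-6, d + 1, nothing)
```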
### AbstractBayesOpt.posterior_mean — Function

```julia
posterior_mean(model::StandardGP, x::X) where {X}
```

Compute the posterior mean of the GP at a new input point.

Arguments:

- `model::StandardGP`: The GP model.
- `x::X`: A new input point where the prediction is to be made.

Returns:

- `mean`: The posterior mean prediction at the input point.

```julia
posterior_mean(model::StandardGP, x::AbstractVector)
```

Compute the posterior mean of the GP at a set of new input points.

Arguments:

- `model::StandardGP`: The GP model.
- `x::AbstractVector`: A vector of new input points where predictions are to be made.

Returns:

- `mean`: The posterior mean predictions at the input points.

```julia
posterior_mean(model::GradientGP, x)
```

Compute the function-value mean predictions of the GP model at new input points.

Arguments:

- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:

- `mean::Vector`: The mean predictions (function value only).
### AbstractBayesOpt.posterior_var — Function

```julia
posterior_var(model::StandardGP, x::X) where {X}
```

Compute the posterior variance of the GP at a new input point.

Arguments:

- `model::StandardGP`: The GP model.
- `x::X`: A new input point where the prediction is to be made.

Returns:

- `var`: The posterior variance prediction at the input point.

```julia
posterior_var(model::StandardGP, x::AbstractVector)
```

Compute the posterior variance of the GP at a set of new input points.

Arguments:

- `model::StandardGP`: The GP model.
- `x::AbstractVector`: A vector of new input points where predictions are to be made.

Returns:

- `var`: The posterior variance predictions at the input points.

```julia
posterior_var(model::GradientGP, x)
```

Compute the function-value variance predictions of the GP model at new input points.

Arguments:

- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:

- `var::Vector`: The variance predictions (function value only).
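Putting the two prediction functions together, here is a sketch of a credible band on a grid, assuming a `StandardGP` that has already been conditioned on data (so `model.gpx !== nothing`) and 1-D inputs stored as length-1 vectors:

```julia
grid = [[x] for x in range(-2.0, 2.0; length=100)]
μ = posterior_mean(model, grid)
σ = sqrt.(posterior_var(model, grid))
band_lo = μ .- 2 .* σ      # approximate 95% credible band
band_hi = μ .+ 2 .* σ
```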
### AbstractBayesOpt.nlml — Function

```julia
nlml(model::StandardGP, params, xs::AbstractVector, ys::AbstractVector)
```

Compute the negative log marginal likelihood (NLML) of the GP model given hyperparameters.

Arguments:

- `model::StandardGP`: The GP model.
- `params`: A vector containing the log lengthscale and log scale parameters.
- `xs::AbstractVector`: The input data points.
- `ys::AbstractVector`: The observed function values.

Returns:

- `nlml::Float64`: The negative log marginal likelihood of the model.

```julia
nlml(model::GradientGP, params, xs::AbstractVector, ys::AbstractVector)
```

Compute the negative log marginal likelihood (NLML) of the GP model given hyperparameters.

Arguments:

- `model::GradientGP`: The GP model.
- `params`: Parameters containing the log lengthscale and log scale.
- `xs::AbstractVector`: The input data points.
- `ys::AbstractVector`: The observed function values and gradients.

Returns:

- `nlml::Float64`: The negative log marginal likelihood of the model.
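Since `nlml` exposes the objective directly, hyperparameters can be fitted with any optimizer. A sketch with Optim.jl's Nelder-Mead, assuming the two-parameter (log lengthscale, log scale) layout described above and training data `xs`, `ys`:

```julia
using Optim

obj(p) = nlml(model, p, xs, ys)              # p = [log_lengthscale, log_scale]
res = Optim.optimize(obj, zeros(2), NelderMead())
log_ℓ, log_s = Optim.minimizer(res)
```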
## Kernels
### AbstractBayesOpt.ApproxMatern52Kernel — Type

```julia
ApproxMatern52Kernel{M}(metric::M) <: KernelFunctions.SimpleKernel
```

Approximate Matérn 5/2 kernel using a second-order Taylor expansion around d = 0.

Attributes:

- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
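Being a `KernelFunctions.SimpleKernel`, it evaluates like any other kernel; the explicit `SqEuclidean` metric below assumes the documented default:

```julia
using AbstractBayesOpt, Distances

k = ApproxMatern52Kernel(Distances.SqEuclidean())
k([0.0, 0.0], [0.3, -0.1])                   # kernel value between two points
```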
## GradientGP-related functions
### AbstractBayesOpt.gradConstMean — Type

```julia
gradConstMean{V}(c::V)
```

Custom mean function for the `GradientGP` model. Returns a constant per-output mean across MO inputs (function value + gradients). The first element corresponds to the function value, the following ones to the gradient outputs.

Use `gradConstMean([μ; zeros(d)])` to set a constant prior mean `μ` for the function value and zero for the gradients.

Attributes:

- `c::V`: A vector of constants for each output (function value + gradients).
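For example, following the docstring's suggestion, for a 2-D problem with prior mean 1.5 on the function value:

```julia
d = 2
μ = 1.5
m = gradConstMean([μ; zeros(d)])   # mean 1.5 for the value, 0 for both gradient outputs
```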
### AbstractBayesOpt.gradKernel — Type

```julia
gradKernel{K}(base_kernel::K) <: MOKernel
```

Custom kernel for the `GradientGP` model that handles both function values and gradients.

Arguments:

- `base_kernel::KernelFunctions.Kernel`: The base kernel function to be used.

Returns:

- `gradKernel`: An instance of the custom gradient kernel.
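As an `MOKernel` it is evaluated on (input, output-index) pairs in KernelFunctions.jl's multi-output convention; a sketch, assuming output index 1 denotes the function value:

```julia
using AbstractBayesOpt, KernelFunctions

k = gradKernel(SqExponentialKernel())
k(([0.0], 1), ([0.5], 1))   # covariance between two function values
k(([0.0], 1), ([0.5], 2))   # covariance between a value and a gradient component
```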
### AbstractBayesOpt.posterior_grad_mean — Function

```julia
posterior_grad_mean(model::GradientGP, x)
```

Compute the mean predictions of the GP model at new input points, including gradients.

Arguments:

- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:

- `mean::Vector`: The mean predictions.

### AbstractBayesOpt.posterior_grad_var — Function

```julia
posterior_grad_var(model::GradientGP, x)
```

Compute the variance predictions of the GP model at new input points, including gradients.

Arguments:

- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:

- `var::Vector`: The variance predictions.

### AbstractBayesOpt.posterior_grad_cov — Function

```julia
posterior_grad_cov(model::GradientGP, x)
```

Compute the covariance matrix of the GP model at new input points, including gradients.

Arguments:

- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:

- `cov::Matrix`: The covariance matrix of the predictions.
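A usage sketch, assuming a conditioned `GradientGP` in 1-D so each point contributes p = 2 outputs (value and derivative):

```julia
pts = [[0.0], [0.5]]
m_full = posterior_grad_mean(model, pts)   # values and gradients stacked per point
v_full = posterior_grad_var(model, pts)
C = posterior_grad_cov(model, pts)         # full covariance across all outputs
```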
## Acquisition Functions
### AbstractBayesOpt.EnsembleAcquisition — Type

```julia
EnsembleAcquisition(weights::Vector{Float64}, acqs::Vector{AbstractAcquisition}) <: AbstractAcquisition
```

An ensemble acquisition function that combines multiple acquisition functions, each weighted by a specified factor.

Attributes:

- `weights::Vector{Float64}`: A vector of non-negative weights for each acquisition function. The weights are normalized to sum to 1.
- `acquisitions::Vector{AbstractAcquisition}`: A vector of acquisition functions to be combined.

Remark: all weights must be non-negative.
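For instance, a 70/30 blend of Expected Improvement and UCB (both documented below); the weights and parameters are illustrative:

```julia
ei = ExpectedImprovement(0.01, minimum(ys))   # ys: current observations
ucb = UpperConfidenceBound(2.0)
acq = EnsembleAcquisition([0.7, 0.3], [ei, ucb])
```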
### AbstractBayesOpt.ExpectedImprovement — Type

```julia
ExpectedImprovement{Y}(ξ::Y, best_y::Y) <: AbstractAcquisition
```

Expected Improvement acquisition function.

Attributes:

- `ξ::Y`: Exploration parameter.
- `best_y::Y`: Best observed objective value.

References: Jones et al., 1998.
### AbstractBayesOpt.GradientNormUCB — Type

```julia
GradientNormUCB{Y}(β::Y) <: AbstractAcquisition
```

Acquisition function implementing the squared 2-norm of the gradient with an Upper Confidence Bound (UCB) exploration strategy.

Attributes:

- `β::Y`: Exploration-exploitation balance parameter.

References: Derived by Van Dieren, E. (earlier references are welcome if they exist); adapted to the squared 2-norm of the gradient from the criterion originally proposed by Makrygiorgos et al., 2023.
### AbstractBayesOpt.ProbabilityImprovement — Type

```julia
ProbabilityImprovement{Y}(ξ::Y, best_y::Y) <: AbstractAcquisition
```

Probability of Improvement acquisition function.

Attributes:

- `ξ::Y`: Exploration parameter.
- `best_y::Y`: Best observed objective value.

References: Kushner, 1964.
### AbstractBayesOpt.UpperConfidenceBound — Type

```julia
UpperConfidenceBound{Y}(β::Y) <: AbstractAcquisition
```

Upper Confidence Bound (UCB) acquisition function.

Attributes:

- `β::Y`: Exploration-exploitation balance parameter.

References: Srinivas et al., 2012.
## Domains

### Continuous domain
#### AbstractBayesOpt.ContinuousDomain — Type

```julia
ContinuousDomain(lower::Vector{Float64}, upper::Vector{Float64}, bounds::Vector{Tuple{Float64,Float64}}) <: AbstractDomain
```

A concrete implementation of `AbstractDomain` for continuous domains.

Attributes:

- `lower::Vector{Float64}`: The lower bounds of the domain.
- `upper::Vector{Float64}`: The upper bounds of the domain.
- `bounds::Vector{Tuple{Float64,Float64}}`: A vector of tuples representing the (lower, upper) bounds for each dimension.

Constructor:

- `ContinuousDomain(lower::Vector{Float64}, upper::Vector{Float64})`: Creates a `ContinuousDomain` instance given lower and upper bounds. Performs sanity checks to ensure the bounds are valid.
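For example, a 2-D box built with the two-argument constructor:

```julia
domain = ContinuousDomain([-2.0, -1.0], [2.0, 1.0])
domain.bounds   # [(-2.0, 2.0), (-1.0, 1.0)]
```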