Reference (Public API)
This section documents the exported functions, types, and other public symbols of AbstractBayesOpt.jl.
Bayesian Optimisation loop
AbstractBayesOpt.BOStruct — Type
```julia
BOStruct{F,M<:AbstractSurrogate,A<:AbstractAcquisition,D<:AbstractDomain,X,Y,T}(
    func::F,
    acq::A,
    model::M,
    domain::D,
    xs::Vector{X},
    ys::Vector{Y},
    ys_non_std::Vector{Y},
    max_iter::Int,
    iter::Int,
    noise::T,
    flag::Bool,
)
```

A structure holding all components of the Bayesian Optimization problem.

Attributes:
- `func::F`: The target function to be optimized.
- `acq::A`: The acquisition function guiding the optimization.
- `model::M`: The surrogate model (e.g., a Gaussian process).
- `domain::D`: The domain over which to optimize.
- `xs::Vector{X}`: A vector of input training points.
- `ys::Vector{Y}`: A vector of corresponding output training values.
- `ys_non_std::Vector{Y}`: A vector of output training values before standardization.
- `max_iter::Int`: Maximum number of iterations for the optimization.
- `iter::Int`: Current iteration number.
- `noise::T`: Noise level in the observations.
- `flag::Bool`: A flag indicating that the optimization should stop due to issues such as ill-conditioning.
AbstractBayesOpt.optimize — Function
```julia
optimize(
    BO::BOStruct;
    standardize::Union{String,Nothing}="mean_scale",
    hyper_params::Union{String,Nothing}="all",
    num_restarts_HP::Int=1,
    ad_backend::Symbol=:forward,
)
```

This function implements the EGO framework: while some stopping criterion is not met, (1) optimize the acquisition function to obtain the new best candidate, (2) query the target function `f`, and (3) update the GP and the overall optimization state. Returns the best solution found.

Arguments:
- `BO::BOStruct`: The Bayesian Optimization structure.
- `standardize`: Specifies how to standardize the outputs.
  - If `"mean_scale"`, standardize by removing the mean and scaling by the standard deviation.
  - If `"scale_only"`, only scale the outputs without centering (for the case where a non-zero mean function with the empirical mean is set).
  - If `"mean_only"`, only remove the mean without scaling.
  - If `nothing`, do not standardize the outputs.
- `hyper_params`: Specifies how to handle hyperparameters.
  - If `"all"`, re-optimize all hyperparameters every 10 iterations.
  - If `"lengthscaleonly"`, only optimize the lengthscale.
  - If `nothing`, do not re-optimize hyperparameters.
- `num_restarts_HP::Int`: Number of random restarts for hyperparameter optimization.
- `ad_backend::Symbol`: The automatic differentiation backend used for hyperparameter optimization.

Returns:
- `BO::BOStruct`: The updated Bayesian Optimization problem after optimization.
- `acqf_list::Vector`: Acquisition function values at each iteration.
- `standard_params::Tuple`: The mean and standard deviation used for standardization.
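As an illustrative sketch (not verbatim package code), a typical loop can be assembled from the documented signatures. The `GP` prior comes from AbstractGPs.jl; the positional `BOStruct` call and the initial design below are assumptions based on the constructor documented above:

```julia
using AbstractBayesOpt, AbstractGPs

# Hypothetical target: minimize a 1D quadratic
f(x) = (x[1] - 0.3)^2

domain = ContinuousDomain([-1.0], [1.0])
prior  = GP(0.0, Matern52Kernel())       # zero-mean prior from AbstractGPs
model  = StandardGP(prior, 1e-6, nothing) # noise variance 1e-6, not conditioned yet

xs = [[-0.5], [0.0], [0.5]]              # small initial design
ys = f.(xs)

acq = ExpectedImprovement(0.01, minimum(ys))

# max_iter = 30, iter = 0, noise = 1e-6, flag = false
BO = BOStruct(f, acq, model, domain, xs, ys, copy(ys), 30, 0, 1e-6, false)
BO, acqf_list, standard_params = optimize(BO; standardize="mean_scale")
```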
Abstract Interface
AbstractBayesOpt.AbstractAcquisition — Type
```julia
AbstractAcquisition
```

Abstract type for acquisition functions used in Bayesian optimization.

Concrete implementations should subtype this and implement the following methods:
- `(acq::AbstractAcquisition)(surrogate::AbstractSurrogate, x::AbstractVector)`: Evaluate the acquisition function at point `x` using the surrogate model. This should also work for a single real input `x::Real` when working in 1D, in which case it is treated as a one-dimensional input vector.
- `update(acq::AbstractAcquisition, ys::AbstractVector, model::AbstractSurrogate)`: Update the acquisition function with new observations `ys` and the current surrogate model.
- `Base.copy(acq::AbstractAcquisition)`: Create a copy of the acquisition function.
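A minimal custom acquisition might look like the following sketch. The type `MyLCB` is hypothetical, and the surrogate calls assume the `posterior_mean`/`posterior_var` interface documented under `AbstractSurrogate`:

```julia
using AbstractBayesOpt

# Hypothetical lower-confidence-bound-style acquisition (for minimization)
struct MyLCB{T} <: AbstractAcquisition
    κ::T   # exploration weight
end

function (acq::MyLCB)(surrogate::AbstractSurrogate, x::AbstractVector)
    μ  = posterior_mean(surrogate, x)
    σ² = posterior_var(surrogate, x)
    # Negated LCB, so that maximizing the acquisition minimizes the bound
    return -(μ - acq.κ * sqrt(max(σ², 0.0)))
end

# No internal state depends on the observations, so updating is a no-op
AbstractBayesOpt.update(acq::MyLCB, ys::AbstractVector, model::AbstractSurrogate) = acq
Base.copy(acq::MyLCB) = MyLCB(acq.κ)
```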
AbstractBayesOpt.AbstractDomain — Type
```julia
AbstractDomain
```

An abstract type for defining the domain over which the optimization is performed.

Concrete implementations should subtype this, define the necessary properties:
- `lower`: The lower bounds of the domain.
- `upper`: The upper bounds of the domain.

and provide a constructor. Other methods can be added as needed depending on the use case.
AbstractBayesOpt.AbstractSurrogate — Type
```julia
AbstractSurrogate
```

Abstract type for surrogate models used in Bayesian optimization.

Concrete implementations should subtype this and implement the following methods:
- `update(model::AbstractSurrogate, xs::AbstractVector, ys::AbstractVector)`: Update the surrogate model with new data points `xs` and corresponding observations `ys`.
- `posterior_mean(surrogate::AbstractSurrogate, x::AbstractVector)`: Compute the posterior mean of the surrogate model at point `x`.
- `posterior_var(surrogate::AbstractSurrogate, x::AbstractVector)`: Compute the posterior variance of the surrogate model at point `x`.
- `nlml(surrogate::AbstractSurrogate, params::AbstractVector, xs::AbstractVector, ys::AbstractVector)`: Compute the negative log marginal likelihood of the surrogate model given hyperparameters `params`, input data `xs`, and observations `ys`.

If you wish to standardize the outputs, you can also implement:
- `std_y(model::AbstractSurrogate)`: Get the standard deviation used for standardizing the outputs in the surrogate model.
- `get_mean_std(model::AbstractSurrogate)`: Get the mean and standard deviation used for standardizing the outputs in the surrogate model.

Other methods can be added as needed depending on the use case; see the implementations of `StandardGP` and `GradientGP` for examples.
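A skeleton for a new surrogate could follow this sketch. All names other than the interface methods (`MySurrogate`, `refit`, `predict_mean`, `predict_var`) are hypothetical placeholders:

```julia
using AbstractBayesOpt

# Hypothetical surrogate wrapping some external regression model
struct MySurrogate{M} <: AbstractSurrogate
    inner::M   # the underlying fitted model
end

function AbstractBayesOpt.update(model::MySurrogate, xs::AbstractVector, ys::AbstractVector)
    # Refit the inner model on (xs, ys); `refit` is a placeholder
    return MySurrogate(refit(model.inner, xs, ys))
end

# `predict_mean` / `predict_var` are placeholders for the wrapped model's predictors
AbstractBayesOpt.posterior_mean(model::MySurrogate, x::AbstractVector) = predict_mean(model.inner, x)
AbstractBayesOpt.posterior_var(model::MySurrogate, x::AbstractVector)  = predict_var(model.inner, x)

function AbstractBayesOpt.nlml(model::MySurrogate, params::AbstractVector,
                               xs::AbstractVector, ys::AbstractVector)
    # Return the negative log marginal likelihood for the given hyperparameters
    return 0.0   # placeholder body
end
```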
Surrogates
AbstractBayesOpt.StandardGP — Type
```julia
StandardGP{T}(gp::AbstractGPs.GP, noise_var::T, gpx::Union{Nothing,AbstractGPs.PosteriorGP}) <: AbstractSurrogate
```

Implementation of the abstract structures for the standard GP.

Attributes:
- `gp::AbstractGPs.GP`: The underlying Gaussian process model.
- `noise_var::T`: The noise variance of the observations.
- `gpx::Union{Nothing,AbstractGPs.PosteriorGP}`: The posterior GP after conditioning on data; `nothing` if not conditioned yet.
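A typical construction, sketched with an AbstractGPs.jl prior (kernel choice and noise level are illustrative):

```julia
using AbstractBayesOpt, AbstractGPs

# Zero-mean GP prior with a Matern 5/2 kernel (re-exported by AbstractGPs)
prior = GP(Matern52Kernel())

# Wrap it as a surrogate: noise variance 1e-6, no data conditioned yet (gpx = nothing)
model = StandardGP(prior, 1e-6, nothing)
```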
AbstractBayesOpt.GradientGP — Type
```julia
struct GradientGP{T, G<:AbstractGPs.GP} <: AbstractSurrogate
```

Gradient-enhanced Gaussian process surrogate model.

Attributes:
- `gp::G`: The underlying Gaussian process model.
- `noise_var::T`: The noise variance of the observations.
- `p::Int`: The number of outputs (1 for the function value + d for the gradients).
- `gpx::Union{Nothing, AbstractGPs.PosteriorGP}`: The posterior GP after conditioning on data; `nothing` if not conditioned yet.

This relies on multi-output GPs (MOGP) from AbstractGPs.jl and KernelFunctions.jl.
AbstractBayesOpt.posterior_mean — Function
```julia
posterior_mean(model::StandardGP, x::X) where {X}
```

Compute the posterior mean of the GP at a new input point.

Arguments:
- `model::StandardGP`: The GP model.
- `x::X`: A new input point where the prediction is to be made.

Returns:
- `mean`: The posterior mean prediction at the input point.

```julia
posterior_mean(model::StandardGP, x::AbstractVector)
```

Compute the posterior mean of the GP at a set of new input points.

Arguments:
- `model::StandardGP`: The GP model.
- `x::AbstractVector`: A vector of new input points where predictions are to be made.

Returns:
- `mean`: The posterior mean predictions at the input points.

```julia
posterior_mean(model::GradientGP, x)
```

Compute the function-value mean predictions of the GP model at new input points.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `mean::Vector`: The mean predictions (function value only).
AbstractBayesOpt.posterior_var — Function
```julia
posterior_var(model::StandardGP, x::X) where {X}
```

Compute the posterior variance of the GP at a new input point.

Arguments:
- `model::StandardGP`: The GP model.
- `x::X`: A new input point where the prediction is to be made.

Returns:
- `var`: The posterior variance prediction at the input point.

```julia
posterior_var(model::StandardGP, x::AbstractVector)
```

Compute the posterior variance of the GP at a set of new input points.

Arguments:
- `model::StandardGP`: The GP model.
- `x::AbstractVector`: A vector of new input points where predictions are to be made.

Returns:
- `var`: The posterior variance predictions at the input points.

```julia
posterior_var(model::GradientGP, x)
```

Compute the function-value variance predictions of the GP model at new input points.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `var::Vector`: The variance predictions (function value only).
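After conditioning a surrogate on data via `update`, the two predictors are queried in the same way. A hypothetical sketch with made-up data:

```julia
using AbstractBayesOpt, AbstractGPs

model = StandardGP(GP(Matern52Kernel()), 1e-6, nothing)

xs = [[-0.5], [0.0], [0.5]]          # illustrative inputs
ys = [0.64, 0.09, 0.04]              # illustrative observations
model = update(model, xs, ys)        # condition the GP on the data

x_new = [0.25]
μ  = posterior_mean(model, x_new)    # posterior mean at x_new
σ² = posterior_var(model, x_new)     # posterior variance at x_new
```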
AbstractBayesOpt.nlml — Function
```julia
nlml(model::StandardGP, params, xs::AbstractVector, ys::AbstractVector)
```

Compute the negative log marginal likelihood (NLML) of the GP model given hyperparameters.

Arguments:
- `model::StandardGP`: The GP model.
- `params`: A vector containing the log lengthscale and log scale parameters.
- `xs::AbstractVector`: The input data points.
- `ys::AbstractVector`: The observed function values.

Returns:
- `nlml::Float64`: The negative log marginal likelihood of the model.

```julia
nlml(model::GradientGP, params, xs::AbstractVector, ys::AbstractVector)
```

Compute the negative log marginal likelihood (NLML) of the GP model given hyperparameters.

Arguments:
- `model::GradientGP`: The GP model.
- `params`: Parameters containing the log lengthscale and log scale.
- `xs::AbstractVector`: The input data points.
- `ys::AbstractVector`: The observed function values and gradients.

Returns:
- `nlml::Float64`: The negative log marginal likelihood of the model.
Kernels
AbstractBayesOpt.ApproxMatern52Kernel — Type
```julia
ApproxMatern52Kernel{M}(metric::M) <: KernelFunctions.SimpleKernel
```

Approximate Matern 5/2 kernel using a second-order Taylor expansion around d = 0.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
AbstractBayesOpt.ApproxMatern72Kernel — Type
```julia
ApproxMatern72Kernel{M}(metric::M) <: KernelFunctions.SimpleKernel
```

Approximate Matern 7/2 kernel using a second-order Taylor expansion around d = 0.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
AbstractBayesOpt.ADMatern52Kernel — Type
```julia
ADMatern52Kernel{M} <: KernelFunctions.SimpleKernel
```

Matern 5/2 kernel with custom differentiation rules for gradient computations.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
AbstractBayesOpt.ADMatern72Kernel — Type
```julia
ADMatern72Kernel{M} <: KernelFunctions.SimpleKernel
```

Matern 7/2 kernel with custom differentiation rules for gradient computations.

Attributes:
- `metric`: The distance metric to be used; defaults to the squared Euclidean distance.
GradientGP-related functions
AbstractBayesOpt.gradConstMean — Type
```julia
gradConstMean{V}(c::V)
```

Custom mean function for the GradientGP model. Returns a constant per-output mean across multi-output inputs (function value + gradients). The first element corresponds to the function value, the following ones to the gradient outputs.

Use `gradConstMean([μ; zeros(d)])` to set a constant prior mean `μ` for the function value and zero for the gradients.

Attributes:
- `c::V`: A vector of constants for each output (function value + gradients).
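For example, following the `[μ; zeros(d)]` pattern above (the values are illustrative):

```julia
using AbstractBayesOpt

d = 2        # input dimension, so outputs are 1 function value + 2 gradients
μ = 1.5      # constant prior mean for the function value

# Constant mean μ for the function value, zero mean for each gradient output
mean_fn = gradConstMean([μ; zeros(d)])
```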
AbstractBayesOpt.gradKernel — Type
```julia
gradKernel{K}(base_kernel::K) <: MOKernel
```

Custom kernel function for the GradientGP model that handles both function values and gradients.

Arguments:
- `base_kernel::KernelFunctions.Kernel`: The base kernel function to be used.

Returns:
- `gradKernel`: An instance of the custom gradient kernel function.
AbstractBayesOpt.posterior_grad_mean — Function
```julia
posterior_grad_mean(model::GradientGP, x)
```

Compute the mean predictions of the GP model at new input points, including gradients.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `mean::Vector`: The mean predictions.
AbstractBayesOpt.posterior_grad_var — Function
```julia
posterior_grad_var(model::GradientGP, x)
```

Compute the variance predictions of the GP model at new input points, including gradients.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `var::Vector`: The variance predictions.
AbstractBayesOpt.posterior_grad_cov — Function
```julia
posterior_grad_cov(model::GradientGP, x)
```

Compute the covariance matrix of the GP model at new input points, including gradients.

Arguments:
- `model::GradientGP`: The GP model.
- `x`: A vector of new input points where predictions are to be made.

Returns:
- `cov::Matrix`: The covariance matrix of the predictions.
Acquisition Functions
AbstractBayesOpt.EnsembleAcquisition — Type
```julia
EnsembleAcquisition(weights::Vector{Float64}, acqs::Vector{AbstractAcquisition}) <: AbstractAcquisition
```

An ensemble acquisition function that combines multiple acquisition functions, each weighted by a specified factor.

Attributes:
- `weights::Vector{Float64}`: A vector of non-negative weights for each acquisition function. The weights are normalized to sum to 1.
- `acquisitions::Vector{AbstractAcquisition}`: A vector of acquisition functions to be combined.

Remark: All weights must be non-negative.
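For instance, a 70/30 blend of Expected Improvement and UCB could be sketched as follows (the parameter values are illustrative):

```julia
using AbstractBayesOpt

ei  = ExpectedImprovement(0.01, 0.42)   # ξ = 0.01, best observed value 0.42 (made up)
ucb = UpperConfidenceBound(2.0)         # β = 2.0

# Weights are normalized to sum to 1 internally
acq = EnsembleAcquisition([0.7, 0.3], AbstractAcquisition[ei, ucb])
```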
AbstractBayesOpt.ExpectedImprovement — Type
```julia
ExpectedImprovement{Y}(ξ::Y, best_y::Y) <: AbstractAcquisition
```

Expected Improvement acquisition function.

Attributes:
- `ξ::Y`: Exploration parameter.
- `best_y::Y`: Best observed objective value.

References: Jones et al., 1998.
AbstractBayesOpt.GradientNormUCB — Type
```julia
GradientNormUCB{Y}(β::Y) <: AbstractAcquisition
```

Acquisition function implementing the squared 2-norm of the gradient with an Upper Confidence Bound (UCB) exploration strategy.

Attributes:
- `β::Y`: Exploration-exploitation balance parameter.

References: Derived by Van Dieren, E. (open to earlier references if any exist). Originally proposed by Makrygiorgos et al., 2023, adapted here to the squared 2-norm of the gradient.
AbstractBayesOpt.ProbabilityImprovement — Type
```julia
ProbabilityImprovement{Y}(ξ::Y, best_y::Y) <: AbstractAcquisition
```

Probability of Improvement acquisition function.

Attributes:
- `ξ::Y`: Exploration parameter.
- `best_y::Y`: Best observed objective value.

References: Kushner, 1964.
AbstractBayesOpt.UpperConfidenceBound — Type
```julia
UpperConfidenceBound{Y}(β::Y) <: AbstractAcquisition
```

Upper Confidence Bound (UCB) acquisition function.

Attributes:
- `β::Y`: Exploration-exploitation balance parameter.

References: Srinivas et al., 2012.
Domains
Continuous domain
AbstractBayesOpt.ContinuousDomain — Type
```julia
ContinuousDomain(lower::Vector{Float64}, upper::Vector{Float64}, bounds::Vector{Tuple{Float64,Float64}}) <: AbstractDomain
```

A concrete implementation of AbstractDomain for continuous domains.

Attributes:
- `lower::Vector{Float64}`: The lower bounds of the domain.
- `upper::Vector{Float64}`: The upper bounds of the domain.
- `bounds::Vector{Tuple{Float64,Float64}}`: A vector of (lower, upper) tuples, one per dimension.

Constructor:
- `ContinuousDomain(lower::Vector{Float64}, upper::Vector{Float64})`: Creates a `ContinuousDomain` instance given lower and upper bounds. Performs sanity checks to ensure the bounds are valid.
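Using the two-argument constructor, a 2D box domain [-1, 1] × [0, 5] would be built as:

```julia
using AbstractBayesOpt

# `bounds` is derived from `lower` and `upper` by the constructor
domain = ContinuousDomain([-1.0, 0.0], [1.0, 5.0])
```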