public class BFGS
extends java.lang.Object
The BFGS method belongs to quasi-Newton methods, a class of hill-climbing optimization techniques that seek a stationary point of a (preferably twice continuously differentiable) function. For such problems, a necessary condition for optimality is that the gradient be zero. Newton's method and the BFGS methods are not guaranteed to converge unless the function has a quadratic Taylor expansion near an optimum. However, BFGS has proven to have good performance even for non-smooth optimizations.
In quasi-Newton methods, the Hessian matrix of second derivatives is not computed. Instead, the Hessian matrix is approximated using updates specified by gradient evaluations (or approximate gradient evaluations). Quasi-Newton methods are generalizations of the secant method to find the root of the first derivative for multidimensional problems. In multi-dimensional problems, the secant equation does not specify a unique solution, and quasi-Newton methods differ in how they constrain the solution. The BFGS method is one of the most popular members of this class.
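The BFGS choice of constraint can be made concrete: its rank-two update to the inverse-Hessian approximation is built so that the updated matrix maps the gradient change y back onto the step s, i.e. it satisfies the secant equation by construction. A minimal, self-contained Java sketch (the class and helper names, and the small test vectors, are illustrative and not part of this class):

```java
public class BfgsUpdateDemo {
    // One BFGS update of the inverse-Hessian approximation H, given
    // step s = x_{k+1} - x_k and gradient change y = g_{k+1} - g_k:
    //   H+ = (I - rho s y^T) H (I - rho y s^T) + rho s s^T,  rho = 1 / (y^T s)
    static double[][] bfgsUpdate(double[][] H, double[] s, double[] y) {
        int n = s.length;
        double rho = 1.0 / dot(y, s);
        double[][] A = new double[n][n];            // A = I - rho * s * y^T
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                A[i][j] = (i == j ? 1.0 : 0.0) - rho * s[i] * y[j];
        double[][] Hp = matmul(matmul(A, H), transpose(A));
        for (int i = 0; i < n; i++)                 // + rho * s * s^T
            for (int j = 0; j < n; j++)
                Hp[i][j] += rho * s[i] * s[j];
        return Hp;
    }

    static double dot(double[] a, double[] b) {
        double v = 0;
        for (int i = 0; i < a.length; i++) v += a[i] * b[i];
        return v;
    }

    static double[][] matmul(double[][] A, double[][] B) {
        int n = A.length;
        double[][] C = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                for (int j = 0; j < n; j++)
                    C[i][j] += A[i][k] * B[k][j];
        return C;
    }

    static double[][] transpose(double[][] A) {
        int n = A.length;
        double[][] T = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                T[i][j] = A[j][i];
        return T;
    }

    static double[] matvec(double[][] A, double[] x) {
        double[] r = new double[A.length];
        for (int i = 0; i < A.length; i++) r[i] = dot(A[i], x);
        return r;
    }

    public static void main(String[] args) {
        double[][] H = {{1, 0}, {0, 1}};            // start from the identity
        double[] s = {0.5, -0.2};
        double[] y = {1.0, 0.3};
        double[] Hy = matvec(bfgsUpdate(H, s, y), y);
        // The secant equation H+ y = s holds by construction:
        System.out.printf("%.6f %.6f%n", Hy[0], Hy[1]); // prints 0.500000 -0.200000
    }
}
```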
Like the original BFGS, the limited-memory BFGS (L-BFGS) uses an
estimation to the inverse Hessian matrix to steer its search
through variable space, but where BFGS stores a dense n × n
approximation to the inverse Hessian (n being the number of
variables in the problem), L-BFGS stores only a few vectors
that represent the approximation implicitly. Due to its resulting
linear memory requirement, the L-BFGS method is particularly well
suited for optimization problems with a large number of variables
(e.g., > 1000). Instead of the inverse Hessian H_k, L-BFGS
maintains a history of the past m updates of the position
x and gradient ∇f(x), where generally the
history size m can be small (often m < 10).
These updates are used to implicitly do operations requiring the
H_k-vector product.
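The implicit H_k-vector product is conventionally computed with the two-loop recursion over the stored correction pairs (s_i, y_i). A self-contained sketch, assuming the common initial scaling H_k^0 = gamma * I with gamma = s^T y / y^T y taken from the newest pair (the class and helper names are illustrative, not part of this class):

```java
public class TwoLoopDemo {
    // L-BFGS two-loop recursion: computes r = H_k * g implicitly from the
    // last m correction pairs (S[i], Y[i]) without forming the n-by-n H_k.
    // Pairs are ordered oldest (index 0) to newest (index m - 1).
    static double[] twoLoop(double[] g, double[][] S, double[][] Y) {
        int m = S.length, n = g.length;
        double[] q = g.clone();
        double[] alpha = new double[m];
        double[] rho = new double[m];
        for (int i = 0; i < m; i++) rho[i] = 1.0 / dot(Y[i], S[i]);
        for (int i = m - 1; i >= 0; i--) {        // first loop: newest to oldest
            alpha[i] = rho[i] * dot(S[i], q);
            for (int j = 0; j < n; j++) q[j] -= alpha[i] * Y[i][j];
        }
        // initial approximation H_k^0 = gamma * I, gamma from the newest pair
        double gamma = dot(S[m - 1], Y[m - 1]) / dot(Y[m - 1], Y[m - 1]);
        double[] r = new double[n];
        for (int j = 0; j < n; j++) r[j] = gamma * q[j];
        for (int i = 0; i < m; i++) {             // second loop: oldest to newest
            double beta = rho[i] * dot(Y[i], r);
            for (int j = 0; j < n; j++) r[j] += (alpha[i] - beta) * S[i][j];
        }
        return r;                                 // the search direction is -r
    }

    static double dot(double[] a, double[] b) {
        double v = 0;
        for (int i = 0; i < a.length; i++) v += a[i] * b[i];
        return v;
    }

    public static void main(String[] args) {
        // With a single stored pair where s == y, H_k reduces to the identity,
        // so the recursion must return g unchanged.
        double[] r = twoLoop(new double[]{3, 4},
                             new double[][]{{1, 0}}, new double[][]{{1, 0}});
        System.out.println(r[0] + " " + r[1]); // prints 3.0 4.0
    }
}
```

Storing only the m pairs plus a few work vectors is what gives L-BFGS its linear memory footprint, compared with the O(n^2) dense matrix kept by full BFGS.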
| Constructor and Description |
|---|
| BFGS() |
| Modifier and Type | Method and Description |
|---|---|
| static double | minimize(DifferentiableMultivariateFunction func, double[] x, double gtol, int maxIter). This method solves the unconstrained minimization problem using the BFGS method. |
| static double | minimize(DifferentiableMultivariateFunction func, int m, double[] x, double[] l, double[] u, double gtol, int maxIter). This method solves the bound constrained minimization problem using the L-BFGS-B method. |
| static double | minimize(DifferentiableMultivariateFunction func, int m, double[] x, double gtol, int maxIter). This method solves the unconstrained minimization problem using the limited-memory BFGS method. |
public static double minimize(DifferentiableMultivariateFunction func, double[] x, double gtol, int maxIter)

This method solves the unconstrained minimization problem

    min f(x), x = (x1, x2, ..., x_n),

using the BFGS method.

Parameters:
func - the function to be minimized.
x - on initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
gtol - the convergence tolerance on zeroing the gradient.
maxIter - the maximum number of iterations.

public static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double gtol, int maxIter)
This method solves the unconstrained minimization problem

    min f(x), x = (x1, x2, ..., x_n),

using the limited-memory BFGS method. The method is especially effective on problems involving a large number of variables. In a typical iteration of this method, an approximation H_k to the inverse of the Hessian is obtained by applying m BFGS updates to a diagonal matrix H_k0, using information from the previous m steps. The user specifies the number m, which determines the amount of storage required by the routine.

Parameters:
func - the function to be minimized.
m - the number of corrections used in the L-BFGS update. Values of m less than 3 are not recommended; large values of m will result in excessive computing time. 3 <= m <= 7 is recommended. A common choice is m = 5.
x - on initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit with iflag = 0, it contains the values of the variables at the best point found (usually a solution).
gtol - the convergence tolerance on zeroing the gradient.
maxIter - the maximum number of iterations.

public static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double[] l, double[] u, double gtol, int maxIter)

This method solves the bound constrained minimization problem using the L-BFGS-B method.
Parameters:
func - the function to be minimized.
m - the number of corrections used in the L-BFGS update. Values of m less than 3 are not recommended; large values of m will result in excessive computing time. 3 <= m <= 7 is recommended. A common choice is m = 5.
x - on initial entry this must be set by the user to the values of the initial estimate of the solution vector. On exit with iflag = 0, it contains the values of the variables at the best point found (usually a solution).
l - the lower bound.
u - the upper bound.
gtol - the convergence tolerance on zeroing the gradient.
maxIter - the maximum number of iterations.