GradientDescent
- class GradientDescent(maxiter=100, learning_rate=0.01, tol=1e-07, callback=None, perturbation=None)
Bases: SteppableOptimizer
The gradient descent minimization routine.
For a function \(f\) and an initial point \(\vec\theta_0\), the standard (or “vanilla”) gradient descent method is an iterative scheme to find the minimum \(\vec\theta^*\) of \(f\) by updating the parameters in the direction of the negative gradient of \(f\):
\[\vec\theta_{n+1} = \vec\theta_{n} - \eta_n \vec\nabla f(\vec\theta_{n}),\]
for a small learning rate \(\eta_n > 0\).
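The update rule can be illustrated with a few lines of plain NumPy. This is a minimal sketch of the iteration itself, not the class's implementation; the function name `gradient_descent` and the stopping test on the update norm are assumptions chosen to mirror the `maxiter` and `tol` parameters described further down.

```python
import numpy as np


def gradient_descent(grad, theta0, learning_rate=0.01, maxiter=100, tol=1e-7):
    """Iterate theta <- theta - eta * grad(theta) until the update is tiny."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(maxiter):
        update = learning_rate * grad(theta)
        theta = theta - update
        # Stop once the norm of the parameter update drops below tol.
        if np.linalg.norm(update) < tol:
            break
    return theta


def grad_f(x):
    # Gradient of f(x) = (||x|| - 1)^2, the objective used in the examples.
    norm = np.linalg.norm(x)
    return 2 * (norm - 1) * x / norm


theta_star = gradient_descent(grad_f, [1.0, 0.5, -0.2], learning_rate=0.1, maxiter=200)
print(np.linalg.norm(theta_star))  # close to 1, the minimizing radius
```

A constant learning rate is used here for simplicity; the formula above allows a different \(\eta_n\) at every iteration.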
You can either provide the analytic gradient \(\vec\nabla f\) as jac in the minimize() method or, if you do not provide it, let the optimizer use a finite difference approximation of the gradient. To adapt the size of the perturbation in the finite difference gradients, set the perturbation property in the initializer.
This optimizer supports a callback function. If provided in the initializer, the optimizer calls the callback in each iteration with the following information, in this order: the current number of function evaluations, the current parameters, the current function value, and the norm of the current gradient.
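As a sketch of what such a callback might look like, the function below stores each report in a list. The names `history` and `store_history` are hypothetical; only the argument order is taken from the description above. To stay runnable without the library, the callback is invoked by hand with made-up values rather than by an optimizer.

```python
import numpy as np

history = []  # hypothetical container, not part of the library


def store_history(nfev, params, value, grad_norm):
    """Record what the optimizer reports, in the documented argument order."""
    history.append(
        {
            "nfev": nfev,
            "params": np.array(params, copy=True),
            "value": value,
            "grad_norm": grad_norm,
        }
    )


# With the library installed, this would be passed to the initializer:
#   optimizer = GradientDescent(maxiter=100, callback=store_history)
# Here the callback is called manually, with illustrative values only.
store_history(4, np.array([1.0, 0.5, -0.2]), 0.0184, 0.27)
print(history[0]["nfev"], history[0]["value"])
```

Copying `params` guards against the optimizer reusing the same array between iterations, which may or may not be the case in practice.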
Examples
A minimal example that uses finite difference gradients with the default perturbation of 0.01 and the default learning rate of 0.01.
    import numpy as np
    from qiskit_machine_learning.optimizers import GradientDescent

    def f(x):
        return (np.linalg.norm(x) - 1) ** 2

    initial_point = np.array([1, 0.5, -0.2])

    optimizer = GradientDescent(maxiter=100)

    # No jac is given, so the gradient is approximated by finite differences.
    result = optimizer.minimize(fun=f, x0=initial_point)

    print(
        f"Found minimum {result.x} at a value "
        f"of {result.fun} using {result.nfev} evaluations."
    )
An example where the learning rate is an iterator and we supply the analytic gradient. Note how much faster this converges (i.e. fewer nfev) compared to the previous example.
    import numpy as np
    from qiskit_machine_learning.optimizers import GradientDescent

    def learning_rate():
        power = 0.6
        constant_coeff = 0.1

        def power_law():
            # Start at n = 1 so that the first learning rate is nonzero.
            n = 1
            while True:
                yield constant_coeff * (n ** power)
                n += 1

        return power_law()

    def f(x):
        return (np.linalg.norm(x) - 1) ** 2

    def grad_f(x):
        return 2 * (np.linalg.norm(x) - 1) * x / np.linalg.norm(x)

    initial_point = np.array([1, 0.5, -0.2])

    # learning_rate is passed as a callable that returns a fresh generator.
    optimizer = GradientDescent(maxiter=100, learning_rate=learning_rate)
    result = optimizer.minimize(fun=f, jac=grad_f, x0=initial_point)

    print(
        f"Found minimum {result.x} at a value "
        f"of {result.fun} using {result.nfev} evaluations."
    )
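To see what a power-law learning-rate factory of this kind actually yields, the generator can be inspected on its own. The sketch below is independent of the library; `make_power_law` and its `start` argument are hypothetical names introduced for illustration.

```python
from itertools import islice


def make_power_law(constant_coeff=0.1, power=0.6, start=1):
    """Return a zero-argument factory that produces a fresh generator per call."""

    def power_law():
        n = start
        while True:
            yield constant_coeff * (n ** power)
            n += 1

    return power_law


learning_rate = make_power_law()
first_rates = list(islice(learning_rate(), 5))
print(first_rates)

# Starting the counter at 0 makes the very first learning rate zero.
zero_start = list(islice(make_power_law(start=0)(), 2))
print(zero_start[0])
```

Because the factory returns a new generator on every call, the schedule restarts from its first value each time it is consumed from scratch.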
Another example where the evaluation of the function has a chance of failing. The user, with specific knowledge about their function, can catch these errors and handle them before passing the result to the optimizer.
    import random
    import numpy as np
    from qiskit_machine_learning.optimizers import GradientDescent
    # TellData is assumed to be importable from the steppable optimizer module.
    from qiskit_machine_learning.optimizers.steppable_optimizer import TellData

    def objective(x):
        # Simulate a function evaluation that fails about half of the time.
        if random.choice([True, False]):
            return None
        return (np.linalg.norm(x) - 1) ** 2

    def grad(x):
        # Simulate a gradient evaluation that fails about half of the time.
        if random.choice([True, False]):
            return None
        return 2 * (np.linalg.norm(x) - 1) * x / np.linalg.norm(x)

    initial_point = np.random.normal(0, 1, size=(100,))

    optimizer = GradientDescent(maxiter=20)
    optimizer.start(x0=initial_point, fun=objective, jac=grad)

    while optimizer.continue_condition():
        ask_data = optimizer.ask()
        evaluated_gradient = None

        # Retry until the gradient evaluation succeeds.
        # x_jac (assumed AskData attribute) is the point where the gradient is requested.
        while evaluated_gradient is None:
            evaluated_gradient = grad(ask_data.x_jac)
            optimizer.state.njev += 1

        optimizer.state.nit += 1

        tell_data = TellData(eval_jac=evaluated_gradient)
        optimizer.tell(ask_data=ask_data, tell_data=tell_data)

    result = optimizer.create_result()
Users who aren’t dealing with complicated functions and who are more familiar with step-by-step optimization algorithms can use the step() method, which wraps the ask() and tell() methods. In the same spirit, the method minimize() will optimize the function and return the result.
To see other libraries that use this interface, one can visit: https://optuna.readthedocs.io/en/stable/tutorial/20_recipes/009_ask_and_tell.html
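The ask-and-tell pattern described above can be sketched with a toy class. This is an illustration of the interface idea only, under the assumption of a constant learning rate and an analytic gradient; `ToySteppableDescent` is a hypothetical name and does not reproduce the library's state handling or data classes.

```python
import numpy as np


class ToySteppableDescent:
    """Toy ask/evaluate/tell optimizer; not the library's implementation."""

    def __init__(self, grad, x0, learning_rate=0.1, maxiter=100, tol=1e-7):
        self.grad = grad
        self.x = np.asarray(x0, dtype=float)
        self.learning_rate = learning_rate
        self.maxiter = maxiter
        self.tol = tol
        self.nit = 0
        self.stepsize = None

    def ask(self):
        # Tell the caller where the gradient is needed.
        return self.x

    def evaluate(self, point):
        # Default evaluation; a custom loop can replace this call.
        return self.grad(point)

    def tell(self, point, gradient):
        # Move against the gradient and record the size of the step.
        update = self.learning_rate * np.asarray(gradient)
        self.x = point - update
        self.stepsize = np.linalg.norm(update)
        self.nit += 1

    def step(self):
        point = self.ask()
        self.tell(point, self.evaluate(point))

    def continue_condition(self):
        small_step = self.stepsize is not None and self.stepsize <= self.tol
        return self.nit < self.maxiter and not small_step


def grad_f(x):
    norm = np.linalg.norm(x)
    return 2 * (norm - 1) * x / norm


opt = ToySteppableDescent(grad_f, [1.0, 0.5, -0.2])
while opt.continue_condition():
    opt.step()
print(opt.nit, np.linalg.norm(opt.x))
```

Splitting a step into `ask`, `evaluate` and `tell` is what lets a caller swap in their own evaluation, for example one that retries on failure, without touching the update logic.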
- Parameters:
maxiter (int) – The maximum number of iterations.
learning_rate (float | list[float] | np.ndarray | Callable[[], Generator[float, None, None]]) – A constant, list, array or factory of generators yielding learning rates for the parameter updates. See the docstring for an example.
tol (float) – If the norm of the parameter update is smaller than this threshold, the optimizer has converged.
perturbation (float | None) – If no gradient is passed to minimize(), the gradient is approximated with a forward finite difference scheme using a step of size perturbation (defaults to 1e-2 if required). Ignored when an explicit function for the gradient is provided.
- Raises:
ValueError – If learning_rate is an array and its length is less than maxiter.
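The accepted forms of learning_rate, together with the length check behind this ValueError, can be sketched as follows. The helper `as_rate_iterator` is hypothetical and only mirrors the behaviour described in the parameter list; the library's internal handling may differ.

```python
from itertools import islice, repeat

import numpy as np


def as_rate_iterator(learning_rate, maxiter):
    """Hypothetical helper: normalize the accepted forms into one iterator."""
    if callable(learning_rate):
        return learning_rate()  # factory of generators
    if isinstance(learning_rate, (list, np.ndarray)):
        if len(learning_rate) < maxiter:
            raise ValueError(
                f"Got {len(learning_rate)} learning rates for {maxiter} iterations."
            )
        return iter(learning_rate)
    return repeat(float(learning_rate))  # constant rate, repeated forever


constant = list(islice(as_rate_iterator(0.01, maxiter=3), 3))
from_array = list(islice(as_rate_iterator(np.linspace(0.1, 0.01, 5), maxiter=5), 2))

# A list shorter than maxiter cannot cover every iteration, so it is rejected.
rejected = False
try:
    as_rate_iterator([0.1, 0.05], maxiter=10)
except ValueError as err:
    rejected = True
    print("rejected:", err)
```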
Attributes
- bounds_support_level
Returns the bounds support level.
- gradient_support_level
Returns the gradient support level.
- initial_point_support_level
Returns the initial point support level.
- is_bounds_ignored
Returns whether bounds are ignored.
- is_bounds_required
Returns whether bounds are required.
- is_bounds_supported
Returns whether bounds are supported.
- is_gradient_ignored
Returns whether the gradient is ignored.
- is_gradient_required
Returns whether the gradient is required.
- is_gradient_supported
Returns whether the gradient is supported.
- is_initial_point_ignored
Returns whether the initial point is ignored.
- is_initial_point_required
Returns whether the initial point is required.
- is_initial_point_supported
Returns whether the initial point is supported.
- perturbation
Returns the perturbation.
This is the perturbation used in the finite difference gradient approximation.
- setting
Returns the setting.
- settings
- state
Returns the current state of the optimizer.
- tol
Returns the tolerance of the optimizer.
Any step with a smaller stepsize than this value will stop the optimization.
Methods
- ask()
Returns an object with the data needed to evaluate the gradient.
If this object contains a gradient function, the gradient can be evaluated directly. Otherwise, it is approximated with a finite difference scheme.
- Return type:
- continue_condition()
Condition that indicates whether the optimization process should continue.
When the stepsize is smaller than the tolerance, the optimization process is considered finished.
- Returns:
True if the optimization process should continue, False otherwise.
- Return type:
- create_result()
Creates a result of the optimization process.
This result contains the best point, the best function value, the number of function/gradient evaluations and the number of iterations.
- Returns:
The result of the optimization process.
- Return type:
- evaluate(ask_data)
Evaluates the gradient.
It does so either by evaluating an analytic gradient or by approximating it with a finite difference scheme. It will either add 1 to the number of gradient evaluations or add N+1 to the number of function evaluations (where N is the dimension of the gradient).
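The N+1 count follows from a forward finite difference scheme: one evaluation at the center point plus one per coordinate. The sketch below is a plain NumPy illustration of that scheme with an evaluation counter; `forward_difference_gradient` and `counter` are hypothetical names, and the library's own routine may group or batch the evaluations differently.

```python
import numpy as np


def forward_difference_gradient(f, x_center, epsilon=0.01):
    """Approximate grad f(x_center) using N + 1 function evaluations."""
    x_center = np.asarray(x_center, dtype=float)
    f_center = f(x_center)  # 1 evaluation at the center
    grad = np.zeros_like(x_center)
    for i in range(x_center.size):  # N evaluations, one per coordinate
        shifted = x_center.copy()
        shifted[i] += epsilon
        grad[i] = (f(shifted) - f_center) / epsilon
    return grad


counter = {"calls": 0}


def f(x):
    counter["calls"] += 1
    return (np.linalg.norm(x) - 1) ** 2


x = np.array([1.0, 0.5, -0.2])
approx = forward_difference_gradient(f, x)
exact = 2 * (np.linalg.norm(x) - 1) * x / np.linalg.norm(x)
print(counter["calls"])  # N + 1 = 4 evaluations for a 3-dimensional point
print(np.max(np.abs(approx - exact)))
```

The approximation error shrinks with `epsilon`, which is the trade-off controlled by the perturbation setting.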
- static gradient_num_diff(x_center, f, epsilon, max_evals_grouped=None)
Computes the gradient around the point x_center by numeric differentiation, evaluating the required function points in parallel where possible.
- Parameters:
- Returns:
the gradient computed
- Return type:
grad
- minimize(fun, x0, jac=None, bounds=None)
Minimizes the function.
For well-behaved functions, the user can call this method to minimize a function. If the user wants more control over how the function is evaluated, a custom loop can be created using ask() and tell(), evaluating the function manually.
- Parameters:
- Returns:
Object containing the result of the optimization.
- Return type:
- print_options()
Print algorithm-specific options.
- set_max_evals_grouped(limit)
Set max evals grouped
- set_options(**kwargs)
Sets or updates values in the options dictionary.
The options dictionary may be used internally by a given optimizer to pass additional optional values for the underlying optimizer/optimization function used. The options dictionary may be initially populated with a set of key/values when the given optimizer is constructed.
- Parameters:
kwargs (dict) – options, given as name=value.
- start(fun, x0, jac=None, bounds=None)
Populates the state of the optimizer with the data provided and sets all the counters to 0.
- step()
Performs one step in the optimization process.
This method composes ask(), evaluate(), and tell() to make a “step” in the optimization process.
- tell(ask_data, tell_data)
Updates x by an amount proportional to the learning rate and value of the gradient at that point.
- Parameters:
- Raises:
ValueError – If the gradient passed doesn’t have the right dimension.