# Copyright 2014-2019 The ODL contributors
#
# This file is part of ODL.
#
# This Source Code Form is subject to the terms of the Mozilla Public License,
# v. 2.0. If a copy of the MPL was not distributed with this file, You can
# obtain one at https://mozilla.org/MPL/2.0/.

"""Optimization methods based on a forward-backward splitting scheme."""

from __future__ import print_function, division, absolute_import

from odl.operator import Operator


__all__ = ('forward_backward_pd',)


def forward_backward_pd(x, f, g, L, h, tau, sigma, niter, callback=None,
                        **kwargs):
    r"""The forward-backward primal-dual splitting algorithm.

    The algorithm minimizes the sum of several convex functionals composed
    with linear operators::

        min_x f(x) + sum_i g_i(L_i x) + h(x)

    where ``f``, ``g_i`` are convex functionals, ``L_i`` are linear
    operators, and ``h`` is a convex and differentiable functional.

    The method can also be used to solve the more general problem::

        min_x f(x) + sum_i (g_i @ l_i)(L_i x) + h(x)

    where ``l_i`` are strongly convex functionals and @ is the infimal
    convolution::

        (g @ l)(x) = inf_y { g(y) + l(x - y) }

    Note that the strong convexity of ``l_i`` makes the convex conjugate
    ``l_i^*`` differentiable; see the Notes section for more information
    on this.

    Parameters
    ----------
    x : `LinearSpaceElement`
        Initial point, updated in-place.
    f : `Functional`
        The functional ``f``. Needs to have ``f.proximal``.
    g : sequence of `Functional`'s
        The functionals ``g_i``. Needs to have ``g_i.convex_conj.proximal``.
    L : sequence of `Operator`'s
        Sequence of linear operators ``L_i``, with as many elements as
        ``g``.
    h : `Functional`
        The functional ``h``. Needs to have ``h.gradient``.
    tau : float
        Step size-like parameter for ``f``.
    sigma : sequence of floats
        Sequence of step size-like parameters for the sequence ``g``.
    niter : int
        Number of iterations.
    callback : callable, optional
        Function called with the current iterate after each iteration.

    Other Parameters
    ----------------
    l : sequence of `Functional`'s, optional
        The functionals ``l_i``. Needs to have ``l_i.convex_conj.gradient``.
        If omitted, the simpler problem without ``l_i`` will be considered.

    Notes
    -----
    The mathematical problem to solve is

    .. math::
        \min_x f(x) + \sum_{i=1}^m (g_i \Box l_i)(L_i x) + h(x),

    where :math:`f`, :math:`g_i`, :math:`l_i` and :math:`h` are functionals
    and :math:`L_i` are linear operators. The infimal convolution
    :math:`g \Box l` is defined by

    .. math::
        (g \Box l)(x) = \inf_y \big( g(y) + l(x - y) \big).

    The exact conditions on the involved functionals are as follows:
    :math:`f` and :math:`g_i` are proper, convex and lower semicontinuous,
    and :math:`h` is convex and differentiable with
    :math:`\eta^{-1}`-Lipschitz continuous gradient, :math:`\eta > 0`.

    The optional operators :math:`\nabla l_i^*` need to be
    :math:`\nu_i`-Lipschitz continuous. Note that in the paper the condition
    is formulated as :math:`l_i` being proper, lower semicontinuous, and
    :math:`\nu_i^{-1}`-strongly convex, which implies that :math:`l_i^*`
    have :math:`\nu_i`-Lipschitz continuous gradients.

    If the optional operators :math:`\nabla l_i^*` are omitted, the simpler
    problem without :math:`l_i` will be considered. Mathematically, this is
    done by taking :math:`l_i` to be the functionals that are zero at the
    zero element and :math:`\infty` otherwise. This gives that
    :math:`l_i^*` are the zero functionals, and hence the corresponding
    gradients are the zero operators.

    To guarantee convergence, the parameters :math:`\tau`, :math:`\sigma`
    and :math:`L_i` need to satisfy

    .. math::
        2 \min \left\{ \frac{1}{\tau}, \frac{1}{\sigma_1}, \ldots,
        \frac{1}{\sigma_m} \right\}
        \cdot \min \{ \eta, \nu_1, \ldots, \nu_m \}
        \cdot \sqrt{1 - \tau \sum_{i=1}^m \sigma_i \|L_i\|^2} > 1,

    where, if the simpler problem is considered, all :math:`\nu_i` can be
    considered to be :math:`\infty`.

    For a reference on the forward-backward primal-dual algorithm, see
    [BC2015]. For more on proximal operators and algorithms, see [PB2014].
    See Also
    --------
    odl.solvers.nonsmooth.primal_dual_hybrid_gradient.pdhg :
        Solver for similar problems without differentiability in any of the
        terms.
    odl.solvers.nonsmooth.douglas_rachford.douglas_rachford_pd :
        Solver for similar problems without differentiability in any of the
        terms.

    References
    ----------
    [BC2015] Bot, R I, and Csetnek, E R. *On the convergence rate of
    a forward-backward type primal-dual splitting algorithm for convex
    optimization problems*. Optimization, 64.1 (2015), pp 5--23.

    [PB2014] Parikh, N, and Boyd, S. *Proximal Algorithms*. Foundations
    and Trends in Optimization, 1 (2014), pp 127--239.
    """
    # Problem size
    m = len(L)

    # Validate input
    if not all(isinstance(op, Operator) for op in L):
        raise ValueError('`L` not a sequence of operators')
    if not all(op.is_linear for op in L):
        raise ValueError('not all operators in `L` are linear')
    if not all(x in op.domain for op in L):
        raise ValueError('`x` not in the domain of all operators in `L`')
    if len(sigma) != m:
        raise ValueError('len(sigma) != len(L)')
    if len(g) != m:
        raise ValueError('len(g) != len(L)')

    # Extract proximal and gradient operators
    prox_cc_g = [gi.convex_conj.proximal for gi in g]
    grad_h = h.gradient
    prox_f = f.proximal

    l = kwargs.pop('l', None)
    if l is not None:
        if len(l) != m:
            raise ValueError('len(l) != len(L)')
        grad_cc_l = [li.convex_conj.gradient for li in l]
    if kwargs:
        raise TypeError('unexpected keyword argument: {}'.format(kwargs))

    # Pre-allocate values
    v = [Li.range.zero() for Li in L]
    y = x.space.zero()

    for k in range(niter):
        # Copy the current iterate; ``x`` is overwritten in-place below and
        # the old value is needed for the over-relaxation step.
        x_old = x.copy()

        # Primal step: forward (gradient) step on ``h`` and the dual
        # variables, followed by a backward (proximal) step on ``f``.
        tmp_1 = grad_h(x) + sum(Li.adjoint(vi) for Li, vi in zip(L, v))
        prox_f(tau)(x - tau * tmp_1, out=x)

        # Over-relaxation: y = 2 * x_new - x_old
        y.lincomb(2.0, x, -1, x_old)

        # Dual step for each g_i
        for i in range(m):
            if l is not None:
                # In this case gradients were given.
                tmp_2 = sigma[i] * (L[i](y) - grad_cc_l[i](v[i]))
            else:
                # In this case gradients were not given, so the gradient
                # step is omitted. For more details, see the documentation.
                tmp_2 = sigma[i] * L[i](y)

            prox_cc_g[i](sigma[i])(v[i] + tmp_2, out=v[i])

        if callback is not None:
            callback(x)
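

# The following is a minimal usage sketch, not part of the original module.
# It assumes the public ODL factories ``odl.uniform_discr``, ``odl.Gradient``,
# ``odl.phantom.shepp_logan``, ``odl.phantom.white_noise``,
# ``odl.solvers.ZeroFunctional``, ``odl.solvers.GroupL1Norm``,
# ``odl.solvers.L2NormSquared`` and ``odl.power_method_opnorm`` behave as in
# the ODL documentation; the regularization weight and step sizes are
# illustrative choices, not tuned values.
if __name__ == '__main__':
    import odl

    # TV-regularized denoising,  min_x ||x - data||_2^2 + 0.1 ||grad x||_1,
    # cast as f(x) + g(L x) + h(x) with f = 0, g = 0.1 * ||.||_1,
    # L = spatial gradient and h = ||. - data||_2^2.
    space = odl.uniform_discr([0, 0], [1, 1], [64, 64])
    data = odl.phantom.shepp_logan(space, modified=True)
    data += 0.1 * odl.phantom.white_noise(space)

    grad_op = odl.Gradient(space)
    f = odl.solvers.ZeroFunctional(space)
    g = [0.1 * odl.solvers.GroupL1Norm(grad_op.range)]
    h = odl.solvers.L2NormSquared(space).translated(data)

    # h has a 2-Lipschitz gradient (eta = 1/2). With tau = sigma =
    # 0.5 / ||grad_op||, the convergence condition in the docstring is
    # satisfied for this problem.
    op_norm = odl.power_method_opnorm(
        grad_op, xstart=odl.phantom.white_noise(space), maxiter=20)
    tau = sigma = 0.5 / op_norm

    x = space.zero()
    forward_backward_pd(x, f, g, [grad_op], h, tau, [sigma], niter=200,
                        callback=odl.solvers.CallbackPrintIteration())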