proximal_convex_conj_kl_cross_entropy
- odl.solvers.nonsmooth.proximal_operators.proximal_convex_conj_kl_cross_entropy(space, lam=1, g=None)
Proximal factory of the convex conjugate of the cross entropy KL divergence.
Function returning the proximal factory of the convex conjugate of the functional F, where F is the cross entropy Kullback-Leibler (KL) divergence given by:
F(x) = sum_i (x_i ln(pos(x_i)) - x_i ln(g_i) + g_i - x_i) + ind_P(x)
with x and g in the linear space X, and g non-negative. Here, pos denotes the nonnegative part, and ind_P is the indicator function for nonnegativity.
Parameters:
- space : TensorSpace
  Space X, which is the domain of the functional F.
- lam : positive float, optional
  Scaling factor.
- g : space element, optional
  Data term, positive. If None, it is taken as the one-element.
Returns:
- prox_factory : function
  Factory for the proximal operator to be initialized.
See also
- proximal_convex_conj_kl : proximal for a related functional
Notes
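The functional is given by the expression
$$F(x) = \sum_i \left( x_i \ln(\mathrm{pos}(x_i)) - x_i \ln(g_i) + g_i - x_i \right) + I_{x \geq 0}(x).$$
The indicator function $I_{x \geq 0}(x)$ is used to restrict the domain of $F$ such that $F$ is defined over the whole space $X$. The non-negativity thresholding $\mathrm{pos}$ is used to define $F$ in the real numbers.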
Note that the functional is not well-defined without a prior g. Hence, if g is omitted this will be interpreted as if g is equal to the one-element.
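As a minimal sketch of how the functional could be evaluated, the following plain-Python function works on lists of floats rather than ODL space elements. The name `kl_cross_entropy` is hypothetical and not part of ODL; the sketch assumes g is strictly positive and uses the convention 0 ln(0) = 0.

```python
import math


def kl_cross_entropy(x, g=None):
    """Evaluate F(x) = sum_i (x_i ln(x_i) - x_i ln(g_i) + g_i - x_i).

    Hypothetical helper for illustration only. Returns +inf if any x_i is
    negative (the indicator term). If ``g`` is None, it is taken as all
    ones, mirroring the one-element default described above.
    """
    if g is None:
        g = [1.0] * len(x)
    total = 0.0
    for xi, gi in zip(x, g):
        if xi < 0:
            return math.inf  # indicator function for nonnegativity
        # Convention 0 * ln(0) = 0, so the logarithmic term vanishes at 0.
        log_term = xi * math.log(xi / gi) if xi > 0 else 0.0
        total += log_term + gi - xi
    return total


g = [1.0, 2.0, 3.0]
print(kl_cross_entropy(g, g))          # 0.0, since F vanishes at x = g
print(kl_cross_entropy([0.0, 0.0]))    # 2.0, only the g_i terms remain
print(kl_cross_entropy([-1.0, 1.0]))   # inf, x leaves the nonnegative orthant
```

The example with `g` omitted shows the default in action: each component contributes g_i = 1 when x_i = 0.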
The convex conjugate $F^*$ of $F$ is
$$F^*(p) = \sum_i g_i \left( e^{p_i} - 1 \right),$$
where $p$ is the variable dual to $x$.
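The conjugate can be sanity-checked numerically. The sketch below assumes the closed form F^*(p) = sum_i g_i (exp(p_i) - 1), which follows from maximizing <p, x> - F(x): setting the derivative p_i - ln(x_i) + ln(g_i) to zero gives the maximizer x_i = g_i exp(p_i). Both function names are hypothetical helpers, not ODL functions.

```python
import math


def conj_closed_form(p, g):
    """Assumed closed form F^*(p) = sum_i g_i (exp(p_i) - 1)."""
    return sum(gi * (math.exp(pi) - 1.0) for pi, gi in zip(p, g))


def conj_by_maximizer(p, g):
    """Evaluate <p, x> - F(x) at the stationary point x_i = g_i exp(p_i)."""
    total = 0.0
    for pi, gi in zip(p, g):
        xi = gi * math.exp(pi)  # maximizer of p_i x_i - F_i(x_i), always > 0
        f_i = xi * math.log(xi / gi) + gi - xi
        total += pi * xi - f_i
    return total


p = [-1.0, 0.0, 0.5]
g = [1.0, 2.0, 3.0]
# The two evaluations should agree up to floating-point rounding.
assert math.isclose(conj_closed_form(p, g), conj_by_maximizer(p, g))
```

Because the maximizer is strictly positive for positive g, the indicator term never becomes active in this computation.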
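The proximal operator of the convex conjugate of $F$ is
$$\mathrm{prox}_{\sigma (\lambda F)^*}(x) = x - \lambda W\left( \frac{\sigma}{\lambda} g \, e^{x/\lambda} \right),$$
where $\sigma$ is the step size-like parameter, $\lambda$ is the weighting in front of the function $F$, and $W$ is the Lambert W function (see, for example, the Wikipedia article).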
For real-valued input $x$, the Lambert $W$ function is defined only for $x \geq -1/e$, and it has two branches for values $-1/e < x < 0$. However, for intended use cases, where $\lambda$ and $g$ are positive, the argument of $W$ will always be positive.
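To illustrate the proximal step, here is a stdlib-only sketch that assumes the operator acts componentwise as x - lam * W((sigma / lam) * g * exp(x / lam)). It is not ODL's implementation: the helper names are hypothetical, and the Lambert W function is approximated by a simple Newton iteration that only covers the nonnegative arguments arising here. In practice a library routine such as `scipy.special.lambertw` would likely be preferable.

```python
import math


def lambert_w0(z, tol=1e-14, max_iter=100):
    """Principal branch W(z) for z >= 0, via Newton's method on w e^w = z."""
    if z < 0:
        raise ValueError("this sketch only handles nonnegative arguments")
    w = math.log1p(z)  # starting guess at or above the root for z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def prox_cconj_kl_cross_entropy(x, g, sigma, lam=1.0):
    """Componentwise x - lam * W((sigma / lam) * g * exp(x / lam))."""
    return [xi - lam * lambert_w0(sigma / lam * gi * math.exp(xi / lam))
            for xi, gi in zip(x, g)]


x = [-1.0, 0.0, 2.0]
g = [1.0, 2.0, 3.0]
sigma, lam = 0.5, 2.0
y = prox_cconj_kl_cross_entropy(x, g, sigma, lam)

# First-order optimality condition of
#   min_y 0.5 * (y - x)^2 + sigma * lam * g * (exp(y / lam) - 1),
# which should hold componentwise up to rounding.
residuals = [yi - xi + sigma * gi * math.exp(yi / lam)
             for xi, yi, gi in zip(x, y, g)]
assert all(abs(r) < 1e-10 for r in residuals)
```

Since the argument of W is positive for positive sigma, lam and g, only the principal branch is needed, and every component of the result is no larger than the corresponding input.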
For background, see the Wikipedia article on Kullback-Leibler divergence. For further information about the functional, see for example this article.
The KL cross entropy functional $F$ described above is related to another functional that is also known as the KL divergence. That functional is often used as a data discrepancy term in inverse problems when the data is corrupted with Poisson noise. It is obtained by swapping the roles of the prior and the variable; see the See also section.