Search results
In probability and statistics, the Hellinger distance (closely related to, although different from, the Bhattacharyya distance) is used to quantify the similarity between two probability distributions. It is a type of f-divergence. The Hellinger distance is defined in terms of the Hellinger integral, which was introduced by Ernst Hellinger in 1909.
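For finite distributions the definition reduces to a short computation. The sketch below assumes the common normalization H(P, Q) = (1/√2) · ‖√P − √Q‖₂, which keeps the value in [0, 1]. The function name `hellinger` and the list-of-probabilities input format are illustrative choices, not part of the source.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Assumes p and q are sequences of probabilities over the same
    finite support, listed in the same order (illustrative convention).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Sum of squared differences between the square roots of the masses.
    squared = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    # The factor 1/2 under the root normalizes the distance to [0, 1].
    return math.sqrt(squared / 2.0)


if __name__ == "__main__":
    print(hellinger([0.5, 0.5], [0.5, 0.5]))  # identical distributions
    print(hellinger([1.0, 0.0], [0.0, 1.0]))  # disjoint supports
```

Under this normalization, identical distributions are at distance 0 and distributions with disjoint supports are at distance 1.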
Applying this theorem to KL-divergence yields the Donsker–Varadhan representation. Attempting to apply this theorem to the general $\alpha$-divergence with $\alpha \in (-\infty, 0) \cup (0, 1)$ does not yield a closed-form solution.
In mathematical statistics, the Kullback–Leibler (KL) divergence (also called relative entropy and I-divergence [1]), denoted $D_{\text{KL}}(P \parallel Q)$, is a type of statistical distance: a measure of how much a model probability distribution Q is different from a true probability distribution P.
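In the discrete case the divergence is the sum of P(x) · log(P(x) / Q(x)) over the support. The following is a minimal sketch under that assumption, using natural logarithms (so the result is in nats) and the usual conventions that terms with P(x) = 0 contribute nothing and that the divergence is infinite when Q(x) = 0 while P(x) > 0. The name `kl_divergence` is illustrative.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) in nats.

    Assumes p and q are probability sequences over the same finite
    support, in the same order (illustrative convention).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if qi == 0.0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    # The two directions generally differ: the divergence is not symmetric.
    print(kl_divergence(p, q), kl_divergence(q, p))
```

Because the two directions generally give different values, the divergence is not a metric, even though it is described as a statistical distance.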
In statistics, probability theory, and information theory, a statistical distance quantifies the distance between two statistical objects, which can be two random variables, or two probability distributions or samples, or the distance can be between an individual sample point and a population or a wider sample of points.
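The total variation distance (or half the norm) arises as the optimal transportation cost when the cost function is $c(x,y) = \mathbf{1}_{x \neq y}$, that is, $\tfrac{1}{2}\|P - Q\|_1 = \delta(P,Q) = \inf\{\mathbb{P}(X \neq Y) : \operatorname{Law}(X) = P, \operatorname{Law}(Y) = Q\} = \inf_{\pi} \operatorname{E}_{\pi}[\mathbf{1}_{x \neq y}]$, where the expectation is taken with respect to the probability measure $\pi$ on the space where $(x,y)$ lives, and the infimum is taken over all such $\pi$ with marginals $P$ and $Q$, respectively.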
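For finite distributions this identity can be checked numerically. The sketch below relies on the standard fact that a maximal coupling keeps mass min(P(x), Q(x)) on the diagonal x = y, so the smallest achievable probability of a mismatch is 1 − Σ min(P(x), Q(x)). Both function names are illustrative, and the code only evaluates the two sides of the identity; it does not construct the coupling itself.

```python
def total_variation(p, q):
    """Half the L1 distance between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def optimal_mismatch_probability(p, q):
    """Smallest P(X != Y) over couplings with marginals p and q.

    A maximal coupling places mass min(p_i, q_i) on each diagonal
    point, so the leftover mass is the optimal transport cost for
    the 0/1 cost function.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return 1.0 - sum(min(pi, qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.3, 0.5]
    # Both quantities should agree up to floating-point rounding.
    print(total_variation(p, q), optimal_mismatch_probability(p, q))
```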
The information geometry definition of divergence (the subject of this article) was initially referred to by alternative terms, including "quasi-distance" Amari (1982, p. 369) and "contrast function" Eguchi (1985), though "divergence" was used in Amari (1985) for the α-divergence, and has become standard for the general class.
Viewing the Kullback–Leibler divergence as a measure of distance, the I-projection is the "closest" distribution to q of all the distributions in P. The I-projection is useful in setting up information geometry, notably because of the following inequality, valid when P is convex: [1]
By Chentsov’s theorem, the Fisher information metric on statistical models is the only Riemannian metric (up to rescaling) that is invariant under sufficient statistics. [3] [4] It can also be understood to be the infinitesimal form of the relative entropy (i.e., the Kullback–Leibler divergence); specifically, it is the Hessian of