The proof of (4) is due to Csiszár [2] and Kullback [3], with Kemperman [4] obtaining it independently a little later.

1.1 Total Variation Distance

To measure the difference between two distributions, we compute a quantity called the total variation distance between them. In order to prove convergence to stationary distributions, for example, we require exactly such a notion of distance, and the total variation distance is the classical choice.

Definition. For probability measures $\mu$ and $\nu$ on the same measurable space $(\Omega, \mathcal{A})$, the total variation distance between $\mu$ and $\nu$ is defined as
$$ d_{TV}(\mu, \nu) = \|\mu - \nu\|_{TV} := \sup_{A \in \mathcal{A}} |\mu(A) - \nu(A)|. $$

Properties.
i. $\|\cdot\|_{TV}$ is a metric on the set of probability measures on $\Omega$.
ii. It ranges in $[0, 1]$.
iii. Equivalently, $d_{TV}(\mu, \nu) = \sup_{f \in \mathcal{D}} \left| \int f \, d\mu - \int f \, d\nu \right|$, where $\mathcal{D} = \{\mathbf{1}_A : A \in \mathcal{A}\}$.

Note that the total variation of any single probability measure is exactly one, so the total variation norm by itself is not interesting as a means of investigating such measures; what is informative is the distance between two of them. Indeed, for probability measures $\mu$ and $\nu$, the total variation distance is, up to the factor $\tfrac{1}{2}$, the total variation norm of the signed measure $\mu - \nu$ (see Section 2.2 in [T2008], pp. 79-80). Clearly, the total variation distance is not restricted to discrete distributions: if $P$ and $Q$ have densities $p$ and $q$ with respect to some dominating measure $\lambda$ (e.g., Lebesgue measure on $\mathbb{R}^d$), then
$$ d_{TV}(P, Q) = \frac{1}{2} \int |p - q| \, d\lambda. $$
In the discrete case this is half the $\ell_1$-distance between the probability vectors, $\|P - Q\|_1 = \sum_{i \in [n]} |p_i - q_i|$; the halved quantity is denoted $\Delta(P, Q)$ (and sometimes $\|P - Q\|_{TV}$). To compute the total variation distance, take the difference between the two proportions in each category, add up the absolute values of all the differences, and then divide the sum by 2.

The total variation distance (or half the $\ell_1$ norm) also arises as the optimal transportation cost for the cost function $c(x, y) = \mathbf{1}\{x \neq y\}$, that is,
$$ \|\mu - \nu\|_{TV} = \inf_{(X, Y)} \mathbb{P}[X \neq Y], $$
where the infimum is over all couplings of $\mu$ and $\nu$, i.e., over all pairs $(X, Y)$ defined on a common probability space with $X \sim \mu$ and $Y \sim \nu$. The sketch below illustrates the equivalence between the supremum over events and the half-$\ell_1$ formula.
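As a quick sanity check, here is a minimal sketch in Python (the distributions `P` and `Q` are arbitrary illustrative choices, and NumPy is assumed available) verifying, on a small finite space, that the supremum over events in the definition coincides with half the $\ell_1$-distance.

```python
from itertools import chain, combinations

import numpy as np

# Two arbitrary illustrative distributions on a 4-point space.
P = np.array([0.10, 0.40, 0.30, 0.20])
Q = np.array([0.25, 0.25, 0.25, 0.25])
n = len(P)

def all_events(n):
    """Every subset A of {0, ..., n-1}."""
    return chain.from_iterable(combinations(range(n), k) for k in range(n + 1))

# sup_A |P(A) - Q(A)|, by brute force over all 2^n events.
tv_sup = max(abs(P[list(A)].sum() - Q[list(A)].sum()) for A in all_events(n))

# Half the l1-distance between the probability vectors.
tv_l1 = 0.5 * np.abs(P - Q).sum()

print(tv_sup, tv_l1)             # both equal 0.2 here
assert np.isclose(tv_sup, tv_l1)
```

Brute force over all $2^n$ events is feasible only for tiny supports, which is exactly why the half-$\ell_1$ identity is the formula used in practice.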

Markov chains. To quantify convergence to a stationary distribution $\pi$, set $d(t) := \max_x \|P^t(x, \cdot) - \pi\|_{TV}$ and $\bar{d}(t) := \max_{x, y} \|P^t(x, \cdot) - P^t(y, \cdot)\|_{TV}$, where $P^t(x, \cdot)$ is the law of the chain after $t$ steps started from $x$. Then
$$ d(t) \le \bar{d}(t) \le 2\, d(t). $$
Proof: $\bar{d}(t) \le 2 d(t)$ is immediate from the triangle inequality for the total variation distance. For the proof of $d(t) \le \bar{d}(t)$, note that $\pi = \sum_y \pi(y) P^t(y, \cdot)$ by stationarity, so that $\|P^t(x, \cdot) - \pi\|_{TV} \le \sum_y \pi(y)\, \|P^t(x, \cdot) - P^t(y, \cdot)\|_{TV} \le \bar{d}(t)$.

Relation to relative entropy. Pinsker's inequality bounds the total variation distance by the relative entropy:
$$ \|P - Q\|_{TV} \le \sqrt{\tfrac{1}{2} D(P \| Q)}, $$
with $D(P \| Q)$ measured in nats. (With $\beta_1$ defined in (33), bounds of this type on $\tfrac{1}{2}|P - Q|_1$ are given in (39)-(40); the proofs rest on convexity properties of the function $z \log z$.) In the other direction, the total variation distance cannot control $D(P \| Q)$ in general, but when the relative information is bounded, small total variation distance does imply small relative entropy.

Hypothesis testing. The total variation distance has properties that will be familiar to students of the Neyman-Pearson approach to hypothesis testing: the best test distinguishing $P$ from $Q$ on a single observation succeeds with probability $\tfrac{1}{2}(1 + \|P - Q\|_{TV})$.

Poisson distributions. Some upper bounds for the total variation distance between two Poisson distributions with different means are the following: for $t, x > 0$,
$$ d_{TV}\big(\mathrm{Po}(t + x),\, \mathrm{Po}(t)\big) \;\le\; \min\Big\{ 1 - e^{-x},\ \sqrt{\tfrac{2}{e}}\,\big(\sqrt{t + x} - \sqrt{t}\,\big) \Big\} \;\le\; \min\Big\{ x,\ \sqrt{\tfrac{2}{e}}\,\big(\sqrt{t + x} - \sqrt{t}\,\big) \Big\}. $$
The bound $1 - e^{-x}$ follows from the coupling characterization above: realizing both laws through a single Poisson process $N$, we have $N(t + x) = N(t) + M$ with $M = N(t + x) - N(t) \sim \mathrm{Po}(x)$ independent of $N(t)$, hence $d_{TV} \le \mathbb{P}[M \neq 0] = 1 - e^{-x}$.

Computability. Unfortunately, the threshold problem for the total variation distance is proven to be NP-hard in the case of Markov chains (MCs) [15,9], and to the best of our knowledge, its decidability is still an open problem. A numerical check of the Pinsker and Poisson bounds follows.
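As a sanity check on the last two displays, here is a minimal sketch in Python (NumPy and SciPy assumed available; the parameters `t` and `x` are arbitrary illustrative choices) that evaluates $d_{TV}(\mathrm{Po}(t+x), \mathrm{Po}(t))$ via the half-$\ell_1$ formula, truncating the infinite support, and compares it against both bounds.

```python
import numpy as np
from scipy.stats import poisson

def tv_poisson(lam1, lam2, support=1000):
    """d_TV(Po(lam1), Po(lam2)): half the l1-distance of the pmfs,
    truncated at `support` (the neglected tail mass is negligible here)."""
    k = np.arange(support)
    return 0.5 * np.abs(poisson.pmf(k, lam1) - poisson.pmf(k, lam2)).sum()

def kl_poisson(lam1, lam2):
    """D(Po(lam1) || Po(lam2)) in nats, via the closed form
    lam2 - lam1 + lam1 * log(lam1 / lam2)."""
    return lam2 - lam1 + lam1 * np.log(lam1 / lam2)

t, x = 5.0, 2.0                      # arbitrary illustrative parameters
tv = tv_poisson(t + x, t)

# Pinsker's inequality: d_TV <= sqrt(D/2), with D in nats.
pinsker = np.sqrt(0.5 * kl_poisson(t + x, t))

# Poisson-specific bound: min{1 - e^{-x}, sqrt(2/e) (sqrt(t+x) - sqrt(t))}.
poisson_bound = min(1 - np.exp(-x),
                    np.sqrt(2 / np.e) * (np.sqrt(t + x) - np.sqrt(t)))

print(f"d_TV          = {tv:.4f}")             # ~0.315
print(f"Pinsker bound = {pinsker:.4f}")        # ~0.422
print(f"Poisson bound = {poisson_bound:.4f}")  # ~0.351
assert tv <= pinsker and tv <= poisson_bound
```

For these parameters the Poisson-specific bound is tighter than Pinsker's, as expected from the mean gap being small relative to the means themselves.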