Gaussian tail bounds

Before we get to Gaussian distributions and their tail bounds, we will go over some basic probability inequalities. In this section we start with Markov's and Chebyshev's inequalities.

Notation: \(P[\cdot]\) denotes the probability of an event and \(p(\cdot)\) denotes the probability density function (PDF).
Markov's inequality: for a nonnegative random variable \(X\) and any \(t > 0\), \(P[X \geq t] \leq \dfrac{\mathrm{E}[X]}{t}\)
Proof: \[\begin{align*} \mathrm{E}[X] &= \int_{0}^{\infty} xp(x) dx\\ &= \int_{0}^{t} xp(x) dx + \int_{t}^{\infty} xp(x)dx\\ &\geq \int_{t}^{\infty} xp(x)dx\\ &\geq t\int_{t}^{\infty} p(x)dx\\ &= tP[X \geq t]\\ \end{align*}\] so that \[P[X \geq t] \leq \dfrac{\mathrm{E}[X]}{t}\]
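As a quick aside (not part of the derivation), here is a minimal numerical sanity check of Markov's inequality; the choice of an Exponential(1) variable is just an arbitrary illustrative nonnegative distribution:

```python
import numpy as np

# Sanity check of Markov's inequality, P[X >= t] <= E[X]/t,
# using an Exponential(1) random variable (nonnegative, E[X] = 1).
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=1_000_000)

for t in [1.0, 2.0, 5.0]:
    empirical = np.mean(samples >= t)   # estimate of P[X >= t]
    bound = samples.mean() / t          # Markov bound E[X]/t
    print(f"t={t}: P[X >= t] ~= {empirical:.4f}, Markov bound = {bound:.4f}")
```

The empirical tail probability always sits below the Markov bound, though the bound is usually quite loose.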
Chebyshev's inequality: \(P[|X - \mu| \geq t] \leq \dfrac{\mathrm{var}[X]}{t^2}\), where \(\mu = \mathrm{E}[X]\) and \(t > 0\).
Proof: \[\begin{align*} \sigma^2 &= \mathrm{var}[X]\\ &= \int_{-\infty}^{\infty} (x - \mu)^2p(x) dx\\ &\geq \int_{-\infty}^{\mu - t} (x-\mu)^2p(x) dx + \int_{\mu + t}^{\infty} (x-\mu)^2p(x)dx\\ &\geq t^2\int_{-\infty}^{\mu - t} p(x) dx + t^2\int_{\mu + t}^{\infty} p(x)dx\\ &= t^2 P[X \leq \mu - t\; \mathrm{or}\; X \geq \mu + t]\\ &= t^2 P[|X - \mu| \geq t]\\ \end{align*}\] so that \[P[|X - \mu| \geq t] \leq \dfrac{\mathrm{var}[X]}{t^2}\] Both of the bounds above are sharp: without further assumptions on the distribution, they cannot be improved. The same technique extends to any integer \(k \geq 1\), giving \[P[|X - \mu| \geq t] \leq \dfrac{\mathrm{E}[|X - \mu|^k]}{t^k}\] Next, we will use moment generating functions to prove another useful bound known as the Chernoff bound. Suppose that the random variable \(X\) has a moment generating function in a neighborhood of zero, meaning that there is some constant \(b > 0\) such that \(\phi(\lambda) = \mathrm{E}[e^{\lambda(X−\mu)}]\) exists for all \(|\lambda| \leq b\). For any \(0 < \lambda \leq b\), Markov's inequality gives \[P[X-\mu \geq t] = P[e^{\lambda(X-\mu)} \geq e^{\lambda t}] \leq \dfrac{\mathrm{E}[e^{\lambda(X-\mu)}]}{e^{\lambda t}}\] Optimizing over \(\lambda\) yields the Chernoff bound \[\log P[X - \mu \geq t] \leq -\sup_{\lambda \in [0, b]} \{ \lambda t - \log \mathrm{E}[e^{\lambda(X - \mu)}]\} \]
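In the same spirit, a small sketch checking Chebyshev's inequality on Uniform(0, 1) samples (again just an arbitrary illustrative choice of distribution):

```python
import numpy as np

# Sanity check of Chebyshev's inequality, P[|X - mu| >= t] <= var(X)/t^2,
# using a Uniform(0, 1) random variable (mu = 1/2, var = 1/12).
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=1_000_000)
mu, var = samples.mean(), samples.var()

for t in [0.2, 0.3, 0.4]:
    empirical = np.mean(np.abs(samples - mu) >= t)  # estimate of P[|X - mu| >= t]
    bound = var / t**2                              # Chebyshev bound
    print(f"t={t}: P[|X - mu| >= t] ~= {empirical:.4f}, Chebyshev bound = {bound:.4f}")
```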

Gaussian tail

Now that we have enough context, we can proceed to our primary objective. Deriving the Gaussian tail bound leans heavily on the inequalities above. Let \(X \sim N(\mu, \sigma^2)\). Its moment generating function is \[\mathrm{E}[e^{\lambda X}] = e^{\mu \lambda + \frac{\sigma^2 \lambda^2}{2}} \] so that \(\log \mathrm{E}[e^{\lambda(X - \mu)}] = \frac{\sigma^2 \lambda^2}{2}\). Since this exists for every \(\lambda \in \mathbb{R}\), we can take the supremum in the Chernoff bound over all \(\lambda \geq 0\): \[\begin{align*} \log P[X - \mu \geq t] &\leq -\sup_{\lambda \geq 0} \{ \lambda t - \log \mathrm{E}[e^{\lambda(X - \mu)}]\}\\ &= -\sup_{\lambda \geq 0} \{ \lambda t - \frac{\sigma^2 \lambda^2}{2}\} \\ \end{align*}\] Differentiating the expression inside the supremum and setting the derivative to zero gives \(\lambda = \dfrac{t}{\sigma^2}\), which maximizes it. Substituting this value of \(\lambda\) into the above equation, \[\begin{align*} \log P[X - \mu \geq t] &\leq -\frac{t^2}{2\sigma^2} \end{align*}\] \[P[X - \mu \geq t] \leq e^{-\frac{t^2}{2\sigma^2}}\] This bound is tight up to a polynomial factor in \(t\). That brings us to the end of this post. These bounds are quite useful in understanding similar bounds for sub-Gaussian random variables, which I hope to cover in a later post.
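To see the "tight up to a polynomial factor" remark concretely, here is a small sketch (using SciPy's normal survival function) comparing the bound with the exact Gaussian tail:

```python
import numpy as np
from scipy.stats import norm

# Compare the Chernoff-based Gaussian tail bound exp(-t^2 / (2 sigma^2))
# with the exact tail P[X - mu >= t] for X ~ N(mu, sigma^2).
mu, sigma = 0.0, 1.0
for t in [1.0, 2.0, 3.0, 4.0]:
    exact = norm.sf(t, loc=mu, scale=sigma)    # exact tail probability
    chernoff = np.exp(-t**2 / (2 * sigma**2))  # bound derived above
    print(f"t={t}: exact tail = {exact:.3e}, Chernoff bound = {chernoff:.3e}, "
          f"ratio = {chernoff / exact:.2f}")
```

The ratio grows only roughly linearly in \(t\), consistent with the exact standard normal tail behaving like \(\frac{e^{-t^2/2}}{t}\) up to constants for large \(t\).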

References