Gaussian tail bounds
Before we get to Gaussian distributions and their tail bounds, we will go over some basic inequalities from probability theory. In this section we start off with Markov's and Chebyshev's inequalities.
Notation: \(P[\cdot]\) denotes probability and \(p(\cdot)\) denotes the probability density function (PDF).
Markov's Inequality: for a nonnegative random variable \(X\) and any \(t > 0\), \(P[X \geq t] \leq \dfrac{\mathrm{E}[X]}{t}\)
Proof: \[\begin{align*} \mathrm{E}[X] &= \int_{-\infty}^{\infty} xp(x) dx\\
&= \int_{-\infty}^{t} xp(x) dx + \int_{t}^{\infty} xp(x)dx\\
&\geq \int_{t}^{\infty} xp(x)dx \quad (\text{since } X \geq 0)\\
&\geq t\int_{t}^{\infty} p(x)dx \quad (\text{since } x \geq t \text{ on this domain})\\
&= tP[X \geq t]\\ \end{align*}\]
Rearranging gives
\[P[X \geq t] \leq \dfrac{\mathrm{E}[X]}{t}\]
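As a quick sanity check, here is a minimal Monte Carlo sketch (assuming Python with NumPy; the exponential distribution and sample size are illustrative choices, not from the post) comparing the empirical tail probability against the Markov bound:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)  # nonnegative, E[X] = 1

for t in [1, 2, 5, 10]:
    empirical = np.mean(x >= t)  # P[X >= t] estimated from samples
    markov = x.mean() / t        # E[X] / t
    print(f"t={t}: empirical={empirical:.4f}, Markov bound={markov:.4f}")
```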
Chebyshev's Inequality: for any \(t > 0\), \(P[|X - \mu| \geq t] \leq \dfrac{\mathrm{var}[X]}{t^2}\), where \(\mu = \mathrm{E}[X]\)
Proof: \[\begin{align*} \sigma^2 &= \mathrm{var}[X]\\
&= \int_{-\infty}^{\infty} (x - \mu)^2p(x) dx\\
&\geq \int_{-\infty}^{\mu - t} (x-\mu)^2p(x) dx + \int_{\mu + t}^{\infty} (x-\mu)^2p(x)dx\\
&\geq t^2\int_{-\infty}^{\mu - t} p(x) dx + t^2\int_{\mu + t}^{\infty} p(x)dx\\
&= t^2 P[X \leq \mu - t\; \mathrm{or}\; X \geq \mu + t]\\
&= t^2 P[|X - \mu| \geq t]\\ \end{align*}\]
\[P[|X - \mu| \geq t] \leq \dfrac{\mathrm{var}[X]}{t^2}\]
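The same kind of check works here; a sketch (again assuming NumPy, with a standard normal chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1_000_000)  # mu = 0, var = 1
mu, var = x.mean(), x.var()

for t in [1, 2, 3]:
    empirical = np.mean(np.abs(x - mu) >= t)  # P[|X - mu| >= t]
    chebyshev = var / t**2                    # var[X] / t^2
    print(f"t={t}: empirical={empirical:.4f}, Chebyshev bound={chebyshev:.4f}")
```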
Both bounds above are sharp in general and cannot be improved without further assumptions. The same technique extends to any integer \(k \geq 1\), giving the following result:
\[P[|X - \mu| \geq t] \leq \dfrac{\mathrm{E}[|X - \mu|^k]}{t^k}\]
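Higher moments can give noticeably tighter bounds in the tail. A quick sketch (assuming NumPy; the standard normal and the values of \(k\) are illustrative) comparing the moment bounds at a fixed \(t\):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1_000_000)
mu, t = x.mean(), 3.0

empirical = np.mean(np.abs(x - mu) >= t)
print(f"empirical P[|X - mu| >= {t}]: {empirical:.5f}")
for k in [1, 2, 4, 8]:
    bound = np.mean(np.abs(x - mu) ** k) / t**k  # E[|X - mu|^k] / t^k
    print(f"k={k}: moment bound = {bound:.5f}")
```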
Next, we will use moment generating functions to prove another useful bound known as the Chernoff bound.
Suppose that the random variable \(X\) has a moment generating function in a neighborhood of zero, meaning that there is some constant \(b > 0\) such that the function \(\phi(\lambda) = \mathrm{E}[e^{\lambda(X-\mu)}]\) exists for all \(|\lambda| \leq b\).
Using Markov's inequality, for any \(\lambda \in (0, b]\) we obtain
\[P[X-\mu \geq t] = P[e^{\lambda(X-\mu)} \geq e^{\lambda t}] \leq \dfrac{\mathrm{E}[e^{\lambda(X-\mu)}]}{e^{\lambda t}}\]
Optimizing over \(\lambda\) yields the Chernoff bound
\[\log P[X - \mu \geq t] \leq -\sup_{\lambda \in [0, b]} \{
\lambda t - \log \mathrm{E}[e^{\lambda(X - \mu)}]\} \]
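The supremum can also be evaluated numerically. A minimal sketch (assuming SciPy; the Bernoulli variable and the search range for \(\lambda\) are illustrative choices) that optimizes the exponent directly:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Chernoff exponent for a Bernoulli(p) variable (illustrative choice):
# E[exp(lambda * (X - mu))] has a closed form, with mu = p.
p, t = 0.5, 0.3

def neg_exponent(lam):
    mgf = p * np.exp(lam * (1 - p)) + (1 - p) * np.exp(-lam * p)
    return -(lam * t - np.log(mgf))  # negated: minimizing this maximizes the exponent

res = minimize_scalar(neg_exponent, bounds=(0.0, 50.0), method="bounded")
print(f"optimal lambda = {res.x:.4f}")
print(f"Chernoff bound on P[X - mu >= {t}]: {np.exp(res.fun):.6f}")
```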
Gaussian tail
Now that we have enough context, we can proceed to our main objective. To obtain the Gaussian tail bound we will leverage the inequalities above. Let us consider a random variable \(X \sim N(\mu, \sigma^2)\). Its moment generating function is
\[\mathrm{E}[e^{\lambda X}] = e^{\mu \lambda + \frac{\sigma^2 \lambda^2}{2}} \]
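For completeness, this follows by completing the square inside the Gaussian integral:
\[\begin{align*} \mathrm{E}[e^{\lambda X}] &= \int_{-\infty}^{\infty} e^{\lambda x} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} dx\\
&= e^{\mu \lambda + \frac{\sigma^2 \lambda^2}{2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu - \sigma^2 \lambda)^2}{2\sigma^2}} dx\\
&= e^{\mu \lambda + \frac{\sigma^2 \lambda^2}{2}} \end{align*}\]
where the last step uses the fact that the remaining integral is that of a Gaussian density, which integrates to one.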
Substituting this result into the Chernoff bound, and noting that \(\mathrm{E}[e^{\lambda(X - \mu)}] = e^{\frac{\sigma^2 \lambda^2}{2}}\) (for a Gaussian the MGF exists for all \(\lambda\), so we may take the supremum over \(\lambda \geq 0\)), we get
\[\begin{align*} \log P[X - \mu \geq t] &\leq -\sup_{\lambda \geq 0} \{
\lambda t - \log e^{\frac{\sigma^2 \lambda^2}{2}}\}\\
&= -\sup_{\lambda \geq 0} \{
\lambda t - \frac{\sigma^2 \lambda^2}{2}\} \\ \end{align*}\]
Differentiating the expression inside the supremum and setting the derivative to zero gives \(\lambda = \dfrac{t}{\sigma^2}\), which maximizes it. Substituting this value of \(\lambda\) into the equation above,
\[\log P[X - \mu \geq t] \leq -\frac{t^2}{2\sigma^2}\]
\[P[X - \mu \geq t] \leq e^{-\frac{t^2}{2\sigma^2}}\]
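As a final check, a short sketch (assuming SciPy; \(\mu\), \(\sigma\), and the grid of \(t\) values are arbitrary) comparing this bound against the exact Gaussian tail probability:

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 0.0, 1.0
for t in [0.5, 1.0, 2.0, 3.0]:
    exact = norm.sf(mu + t, loc=mu, scale=sigma)  # exact P[X - mu >= t]
    bound = np.exp(-t**2 / (2 * sigma**2))        # Chernoff bound
    print(f"t={t}: exact={exact:.5f}, bound={bound:.5f}")
```

The gap between the two columns reflects the polynomial factor mentioned below: the bound matches the exponential decay of the true tail but not its polynomial prefactor.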
The above bound is tight up to a polynomial factor. Okay, we have reached the end of this post. These bounds are quite useful for understanding many similar bounds for sub-Gaussian random variables, which I will hopefully cover in a later post.