Markov’s Inequality, Chebyshev’s Inequality & Chernoff Bound
In this section, we will understand Markov’s Inequality, Chebyshev’s Inequality and the Chernoff Bound.
📘
Markov’s Inequality:
It gives an upper bound on the probability that a non-negative random variable exceeds a given value, based on its expected value.
\[
P(X \ge a) \le \frac{E[X]}{a}, \quad a > 0
\]
Note: It gives a very loose upper bound.
💡 A restaurant, on average, expects to serve 50 customers per hour.
What is the probability that the restaurant will serve 200 or more customers in the next hour?
\[
P(X \ge 200) \le \frac{E[X]}{200} = \frac{50}{200} = 0.25
\]
Hence, there is at most a 25% chance of serving 200 or more customers.
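The bound can be checked numerically. Below is a minimal sketch that compares the Markov bound with a simulated tail probability, assuming (purely for illustration; the text does not specify a distribution) that the hourly customer count is Poisson-distributed with mean 50. The simulation only shows how loose the bound can be under one particular model.
```python
import numpy as np

# Assumption (illustrative only, not from the text):
# the hourly customer count X follows a Poisson distribution with mean 50.
rng = np.random.default_rng(0)
mean_customers, threshold = 50, 200

# Markov bound: P(X >= 200) <= E[X] / 200
markov_bound = mean_customers / threshold
print(f"Markov upper bound:    {markov_bound:.2f}")  # 0.25

# Simulated tail probability under the assumed Poisson model
samples = rng.poisson(mean_customers, size=1_000_000)
print(f"Simulated P(X >= 200): {np.mean(samples >= threshold):.2e}")  # essentially 0
```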
💡 Consider a test where the average score is 70/100 marks.
What is the probability that a randomly selected student gets a score of 90 marks or more?
\[
P(X \ge a) \le \frac{E[X]}{a} \\[10pt]
P(X \ge 90) \le \frac{70}{90} \approx 0.78
\]
Hence, there is at most a 78% chance (approximately) that a randomly selected student gets a score of 90 marks or more.
📘
Chebyshev’s Inequality:
It bounds the probability that a random variable deviates from its mean by a given amount; this probability is small when the variance is small.
- It is a sharper version of Markov’s Inequality.
- It uses the variance of the distribution in addition to the expected value (mean), i.e., more information about the data.
- Also, it does NOT assume the random variable to be non-negative.
\[
P(\big|X - E[X]\big| \ge k) \le \frac{E[(X - E[X])^2]}{k^2}, \quad k > 0 \\[10pt]
\text{We know that: } Var[X] = E[(X - E[X])^2] \\[10pt]
\Rightarrow P(\big|X - E[X]\big| \ge k) \le \frac{Var[X]}{k^2}
\]
Note: It typically gives a much tighter upper bound than Markov’s Inequality.
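Chebyshev’s Inequality is, in fact, Markov’s Inequality applied to the non-negative random variable \((X - E[X])^2\):
\[
P(\big|X - E[X]\big| \ge k) = P\big((X - E[X])^2 \ge k^2\big) \le \frac{E[(X - E[X])^2]}{k^2} = \frac{Var[X]}{k^2}
\]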
💡 Consider a test where the average score is 70/100 marks.
What is the probability that a randomly selected student gets a score of 90 marks or more?
Given: the standard deviation of the test score is \(\sigma = 10\) marks, so the variance is \(Var[X] = \sigma^2 = 100\).
\[
P(\big|X - E[X]\big| \ge k) \le \frac{Var[X]}{k^2} \\[10pt]
E[X] = 70, Var[X] = 100 \\[10pt]
P(X \ge 90) \le P(\big|X - 70\big| \ge 20) \\[10pt]
P(\big|X - 70\big| \ge 20) \le \frac{100}{20^2} = \frac{1}{4} = 0.25
\]
Hence, Chebyshev’s Inequality gives a far tighter upper bound of 25%, compared with the approximately 78% bound from Markov’s Inequality.
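The two bounds for this example can be reproduced with a few lines of Python; this is just a direct transcription of the arithmetic above and involves no distributional assumptions:
```python
# Test-score example: mean 70, standard deviation 10, threshold 90.
mean_score, std_dev, threshold = 70, 10, 90
variance = std_dev ** 2

# Markov: P(X >= 90) <= E[X] / 90
markov_bound = mean_score / threshold

# Chebyshev: P(X >= 90) <= P(|X - 70| >= 20) <= Var[X] / 20^2
k = threshold - mean_score
chebyshev_bound = variance / k ** 2

print(f"Markov bound:    {markov_bound:.4f}")     # ~0.7778
print(f"Chebyshev bound: {chebyshev_bound:.4f}")  # 0.2500
```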
📘
Chernoff Bound:
It is an upper bound on the probability that a random variable deviates from its expected value.
It’s an exponentially decreasing bound on the “tail” of a random variable’s distribution,
which can be calculated using its moment generating function.
- It is used for sum or average of independent random variables (not necessarily identically distributed).
- It provides exponentially decaying bounds, much tighter than the quadratic decay of Chebyshev’s Inequality.
- It uses all moments of the distribution, via the moment generating function (MGF), to capture its full shape.
\[
P(X \ge c) \le e^{-tc}E[e^{tX}], \quad \forall t > 0 \\[10pt]
\text{ where } E[e^{tX}] \text{ is the Moment Generating Function of } X
\]
Proof:
\[
P(X \ge c) = P(e^{tX} \ge e^{tc}), \text{ provided } t > 0 \\[10pt]
\text{Since } e^{tX} \text{ is non-negative, Markov's Inequality gives: } \\[10pt]
P(e^{tX} \ge e^{tc}) \le \frac{E[e^{tX}]}{e^{tc}} = e^{-tc}E[e^{tX}]
\]
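Since the bound holds for every \(t > 0\), the tightest Chernoff bound is obtained by minimizing the right-hand side over \(t\):
\[
P(X \ge c) \le \min_{t > 0} e^{-tc}E[e^{tX}]
\]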
For the sum \(S_n = X_1 + X_2 + \dots + X_n\) of \(n\) independent and identically distributed random variables,
\[
P(S_n \ge c) \le e^{-tc}\big(M_X(t)\big)^n \\[10pt]
\text{where } M_X(t) = E[e^{tX}] \text{ is the Moment Generating Function of each } X_i
\]
For independent but not identically distributed variables, \((M_X(t))^n\) is replaced by the product of the individual MGFs, \(\prod_{i=1}^{n} M_{X_i}(t)\).
Note: It is used to bound how far the sum of independent random variables deviates from its expected value.
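As a minimal sketch of how the bound is used in practice (assuming, since the text leaves the distribution unspecified, that \(S_n\) is a sum of \(n\) i.i.d. Bernoulli(\(p\)) variables), the snippet below minimizes the Chernoff bound over \(t\) with a simple grid search and compares it with the Markov and Chebyshev bounds for the same tail event:
```python
import numpy as np

# Assumed setup (not taken from the text): S_n = X_1 + ... + X_n with X_i i.i.d. Bernoulli(p).
n, p = 100, 0.5
c = 70                                   # tail threshold: bound P(S_n >= c)
mean, var = n * p, n * p * (1 - p)

# MGF of a single Bernoulli(p): M_X(t) = 1 - p + p * e^t
# Chernoff bound: P(S_n >= c) <= min_{t>0} e^{-t c} * (M_X(t))^n.
# A simple grid search over t stands in for an exact minimization.
ts = np.linspace(1e-4, 5, 10_000)
log_bounds = -ts * c + n * np.log(1 - p + p * np.exp(ts))
chernoff_bound = np.exp(log_bounds.min())

markov_bound = mean / c                  # Markov: E[S_n] / c
chebyshev_bound = var / (c - mean) ** 2  # Chebyshev on |S_n - mean| >= c - mean

print(f"Markov    bound: {markov_bound:.4f}")     # ~0.7143
print(f"Chebyshev bound: {chebyshev_bound:.4f}")  # 0.0625
print(f"Chernoff  bound: {chernoff_bound:.2e}")   # ~2.7e-04, exponentially smaller
```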
End of Section