
Markov’s Inequality, Chebyshev’s Inequality & Chernoff Bound

In this section, we will cover Markov’s Inequality, Chebyshev’s Inequality, and the Chernoff Bound.


📘

Markov’s Inequality:
It gives an upper bound on the probability that a non-negative random variable takes a value greater than or equal to some positive constant, in terms of its expected value.

\[ P(X \ge a) \le \frac{E[X]}{a} , \quad a > 0 \]

Note: It gives a very loose upper bound.
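
To make the bound concrete, here is a minimal Python sketch. The `markov_bound` helper and the Uniform(0, 100) test distribution are illustrative assumptions, not part of the original text; the point is just that the bound always holds but can be far from the true tail probability.

```python
import numpy as np

def markov_bound(mean, a):
    """Markov's Inequality: P(X >= a) <= E[X] / a, for non-negative X and a > 0."""
    return mean / a

# Quick sanity check on an assumed Uniform(0, 100) distribution (E[X] = 50).
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 100.0, size=1_000_000)

a = 80
print("empirical P(X >= 80):", np.mean(samples >= a))   # ~0.20
print("Markov bound        :", markov_bound(50.0, a))   # 0.625 -- valid, but loose
```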


💡 A restaurant, on average, expects to serve 50 customers per hour.
What is the probability that the restaurant will serve more than 200 customers in the next hour?

\[ P(X \ge 200) \le \frac{E[X]}{200} = \frac{50}{200} = 0.25 \]

Hence, there is at most a 25% chance of serving 200 or more customers.
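
If we additionally assume a Poisson(50) model for the hourly customer count (the problem only fixes the mean, so this model is purely an illustrative assumption), a quick simulation shows just how loose the 25% figure is:

```python
import numpy as np

# Assumed model: customers per hour ~ Poisson(50). Markov's bound uses only E[X] = 50.
rng = np.random.default_rng(1)
customers = rng.poisson(lam=50, size=1_000_000)

empirical = np.mean(customers >= 200)   # essentially 0 under this model
markov = 50 / 200                       # Markov's bound from the expected value alone

print(f"Markov bound: {markov:.2f}, empirical (Poisson model): {empirical:.2e}")
# Markov's Inequality only knows the mean, so it cannot rule out heavy-tailed
# distributions where serving 200+ customers really is likely; hence the 25% bound.
```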


💡 Consider a test where the average score is 70/100 marks.
What is the probability that a randomly selected student gets a score of 90 marks or more?

\[ P(X \ge a) \le \frac{E[X]}{a} \\[10pt] P(X \ge 90) \le \frac{70}{90} \approx 0.78 \]

Hence, there is at most an approximately 78% chance that a randomly selected student gets a score of 90 marks or more.



📘

Chebyshev’s Inequality:
It bounds the probability that a random variable deviates from its mean by at least a given amount, using the variance: if the variance is small, large deviations from the mean are unlikely.

  • It is a more powerful version of Markov’s Inequality.
  • It uses the variance of the distribution in addition to the expected value (mean).
  • Also, it does NOT require the random variable to be non-negative.
  • It therefore uses more information about the data, i.e., both the mean and the variance.
\[ P(\big|X - E[X]\big| \ge k) \le \frac{E[(X - E[X])^2]}{k^2} , \quad k > 0 \\[10pt] \text{We know that } Var[X] = E[(X - E[X])^2] \\[10pt] \implies P(\big|X - E[X]\big| \ge k) \le \frac{Var[X]}{k^2} \]

Note: It typically gives a tighter upper bound than Markov’s Inequality.
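
Below is a minimal Python sketch of the bound. The `chebyshev_bound` helper and the Uniform(0, 1) test distribution are illustrative assumptions:

```python
import numpy as np

def chebyshev_bound(variance, k):
    """Chebyshev's Inequality: P(|X - E[X]| >= k) <= Var[X] / k^2, for k > 0."""
    return variance / k**2

# Illustrative check on an assumed Uniform(0, 1) distribution: E[X] = 1/2, Var[X] = 1/12.
rng = np.random.default_rng(2)
samples = rng.uniform(0.0, 1.0, size=1_000_000)

k = 0.4
empirical = np.mean(np.abs(samples - 0.5) >= k)   # true two-sided tail: 0.2
print("empirical P(|X - 0.5| >= 0.4):", round(empirical, 4))
print("Chebyshev bound              :", round(chebyshev_bound(1/12, k), 4))  # ~0.5208
```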


💡 Consider a test where the average score is 70/100 marks.
What is the probability that a randomly selected student gets a score of 90 marks or more?
Given the standard deviation of the test score is 10 marks.

The standard deviation of the test score is \(\sigma = 10\) marks,
=> Variance \(= \sigma^2 = 100\)

\[ P(\big|X - E[X]\big| \ge k) \le \frac{Var[X]}{k^2} \\[10pt] E[X] = 70, \quad Var[X] = 100, \quad k = 20 \\[10pt] \text{Since the event } \{X \ge 90\} \text{ is contained in the event } \big\{\big|X - 70\big| \ge 20\big\}: \\[10pt] P(X \ge 90) \le P(\big|X - 70\big| \ge 20) \le \frac{100}{20^2} = \frac{1}{4} = 0.25 \]

Hence, Chebyshev’s Inequality gives a far tighter upper bound of 25%, compared to the approximately 78% bound from Markov’s Inequality.
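
The comparison can be reproduced numerically. The Normal(70, 10²) score model used for the "true" tail probability below is an extra assumption, included only to show how conservative both bounds still are:

```python
from scipy.stats import norm

mean, sd, score = 70.0, 10.0, 90.0

markov    = mean / score                        # ~0.78
chebyshev = sd**2 / (score - mean)**2           # 100 / 400 = 0.25
true_tail = norm.sf(score, loc=mean, scale=sd)  # ~0.0228 under an assumed normal model

print(f"Markov bound   : {markov:.3f}")
print(f"Chebyshev bound: {chebyshev:.3f}")
print(f"P(X >= 90) if scores were Normal(70, 10^2): {true_tail:.4f}")
```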



📘

Chernoff Bound:
It is an upper bound on the probability that a random variable deviates from its expected value.
It’s an exponentially decreasing bound on the “tail” of a random variable’s distribution, which can be calculated using its moment generating function.

  • It applies to sums or averages of independent random variables (not necessarily identically distributed).
  • It provides exponentially decaying bounds, much tighter than the quadratic \(1/k^2\) decay of Chebyshev’s Inequality.
  • It uses all moments of the distribution, through the moment generating function (MGF), to capture its full shape.
\[ P(X \ge c) \le e^{-tc}E[e^{tX}] , \forall t>0\\[10pt] \text{ where } E[e^{tX}] \text{ is the Moment Generating Function of } X \]


Proof:

\[ P(X \ge c) = P(e^{tX} \ge e^{tc}), \text{ since } x \mapsto e^{tx} \text{ is increasing for } t>0 \\[10pt] \text{Applying Markov's Inequality to the non-negative random variable } e^{tX}: \\[10pt] P(e^{tX} \ge e^{tc}) \le \frac{E[e^{tX}]}{e^{tc}} = e^{-tc}E[e^{tX}] \]
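
Since the bound holds for every \(t > 0\), the tightest version is obtained by minimizing the right-hand side over \(t\). As a worked example (the standard normal distribution here is an illustrative assumption), for \(X \sim N(0, 1)\) the MGF is \(E[e^{tX}] = e^{t^2/2}\), so

\[ P(X \ge c) \le \min_{t > 0} e^{-tc} \, e^{t^2/2} = e^{-c^2/2}, \quad \text{attained at } t = c \]

which decays like \(e^{-c^2/2}\), whereas Chebyshev’s Inequality only gives \(P(X \ge c) \le 1/c^2\).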


For the sum \(S_n = X_1 + X_2 + \dots + X_n\) of n independent and identically distributed random variables,

\[ P(S_n \ge c) \le e^{-tc}\big(M_X(t)\big)^n \\[10pt] \text{where } M_X(t) \text{ is the Moment Generating Function of each } X_i \]

(For independent but not identically distributed variables, \((M_X(t))^n\) is replaced by the product \(\prod_{i=1}^{n} M_{X_i}(t)\).)

Note: This is used to bound the probability that the sum of independent random variables deviates far from its expected value.
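
As a concrete sketch (the fair-coin setup, the grid minimization over t, and the helper names are illustrative assumptions), the bound can be evaluated numerically for the number of heads in n fair coin flips and compared with Chebyshev’s Inequality:

```python
import numpy as np

def mgf_bernoulli(t):
    """MGF of a Bernoulli(1/2) variable: E[e^{tX}] = 1/2 + (1/2) e^t."""
    return 0.5 + 0.5 * np.exp(t)

def chernoff_sum_bound(mgf, n, c, t_grid):
    """P(S_n >= c) <= min over t > 0 of exp(-t*c) * mgf(t)**n, for i.i.d. summands."""
    return min(np.exp(-t * c) * mgf(t) ** n for t in t_grid)

# Assumed example: S_n = number of heads in n = 100 fair coin flips, tail at c = 75.
n, c = 100, 75
chernoff = chernoff_sum_bound(mgf_bernoulli, n, c, t_grid=np.linspace(0.01, 5, 500))
chebyshev = (n * 0.25) / (c - n * 0.5) ** 2   # Var[S_n] = 25, deviation k = 25

print(f"Chernoff bound : {chernoff:.2e}")   # roughly 2e-6
print(f"Chebyshev bound: {chebyshev:.2e}")  # 4.00e-02
```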



End of Section