Introduction to Probability
This section introduces the basic terminology and definitions of probability used in AI & ML.
Probability is the fundamental language for understanding, expressing, and dealing with uncertainty.
For example:
- Toss a fair coin, \(P(H) = P(T) = 1/2\)
- Roll a die, \(P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6\)
- Email classifier, \(P(\text{spam}) = 0.95,~ P(\text{not spam}) = 0.05\)
Probability: A numerical measure, between 0 and 1, of the chance or likelihood that an event will occur.
\(P=0\): Highly unlikely
\(P=1\): Almost certain
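One informal way to get a feel for these numbers is to repeat an experiment many times and look at how often an event occurs. The sketch below assumes Python's standard `random` module as a stand-in for a fair coin and a fair die; the helper name `relative_frequency`, the seed, and the trial count are illustrative choices, not something the definitions above prescribe.

```python
import random

def relative_frequency(trials, event, experiment, seed=0):
    """Estimate P(event) as the fraction of trials in which the event occurs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(trials) if event(experiment(rng)))
    return hits / trials

# Hypothetical experiments: a fair coin and a fair six-sided die.
def toss_coin(rng):
    return rng.choice("HT")

def roll_die(rng):
    return rng.randint(1, 6)

p_heads = relative_frequency(100_000, lambda o: o == "H", toss_coin)
p_six = relative_frequency(100_000, lambda o: o == 6, roll_die)

print(f"estimated P(H) = {p_heads:.3f}  (exact value 1/2)")
print(f"estimated P(6) = {p_six:.3f}  (exact value 1/6)")
```

The estimates should land close to 1/2 and 1/6, and they tend to get closer as the number of trials grows.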
Sample space: The set of all possible outcomes of an experiment.
Symbol: \(\Omega\)
- Toss a fair coin, sample space: \(\Omega = \{H,T\}\)
- Roll a die, sample space: \(\Omega = \{1,2,3,4,5,6\}\)
- Choose a real number \(x\) from the interval \([2,3]\), sample space: \(\Omega = [2,3]\); number of outcomes = \(\infty\)
Note: There are infinitely many points between 2 and 3, e.g. 2.21, 2.211, 2.2111, 2.21111, …
- Randomly put a point in a rectangular region; number of outcomes = \(\infty\)
Note: There are infinitely many points in any rectangular region.
\(P(\Omega) = 1\)
Event: An outcome, or a set of outcomes, of an experiment, i.e. a subset of the sample space.
\(A, B, \ldots \subseteq \Omega\)
- Toss a fair coin, set of possible outcomes: \(\{H,T\}\)
- Roll a die, set of possible outcomes: \(\{1,2,3,4,5,6\}\)
- Roll a die, event \(A = \{1,2\} \Rightarrow P(A) = 2/6 = 1/3\)
- Email classifier, set of possible outcomes: \(\{\text{spam}, \text{not spam}\}\).
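The die example can be checked by treating the sample space and the event as Python sets. This is a minimal sketch that assumes every outcome is equally likely, which holds for a fair die but not in general; the function name `probability` is a hypothetical helper.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(event) = |event| / |sample_space|, assuming equally likely outcomes."""
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

omega = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a die
A = {1, 2}                   # the event from the example above

p_A = probability(A, omega)
p_omega = probability(omega, omega)

print(p_A)      # 1/3
print(p_omega)  # 1
```

Using `Fraction` keeps the result exact, so \(2/6\) is reduced to \(1/3\) rather than printed as a rounded decimal.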
Discrete sample space: The potential outcomes of an experiment are distinct and countable, i.e. they can be listed in a sequence, which may be finite or countably infinite.
- Toss a fair coin, possible outcomes: \(\Omega = \{H,T\}\)
- Roll a die, possible outcomes: \(\Omega = \{1,2,3,4,5,6\}\)
- Choose a real number \(x\) from the interval \([2,3]\) with 2-decimal precision, sample space: \(\Omega = \{2.00, 2.01, \ldots, 3.00\}\).
Note: There are 99 such numbers strictly between 2 and 3, i.e. from 2.01 to 2.99.
- Number of cars passing a specific traffic signal in 1 hour.
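The count in the two-decimal example can be verified by listing the values explicitly. The sketch below is illustrative only; the helper name `grid` is an assumption, and `Fraction` is used to avoid floating-point rounding when comparing against 2 and 3.

```python
from fractions import Fraction

def grid(decimals):
    """List every value in [2, 3] that can be written with `decimals` decimal places."""
    steps = 10 ** decimals
    return [Fraction(2 * steps + k, steps) for k in range(steps + 1)]

two_dp = grid(2)
inner = [x for x in two_dp if 2 < x < 3]   # strictly between 2 and 3

print(len(two_dp))   # 101 values, including the endpoints 2.00 and 3.00
print(len(inner))    # 99 values, from 2.01 to 2.99

# Finer precision gives more outcomes, but the count stays finite.
counts = {d: len(grid(d)) for d in range(1, 5)}
print(counts)        # {1: 11, 2: 101, 3: 1001, 4: 10001}
```

However fine the precision, the outcomes can still be listed one by one, which is what makes this sample space discrete.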
Continuous sample space: The potential outcomes of an experiment can take any value within a given range or interval, forming an uncountably infinite set of possibilities.
- A line segment between 2 and 3 - forms a continuum.
- Randomly put a point in a rectangular region.
graph TD
A[Sample Space] --> |Discrete| B(Finite)
A --> C(Infinite)
C --> |Discrete| D(Countable)
C --> |Continuous| E(Uncountable)
Mutually exclusive events: Two or more events that cannot happen at the same time.
No overlapping or common outcomes.
If one event occurs, then the other event does NOT occur.
For example:
- Roll a die, sample space: \(\Omega = \{1,2,3,4,5,6\}\)
Odd outcome: \(A = \{1,3,5\}\)
Even outcome: \(B = \{2,4,6\}\)
\(A\) and \(B\) are mutually exclusive, so \(P(A \cap B) = 0\).
In general, \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\).
Therefore, for mutually exclusive events, \(P(A \cup B) = P(A) + P(B)\).
Note: If we know that event \(A\) has occurred, then we can say for sure that the event \(B\) did NOT occur.
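The addition rule for the odd/even example can be checked with set operations. This is a small sketch assuming a fair die with equally likely outcomes; the helper `P` and the extra event `C` are hypothetical names added for contrast.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event on a fair die (equally likely outcomes assumed)."""
    return Fraction(len(event & omega), len(omega))

A = {1, 3, 5}   # odd outcome
B = {2, 4, 6}   # even outcome

# Mutually exclusive: no common outcomes, so the overlap term vanishes.
print(A & B == set())       # True
print(P(A | B))             # 1
print(P(A) + P(B))          # 1

# Contrast: C overlaps with A, so the overlap must be subtracted.
C = {1, 2}
print(P(A | C))                     # 2/3
print(P(A) + P(C))                  # 5/6, overcounts the shared outcome 1
print(P(A) + P(C) - P(A & C))       # 2/3
```

The second half shows why the \(P(A \cap B)\) term cannot be dropped when events share outcomes.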
Independent events: Two events are independent if the occurrence of one event does NOT affect the probability of the other event.
For example:
- Roll a die twice, sample space for each roll: \(\Omega = \{1,2,3,4,5,6\}\)
Odd number in 1st throw = \(A = \{1,3,5\}\)
Odd number in 2nd throw = \(B = \{1,3,5\}\)
Note: \(A\) and \(B\) are independent because whether we get an odd number in the 1st roll has NO impact on getting an odd number in the 2nd roll.
\(P(A \cap B) = P(A) \cdot P(B)\)
Note: If we know that event \(A\) has occurred, then that gives us NO new information about the event \(B\).
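The product rule can be verified by listing all 36 ordered pairs that two rolls can produce. The sketch below assumes a fair die and represents each outcome as a pair `(first, second)`; that pair representation and the names `P` and `E` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

die = range(1, 7)
omega = set(product(die, repeat=2))   # 36 ordered pairs (first roll, second roll)

def P(event):
    """Probability of an event over two fair rolls (equally likely pairs assumed)."""
    return Fraction(len(event), len(omega))

A = {(i, j) for (i, j) in omega if i % 2 == 1}   # odd number on the 1st roll
B = {(i, j) for (i, j) in omega if j % 2 == 1}   # odd number on the 2nd roll

print(P(A), P(B), P(A & B))      # 1/2 1/2 1/4
print(P(A & B) == P(A) * P(B))   # True, so A and B are independent

# Contrast: "even on the 1st roll" is mutually exclusive with A, not independent of it.
E = {(i, j) for (i, j) in omega if i % 2 == 0}
print(P(A & E), P(A) * P(E))     # 0 1/4
```

The last line illustrates that mutually exclusive events with non-zero probability are not independent: knowing that one occurred tells us the other did not.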
Does \(P=0\) mean “Impossible”? Let’s understand the answer with an example.
The probability of choosing exactly one point on the number line, i.e. one real number, say 2.5,
from the interval \([2,3]\) is 0, because there are infinitely many points between 2 and 3.
Yet we can NOT say that choosing exactly 2.5 is impossible, because it exists on the number line.
So \(P(2.5) = 0\), even though the outcome is possible.
Therefore, we say that \(P=0\) means “Highly Unlikely” and NOT “Impossible”.
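One way to make this precise, as a sketch that assumes the number \(X\) is drawn uniformly from \([2,3]\) (the uniform model and the symbols \(X\) and \(\varepsilon\) are assumptions introduced here for illustration): the probability of landing within \(\varepsilon\) of 2.5 is the length of that small interval divided by the length of \([2,3]\), and the single point 2.5 lies inside every such interval.

\[
0 \le P(X = 2.5) \le P(2.5 - \varepsilon \le X \le 2.5 + \varepsilon) = \frac{2\varepsilon}{3 - 2} = 2\varepsilon \quad \text{for every } 0 < \varepsilon \le 0.5
\]

Since \(\varepsilon\) can be made as small as we like, the only value that satisfies every such bound is \(P(X = 2.5) = 0\), even though 2.5 is a valid outcome.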
Extending this line of reasoning, the probability of NOT choosing 2.5 is \(P(\text{not } 2.5) = 1\).
Does \(P=1\) mean “Certain”? Theoretically the probability is 1, because there are infinitely many points between 2 and 3.
But we cannot say for sure that we will not choose exactly 2.5.
Choosing 2.5 remains possible, even though its probability is 0.
Therefore, we say that \(P=1\) means “Almost Sure” and NOT “Certain”.
In contrast, when rolling a die, the outcome 7 is not in the sample space, so here \(P(7)=0\) does mean “Impossible”.
Similarly, \(P(\text{get any number between 1 and 6})=1\), and here \(P=1 \Rightarrow\) Certain.