T-Test
In this section, we look at Student’s T-Test.
T-Test:
It is a statistical test used to determine whether a sample mean differs from a hypothesized value, or
whether there is a significant difference between the sample means of 2 groups.
- It is a parametric test, since it assumes the data are approximately normally distributed.
- Appropriate when:
  - the sample size is small (n < 30).
  - the population standard deviation \(\sigma\) is unknown.
- It is based on Student’s t-distribution.
Student’s t-distribution:
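It is a continuous probability distribution with a symmetrical, bell-shaped curve, similar
to the normal distribution but with heavier tails.
- The shape of the curve, in particular the mass in the tails, is controlled by the degrees of freedom: the fewer the degrees of freedom, the heavier the tails, and as they increase the curve approaches the normal distribution.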
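To make the effect of the degrees of freedom concrete, here is a minimal sketch that evaluates the t-distribution density in the tail (at x = 3) for a few degrees of freedom and compares it with the standard normal density. The helper names `t_pdf` and `normal_pdf` are illustrative, not from any library; the density formula is the standard one for Student’s t-distribution.

```python
import math

def t_pdf(x: float, dof: float) -> float:
    """Density of Student's t-distribution with `dof` degrees of freedom."""
    # Work in log space so large degrees of freedom do not overflow gamma().
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(x * x / dof))

def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Fewer degrees of freedom -> more density far from the centre (heavier tails).
for dof in (2, 5, 30):
    print(f"dof={dof:>2}: density at x=3 is {t_pdf(3.0, dof):.4f}")
print(f"normal: density at x=3 is {normal_pdf(3.0):.4f}")
```

The tail density shrinks toward the normal value as the degrees of freedom grow.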

There are 3 types of T-Test:
- 1-Sample T-Test: Test if sample mean differs from hypothesized value.
- 2-Sample T-Test: Test whether there is a significant difference between the means of two independent groups.
- Paired T-Test: Test whether 2 related samples differ, e.g., before and after.
Degrees of Freedom (\(\nu\))
It represents the number of independent pieces of information available in the sample to estimate the variability in the data. Generally speaking, it is the number of values in a dataset that are free to vary when estimating a parameter.
For example, suppose we have k observations whose sum is 50.
The first (k-1) terms can take any values, but the k-th term is then fixed at 50 - (sum of the other (k-1) terms).
So only (k-1) terms can change independently, and therefore the DOF (\(\nu\)) = k-1.
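The same constraint can be sketched in a few lines of Python. The four free values below are hypothetical; only the fixed sum of 50 comes from the example above.

```python
total = 50                      # the fixed sum from the example above
free_values = [12, 7, 20, 4]    # k - 1 = 4 values chosen freely (hypothetical)

# The k-th value is forced by the constraint on the sum.
last_value = total - sum(free_values)
observations = free_values + [last_value]

# Only k - 1 of the k observations could vary independently.
dof = len(observations) - 1
print(observations, "sum =", sum(observations), "dof =", dof)
```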
1-Sample T-Test:
It is used to test whether the sample mean is equal to a known/hypothesized value.
Test statistic (t): \( t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \)
where,
\(\bar{x}\): sample mean
\(\mu\): hypothesized value
\(s\): sample standard deviation
\(n\): sample size
\(\nu = n-1 \): degrees of freedom
Example: A developer claims that the average API response time is 100 ms. A tester ran the test 20 times and found the average API response time to be 115 ms, with a standard deviation of 25 ms.
Is the developer’s claim valid?
Let’s check the developer’s claim against the tester’s results using a 1-sample t-test.
Null hypothesis \(H_0\): The average API response time is 100 ms, i.e., the true mean equals \(\mu\) = 100 ms.
Alternative hypothesis \(H_a\): The average API response time is greater than 100 ms, i.e., the true mean exceeds \(\mu\) => right-tailed test.
Hypothesized mean \(\mu\) = 100 ms
Sample mean \(\bar{x}\) = 115 ms
Sample standard deviation \(s\) = 25 ms
Sample size \(n\) = 20
Degrees of freedom \(\nu\) = 19
\( t_{obs} = \frac{\bar{x} - \mu}{s/\sqrt{n}}\) = \(\frac{115 - 100}{25/\sqrt{20}}\)
= \(\frac{15\sqrt{20}}{25} = \frac{3\sqrt{20}}{5} \approx 2.68\)
Let the significance level \(\alpha\) = 5% = 0.05.
Critical value \(t_{0.05}\) = 1.729 (one-tailed, \(\nu\) = 19)
Important: Look up the value of \(t_{\alpha}\) for the given degrees of freedom in the T-table

Since \(t_{obs} \approx 2.68\) > \(t_{0.05}\) = 1.729, we reject the null hypothesis
and accept the alternative hypothesis that the average API response time is significantly greater than 100 ms.
Hence, the developer’s claim is NOT valid.
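The arithmetic above can be reproduced with a short sketch. The function name `one_sample_t` is illustrative, and the critical value is simply the t-table value quoted above rather than something the code computes.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_std, n):
    """t statistic for a 1-sample t-test computed from summary statistics."""
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Numbers from the API response time example.
t_obs = one_sample_t(sample_mean=115, hypothesized_mean=100, sample_std=25, n=20)

# One-tailed critical value for alpha = 0.05 and 19 degrees of freedom,
# taken from the t-table (an input here, not computed).
t_crit = 1.729

reject_h0 = t_obs > t_crit  # right-tailed test
print(f"t_obs = {t_obs:.2f}, reject H0: {reject_h0}")
```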
2-Sample T-Test:
It is used to determine whether there is a significant difference between the means of two independent groups.
There are 2 types of 2-sample t-test:
- Unequal Variance
- Equal Variance
Unequal Variance:
In this case, the variances of the 2 independent groups are not equal.
It is also called Welch’s t-test.
Test statistic (t): \( t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \)
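As a rough sketch, Welch’s statistic can be computed from summary statistics as below. The function name `welch_t` is illustrative. The degrees of freedom use the Welch-Satterthwaite approximation, which is the usual companion to this test but is an addition here, not something stated in this section. The inputs are the model accuracy figures from the example later in this section, reused only for illustration.

```python
import math

def welch_t(mean1, std1, n1, mean2, std2, n2):
    """Welch's t statistic and approximate degrees of freedom."""
    v1 = std1 ** 2 / n1  # squared standard error contributed by group 1
    v2 = std2 ** 2 / n2  # squared standard error contributed by group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumption: not given in the text).
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, dof

t_obs, dof = welch_t(mean1=91, std1=4, n1=24, mean2=88, std2=3, n2=18)
print(f"t_obs = {t_obs:.2f}, dof = {dof:.1f}")
```

The resulting degrees of freedom are generally not an integer; a common convention is to round down before looking up the critical value.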
Equal Variance:
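In this case, both samples come from populations with equal or approximately equal variance, so the two sample variances are pooled.
Test statistic (t): \( t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \), with pooled standard deviation \( s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} \)
Here, degrees of freedom (for equal variance) \(\nu\) = \(n_1 + n_2 - 2\).
where, for group \(i\) = 1, 2,
\(\bar{x}_i\): sample mean
\(s_i\): sample standard deviation
\(n_i\): sample size
\(\nu\): degrees of freedom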
The AI team wants to validate whether the new ML model accuracy is better than the existing model’s accuracy.
Below is the data for the existing model and the new model.
| | New Model (A) | Existing Model (B) |
|---|---|---|
| Sample size (n) | 24 | 18 |
| Sample mean (\(\bar{x}\)) | 91% | 88% |
| Sample std. dev. (s) | 4% | 3% |
We are given that the variances of the accuracy scores of the new and existing models are almost the same.
Now, let’s follow our hypothesis testing framework.
Null hypothesis \(H_0\): The accuracy of the new model is the same as the accuracy of the existing model.
Alternative hypothesis \(H_a\): The new model’s accuracy is greater than the existing model’s accuracy => right-tailed test
We use a 2-sample T-Test, since both sample sizes are below 30.
Since the variances of the 2 samples are almost equal, we can use the pooled variance method.
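Next, let’s compute the test statistic under the null hypothesis (group 1 = new model A, group 2 = existing model B).
Pooled standard deviation \( s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} \) = \(\sqrt{\frac{23 \times 4^2 + 17 \times 3^2}{40}} = \sqrt{13.025} \approx 3.61\)
\( t_{obs} = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \) = \(\frac{91 - 88}{3.61\sqrt{\frac{1}{24} + \frac{1}{18}}} \approx 2.67\)
DOF \(\nu\) = \(24+18-2\) = 42 - 2 = 40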
Let the significance level \(\alpha\) = 5% = 0.05.
Critical value \(t_{0.05}\) = 1.684 (one-tailed, \(\nu\) = 40)
Important: Look up the value of \(t_{\alpha}\) for the given degrees of freedom in the T-table

Since \(t_{obs} \approx 2.67\) > \(t_{0.05}\) = 1.684, we reject the null hypothesis
and accept the alternative hypothesis that the new model has better accuracy than the existing model.
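For readers who want to check the numbers, here is a minimal sketch of the pooled (equal variance) 2-sample t-test from summary statistics. The function name `pooled_t` is illustrative, and the critical value is the t-table value quoted above, not something the code derives.

```python
import math

def pooled_t(mean1, std1, n1, mean2, std2, n2):
    """Equal-variance 2-sample t statistic and its degrees of freedom."""
    dof = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its own degrees of freedom.
    pooled_var = ((n1 - 1) * std1 ** 2 + (n2 - 1) * std2 ** 2) / dof
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, dof

# New model (A) vs existing model (B), accuracies in percent.
t_obs, dof = pooled_t(mean1=91, std1=4, n1=24, mean2=88, std2=3, n2=18)

# One-tailed critical value for alpha = 0.05 and 40 degrees of freedom (t-table).
t_crit = 1.684

reject_h0 = t_obs > t_crit  # right-tailed test
print(f"t_obs = {t_obs:.2f}, dof = {dof}, reject H0: {reject_h0}")
```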