AdaBoost
2 minute read
💡Works by increasing 📈 the weight 🏋️♀️ of misclassified data points after each iteration, forcing the next weak learner to ‘pay more attention’🚨 to the difficult cases.
⭐️ Commonly used for classification.
👉 Weak learners are typically ‘Decision Stumps’, i.e., decision trees 🌲 with a depth of only one (1 split, 2 leaves 🍃).
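As a quick illustration (a minimal sketch, assuming scikit-learn and a synthetic dataset; not part of the original post), scikit-learn’s `AdaBoostClassifier` uses exactly such a depth-1 tree as its default weak learner:

```python
# Minimal scikit-learn sketch; dataset and parameter values are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

# The default weak learner is already a depth-1 tree (a decision stump);
# it is passed explicitly here for clarity. Note: in scikit-learn < 1.2
# the keyword is `base_estimator` instead of `estimator`.
clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
)
clf.fit(X, y)
print(clf.score(X, y))
```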

- Assign an equal weight 🏋️♀️ to every data point: \(w_i = 1/n\), where \(n\) is the number of samples.
- Build a decision stump that minimizes the weighted classification error.
- Calculate the total error, i.e., the sum of the weights 🏋️♀️ of the misclassified samples: \(E_m = \sum_{i:\,h_m(x_i) \neq y_i} w_i\).
- Determine the ‘amount of say’, i.e., the weight 🏋️♀️ of each stump in the final decision.
\[\alpha_m = \frac{1}{2}\ln\left( \frac{1-E_m}{E_m} \right)\]
- Low error results in a high positive \(\alpha\) (high influence).
- 50% error (random guessing) results in an \(\alpha = 0\) (no influence).
- Update sample weights 🏋️♀️.
- Misclassified samples: Weight 🏋️♀️ is multiplied by \(e^{\alpha_m}\) (increased).
- Correctly classified samples: Weight 🏋️♀️ is multiplied by \(e^{-\alpha_m}\) (decreased).
- Normalization: All new weights 🏋️♀️ are divided by their total sum so they again add up to 1.
- Iterate for a specified number of estimators (n_estimators); a minimal from-scratch sketch of this loop follows below.
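Putting the steps above together, here is a minimal from-scratch sketch of the training loop (assumes NumPy, a scikit-learn stump as the weak learner, and labels in {-1, +1}; the function name `fit_adaboost` is purely illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_adaboost(X, y, n_estimators=50):
    """Train AdaBoost with decision stumps; assumes labels y are in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                 # equal initial weights, w_i = 1/n
    stumps, alphas = [], []

    for m in range(n_estimators):
        # Fit a stump that minimizes the weighted classification error.
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)

        # Total error = sum of weights of the misclassified samples.
        miss = pred != y
        E_m = np.clip(np.sum(w[miss]), 1e-10, 1 - 1e-10)  # guard against log(0)

        # Amount of say for this stump.
        alpha = 0.5 * np.log((1 - E_m) / E_m)

        # Up-weight misses, down-weight hits, then normalize so weights sum to 1.
        w = w * np.exp(np.where(miss, alpha, -alpha))
        w /= w.sum()

        stumps.append(stump)
        alphas.append(alpha)

    return stumps, alphas
```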
👉 To classify a new data point, every stump makes a prediction (+1 or -1).
These are multiplied by their respective ‘amount of say’ \(\alpha_m\) and summed.
\[H(x)=\operatorname{sign}\left(\sum_{m=1}^{M}\alpha_{m}\cdot h_{m}(x)\right)\]
👉 If the total weighted 🏋️♀️ sum is positive, the final class is +1; otherwise -1.
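A matching prediction sketch (uses the `stumps` and `alphas` returned by the hypothetical `fit_adaboost` above; again just an illustration):

```python
import numpy as np

def predict_adaboost(stumps, alphas, X):
    """Weighted majority vote of the trained stumps (labels in {-1, +1})."""
    # Sum alpha_m * h_m(x) over all stumps, then take the sign.
    votes = sum(alpha * stump.predict(X) for stump, alpha in zip(stumps, alphas))
    return np.sign(votes)

# Tiny numeric example of the vote itself: three stumps predicting +1, -1, +1
# with alphas 0.9, 0.4, 0.2 give 0.9 - 0.4 + 0.2 = +0.7, so the class is +1.
```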
Note: Sensitive to outliers. Because AdaBoost aggressively increases the weights 🏋️♀️ of misclassified points, it may ‘over-focus’ on noisy outliers, hurting performance.