Boosting

less than a minute

In Bagging we trained multiple strong(over-fit, high variance) models (in parallel) and then averaged them out to reduce variance.
Similarly, we can train many weak(under-fit, high bias) models sequentially, such that, each new model corrects the errors of the previous ones to reduce bias.

An ensemble learning approach where multiple ‘weak learners’ (typically simple models like shallow decision trees or ‘stumps’) are sequentially combined to create a single strong predictive model.

The core principle is that each subsequent model focuses on correcting the errors made by its predecessors.

Boosting generally achieves better predictive performance because it actively reduces bias by learning from ‘past mistakes’, making it ideal for achieving state-of-the-art ️ results.

AdaBoost(Adaptive Boosting)
Gradient Boosting Machine (GBM)
- XGBoost
- LightGBM (Microsoft)
- CatBoost (Yandex)

Video Boosting Ensemble Technique | Why is Boosting Better? | Explained with Examples

Previous: Extra Trees Next: AdaBoost