Ensembling is a method of combining more than one model to generate a final output (Reference [1]).
Two common ways of doing that are:
- Bagging
- Boosting
Bagging:
- We take subsets of the data and train a different model on each subset.
- Example: random forest
  - Each tree is trained on a subset of the data as well as a subset of the features.
  - Pros of random forest
    - Handles high-dimensional data
    - Handles missing values
  - Cons of random forest
    - It does not give precise values for regression, because the final value is the mean of the predictions of the individual trees.
    - Nonetheless, people use it for regression, depending on the domain.

Boosting:
- We train different models one after another on the same data. Each sample is assigned a different weight in each iteration, so that (as in AdaBoost) later models focus on the samples earlier models got wrong.
- Examples: AdaBoost, XGBoost
  - Pros of XGBoost
    - Supports different loss functions
    - Works well with feature interactions
  - Cons of XGBoost
    - Prone to overfitting
    - Hyperparameter tuning is critical
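To make the contrast between bagging and boosting concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not how random forest or XGBoost are implemented: the weak learner is a hypothetical one-dimensional decision stump, the toy data and all function names (`fit_stump`, `bagging`, `adaboost`) are made up for this example, and feature subsampling is left out because the toy data has a single feature. Bagging trains each stump on its own bootstrap sample and takes a majority vote (for regression the vote would be a mean); the boosting part follows the usual AdaBoost reweighting scheme.

```python
import math
import random

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for p, y, w in zip(preds, ys, weights) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def stump_predict(stump, x):
    _, threshold, polarity = stump
    return polarity if x >= threshold else -polarity

def bagging(xs, ys, n_models, rng):
    """Bagging: every model sees its own bootstrap sample, with equal weights."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        models.append(fit_stump(bx, by, [1.0 / n] * n))
    return models

def bagging_predict(models, x):
    votes = sum(stump_predict(m, x) for m in models)  # unweighted majority vote
    return 1 if votes >= 0 else -1

def adaboost(xs, ys, n_rounds):
    """Boosting: every model sees the same data, but with updated sample weights."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(xs, ys, weights)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # vote strength of this stump
        ensemble.append((alpha, stump))
        # Misclassified samples get heavier, correctly classified ones lighter.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble, weights

def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data: labels are +1, then -1, then +1 again, so one stump cannot fit them.
xs = list(range(1, 11))
ys = [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]

bagged = bagging(xs, ys, n_models=5, rng=random.Random(0))
boosted, final_weights = adaboost(xs, ys, n_rounds=3)
```

On this toy data a single stump misclassifies three of the ten points, while three boosting rounds are enough to classify all ten training points correctly. The structural difference is in the loops: the bagging loop never looks at earlier models, while each boosting round depends on the weights left by the previous one.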
Pros and cons of bagging vs boosting:
- Bagging is easy to parallelize because its models are trained independently of each other, so training is faster.
- Boosting is more efficient for a fixed number of iterations (classifiers).
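The parallelization point can be sketched as follows. This is only an illustration under assumptions: `train_on_bootstrap` is a hypothetical stand-in for a real base learner (its "model" is just the mean of a bootstrap sample), and the names and toy data are made up. Because no bagged model depends on another, all training jobs can be handed to a pool at once; a boosting round, by contrast, needs the sample weights produced by the previous round.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def train_on_bootstrap(job):
    """Hypothetical base learner: the 'model' is the mean of one bootstrap sample."""
    data, seed = job
    rng = random.Random(seed)                  # one independent seed per model
    sample = [rng.choice(data) for _ in data]  # sample with replacement
    return sum(sample) / len(sample)

def bagged_models_parallel(data, n_models, max_workers=4):
    # Each job only needs the data and its own seed, never another model.
    jobs = [(data, seed) for seed in range(n_models)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_on_bootstrap, jobs))  # results keep job order

def bagged_models_sequential(data, n_models):
    return [train_on_bootstrap((data, seed)) for seed in range(n_models)]

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
models = bagged_models_parallel(data, n_models=8)
prediction = sum(models) / len(models)  # bagged regression output: mean of models
```

In CPython, threads mainly illustrate the independence of the jobs; for CPU-bound training an actual speedup would typically require processes or a library that releases the interpreter lock.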
AdaBoost vs XGBoost:
Reference: [2]

Quote from Tianqi Chen, one of the developers of XGBoost:
Adaboost and gradboosting [XGBoost] are two different ways to derive boosters. Both are generic. I like gradboosting better because it works for generic loss functions, while adaboost is derived mainly for classification with exponential loss.
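
The distinction in the quote can be written out in standard textbook notation. The symbols are not from the original notes and are assumptions of this sketch: $F_m$ is the ensemble after $m$ rounds, $h_m$ the weak learner added in round $m$, $\varepsilon_m$ its weighted error, and $\nu$ a learning rate.

```latex
% AdaBoost: stagewise minimisation of the exponential loss, labels y_i \in \{-1, +1\}
L_{\exp}\bigl(y, F(x)\bigr) = \exp\bigl(-y\,F(x)\bigr), \qquad
F_m(x) = F_{m-1}(x) + \alpha_m h_m(x), \qquad
\alpha_m = \tfrac{1}{2}\ln\frac{1-\varepsilon_m}{\varepsilon_m}

% Gradient boosting: any differentiable loss L; each new learner is fit to the negative gradient
r_{im} = -\left[\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right]_{F = F_{m-1}}, \qquad
F_m(x) = F_{m-1}(x) + \nu\, h_m(x)
```

For example, with squared loss $L = \tfrac{1}{2}(y - F)^2$ the negative gradient is simply the residual $y_i - F_{m-1}(x_i)$, which is why gradient boosting extends naturally to regression.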
References:
[1] https://towardsdatascience.com/decision-tree-ensembles-bagging-and-boosting-266a8ba60fd9
[2] https://www.packtpub.com/mapt/book/big_data_and_business_intelligence/9781788295758/4/ch04lvl1sec34/comparison-between-adaboosting-versus-gradient-boosting