Ensemble, Bagging and Boosting

Ensembling is a method of combining more than one model to generate a final output. (Reference [1])

Two common ways of doing that are:

  1. Bagging
  2. Boosting

 

Bagging

  • Different models are trained on different subsets of the data (bootstrap samples)
  • Example: random forest
    • Each tree is trained on a subset of the data as well as a subset of the features
  • Pros of random forest
    • Handles high-dimensional data
    • Handles missing values
  • Cons of random forest
    • Regression outputs are not precise, because the final value is the mean of the predictions of the individual trees
    • Nonetheless, people use it for regression, depending on the domain

Boosting

  • Models are trained one after another on the same data; each sample is assigned a different weight in each iteration
  • Examples: AdaBoost, XGBoost
  • Pros of XGBoost
    • Supports different loss functions
    • Works well with interactions
  • Cons of XGBoost
    • Prone to overfitting
    • Tuning of hyperparameters is critical
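To make the two ideas above concrete, here is a minimal sketch using only the Python standard library. It is not how random forest or XGBoost are implemented. The helper names (bootstrap_sample, fit_nearest, bagged_predict, adaboost_reweight) and the toy nearest-neighbour base learner are assumptions made for illustration. The bagging half draws bootstrap samples and averages the predictions; the boosting half shows one AdaBoost-style round in which a misclassified sample gets a larger weight.

```python
import math
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the 'subset of data' step)."""
    return [rng.choice(rows) for _ in rows]


def fit_nearest(rows):
    """Toy base learner (an assumption): predict the target of the nearest x."""
    def predict(x):
        return min(rows, key=lambda r: abs(r[0] - x))[1]
    return predict


def bagged_predict(models, x):
    """Bagging for regression: average the predictions of all base models."""
    return sum(m(x) for m in models) / len(models)


def adaboost_reweight(weights, y_true, y_pred):
    """One AdaBoost-style round: up-weight the samples the model got wrong."""
    pairs = list(zip(weights, y_true, y_pred))
    err = sum(w for w, t, p in pairs if t != p)   # weighted error rate
    err = min(max(err, 1e-10), 1 - 1e-10)         # avoid log(0) and division by 0
    alpha = 0.5 * math.log((1 - err) / err)       # vote of this weak learner
    raw = [w * math.exp(alpha if t != p else -alpha) for w, t, p in pairs]
    total = sum(raw)
    return [w / total for w in raw], alpha        # weights renormalised to sum to 1


# Bagging: 25 models, each trained on its own bootstrap sample.
rows = [(x, 2.0 * x) for x in range(10)]
rng = random.Random(0)
models = [fit_nearest(bootstrap_sample(rows, rng)) for _ in range(25)]
prediction = bagged_predict(models, 3)

# Boosting: four equally weighted samples, the last one is misclassified.
new_weights, alpha = adaboost_reweight([0.25] * 4, [1, 1, -1, -1], [1, 1, -1, 1])
```

In the boosting example the single misclassified sample ends up with weight 0.5 and the three correct ones share the other half, so the next model is pushed to focus on the mistake. The bagged prediction is an average of the individual predictions, which is why it smooths out individual values, as noted in the cons above.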

 

Pros and cons of bagging vs boosting:

  • Bagging is easy to parallelize, because the models are trained independently of each other, and hence training is faster
  • Boosting is more efficient for a fixed number of iterations (classifiers)
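The parallelization point can be sketched as follows, assuming a toy base model that simply stores the mean target of its bootstrap sample. The names train_one and train_bagged are hypothetical. Each bagged model depends only on its own seed and the data, so the training jobs can be handed to a worker pool in any order. A boosting round needs the sample weights or residuals from the previous round, so it cannot be split up this way.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def train_one(seed, rows):
    """Train one bagged model: bootstrap sample, then a toy mean predictor."""
    rng = random.Random(seed)
    sample = [rng.choice(rows) for _ in rows]
    return sum(y for _, y in sample) / len(sample)


def train_bagged(rows, n_models, max_workers=4):
    """Submit every model to the pool; no job waits for another job's result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda seed: train_one(seed, rows), range(n_models)))


rows = [(x, 3.0 * x + 1.0) for x in range(20)]
parallel_models = train_bagged(rows, n_models=8)
sequential_models = [train_one(seed, rows) for seed in range(8)]
```

Threads are used here only to keep the sketch short. For CPU-bound training in pure Python a process pool would be the more likely choice, but the structure stays the same.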

 

AdaBoost vs XGBoost

Reference: [2]

[Figure: ada_vs_xg, comparison of AdaBoost and XGBoost]

Quote from Tianqi Chen, one of the developers of XGBoost:

Adaboost and gradboosting [XGBoost] are two different ways to derive boosters. Both are generic. I like gradboosting better because it works for generic loss functions, while adaboost is derived mainly for classification with exponential loss.
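The "generic loss functions" point in the quote can be illustrated with a small sketch of gradient boosting in plain Python. This is not XGBoost's implementation. The function names, the one-dimensional regression stump used as the weak learner, and the learning rate are all assumptions made for illustration. The only loss-specific piece is the gradient function passed in; each round fits a stump to the negative gradient of the loss at the current predictions.

```python
def squared_loss_grad(y, f):
    """Gradient of 0.5 * (y - f)**2 with respect to the prediction f."""
    return f - y


def absolute_loss_grad(y, f):
    """Subgradient of |y - f| with respect to the prediction f (its sign)."""
    return (f > y) - (f < y)


def fit_stump(xs, targets):
    """Best single-threshold regression stump (assumes >= 2 distinct xs)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # drop the largest x so both sides are non-empty
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def gradient_boost(xs, ys, loss_grad, n_rounds=50, lr=0.1):
    """Fit each new stump to the negative gradient of the chosen loss."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(n_rounds):
        pseudo_residuals = [-loss_grad(y, f) for y, f in zip(ys, preds)]
        stump = fit_stump(xs, pseudo_residuals)
        stumps.append(stump)
        preds = [f + lr * stump(x) for f, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)


xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 10, 10, 10]
model_sq = gradient_boost(xs, ys, squared_loss_grad, n_rounds=50)
model_abs = gradient_boost(xs, ys, absolute_loss_grad, n_rounds=200)
```

Swapping squared_loss_grad for absolute_loss_grad changes the objective without touching the boosting loop. AdaBoost, by contrast, is tied to the exponential loss mentioned in the quote.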

 

References:

[1] https://towardsdatascience.com/decision-tree-ensembles-bagging-and-boosting-266a8ba60fd9

[2] https://www.packtpub.com/mapt/book/big_data_and_business_intelligence/9781788295758/4/ch04lvl1sec34/comparison-between-adaboosting-versus-gradient-boosting

 
