Before we learn what Bagging is and how it is used, we need to understand one very important topic – Bootstrapping. We assume you have some familiarity with Decision Trees and Random Forests, as understanding them is necessary before starting with this topic.

What is Ensemble Learning?

Ensemble learning is a machine learning paradigm in which multiple models, called weak learners, are combined to form a more robust and accurate model. Most of the time these weak learners don’t perform well on their own because they have high bias or too much variance. The ensemble technique does something smarter: it combines several of these weak models to reduce bias and variance, giving rise to a strong model (called the ensemble model) that gets better results.
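To see why averaging reduces variance, here is a minimal sketch (not from the original article) that treats each “weak learner” as a noisy estimate of a true value and shows that the ensemble average has a much smaller spread. The numbers (200 trials, 50 learners, unit noise) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 1.0

# 200 trials, each with 50 "weak learners": every learner is just a
# noisy estimate of the true value (standard deviation 1.0)
single_preds = true_value + rng.normal(0.0, 1.0, size=(200, 50))

# ensemble: average the 50 weak estimates in each trial
ensemble_preds = single_preds.mean(axis=1)

print(np.std(single_preds[:, 0]))  # spread of one weak learner (~1)
print(np.std(ensemble_preds))      # much smaller spread after averaging
```

The spread of the averaged prediction shrinks roughly by a factor of the square root of the number of learners, which is the intuition behind combining weak models.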

One of the ways of aggregating models is called Bagging (Bootstrap Aggregating). Bagging is an ensemble technique that trains homogeneous weak learners on different bootstrap samples of the dataset, in parallel and independently of each other, and then combines their outputs (by averaging or voting) to predict the result. The bagging method is mostly used to reduce the variance of a decision tree classifier. Each subset drawn from the dataset is used to train a weak learner in parallel, and the learners are finally combined to get the result.

You can find my other article on Boosting in DT for further reading!

So before we jump into Bagging, let us understand what Bootstrapping is.

Bootstrapping is a statistical method for estimating a quantity from a data sample, such as a mean or a standard deviation. It estimates the quantity by repeatedly sampling the dataset with replacement: each observation drawn into a bootstrap sample is returned to the pool before the next draw, so the same observation can appear more than once in the same sample.

Because sampling is done with replacement, the same data point can appear in more than one bootstrap sample, and even more than once within a sample. This process of sampling is repeated many times, and then the outputs are averaged to get the final result, which reduces the variance of the estimate. A common rule of thumb is to make each sample 20-30% of the size of the dataset and to repeat the process at least 20 times. We don’t have to implement the bootstrap algorithm on our own, as the scikit-learn library provides it through the resample() function.
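As a quick sketch of how `resample()` can be used to draw one bootstrap sample (the toy list of ten numbers and the sample size of 4 are my own illustrative choices, not from the article):

```python
from sklearn.utils import resample

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# draw one bootstrap sample: replace=True means the same value
# can be drawn more than once
boot = resample(data, replace=True, n_samples=4, random_state=1)

# the "out-of-bag" points are those that were never drawn
oob = [x for x in data if x not in boot]

print(boot)
print(oob)
```

Repeating this draw many times and computing a statistic (say, the mean) on each sample gives the bootstrap estimate of that statistic.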


How to implement the Bagging technique?

We create multiple bootstrap samples, so that each sample acts as an independent dataset drawn from the training dataset. After this, we fit a weak learner on each sample and combine them into a strong learner by averaging the results. This process reduces the variance of the model. Bagging is the application of the bootstrap procedure to high-variance machine learning algorithms, usually decision trees. Individual decision trees tend to produce similar, highly correlated predictions. Bagging reduces the overfitting of the model, handles high-dimensional data very well, and remains reasonably accurate when the dataset has missing values.
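The procedure above – bootstrap, fit a tree on each sample, then vote – can be sketched by hand as follows. The synthetic dataset and the choice of 25 trees are illustrative assumptions, not values from the article:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

# a synthetic binary classification dataset for illustration
X, y = make_classification(n_samples=500, random_state=0)

trees = []
for i in range(25):
    # each bootstrap sample is drawn with replacement from the training set
    X_b, y_b = resample(X, y, replace=True, random_state=i)
    trees.append(DecisionTreeClassifier(random_state=i).fit(X_b, y_b))

# aggregate: majority vote across the 25 trees
votes = np.array([t.predict(X) for t in trees])
bagged_pred = (votes.mean(axis=0) >= 0.5).astype(int)

print((bagged_pred == y).mean())  # accuracy of the bagged ensemble
```

In practice you would not write this loop yourself; scikit-learn wraps exactly this logic in a ready-made estimator.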

Syntax for the Bagging algorithm:
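A minimal sketch using scikit-learn's `BaggingClassifier`; the synthetic dataset and the parameter values (50 estimators, full-size samples) are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# a synthetic dataset for illustration
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

bag = BaggingClassifier(
    DecisionTreeClassifier(),  # the weak learner to bag
    n_estimators=50,           # number of bootstrap samples / trees
    max_samples=1.0,           # each sample is as large as the training set
    bootstrap=True,            # sample with replacement
    random_state=0,
)
bag.fit(X_train, y_train)
print(bag.score(X_test, y_test))  # accuracy on held-out data
```

`BaggingRegressor` offers the same interface for regression tasks, averaging the predictions instead of voting.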


You can learn more about the Bagging algorithm here.


So, as we see, bagging methods are a very simple way to improve a model without changing the underlying base algorithm. Because they provide a way to reduce overfitting, bagging methods work best with strong, complex models (e.g., fully developed decision trees), in contrast with boosting methods, which usually work best with weak models (e.g., shallow decision trees).
