A Case for Bagging and Boosting as Data Scientists’ Best Friends | by Farzad Nobar | Dec, 2024


Leveraging wisdom of the crowd in ML models.

Towards Data Science

In recent years, we have come to take resources such as Wikipedia and Reddit for granted. These resources rely on the collective knowledge of individual contributors to serve us with mostly accurate information, an effect sometimes called the "wisdom of the crowd". The idea is that a collective decision can be more accurate than any individual's judgement: each of us has implicit biases and gaps in knowledge that introduce some level of error into our judgement, but collectively these errors can offset each other. For example, I can compensate for someone else's lack of expertise in one area, while they make up for mine in another. Applying this idea to machine learning gives us "ensemble" methods.

At a very high level, we train machine learning models to make predictions about the future: we provide a model with training data in the hope that it will generalize well to data it has not seen. But what if we could train several machine learning models and then somehow aggregate their predictions? It turns out this can be a very effective approach, and one that is widely used in industry.
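To make this concrete, here is a minimal sketch (the dataset and model choices are illustrative, not from the article) that trains three different classifiers and aggregates their predictions by majority vote using scikit-learn's VotingClassifier:

```python
# Train several models on the same data and combine their predictions
# by majority vote -- a simple "wisdom of the crowd" ensemble.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

# Illustrative synthetic dataset (not from the article).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Three different "opinions": each model has its own biases and errors.
models = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(random_state=42)),
    ("knn", KNeighborsClassifier()),
]

# Hard voting: the ensemble predicts the class chosen by the majority of models.
ensemble = VotingClassifier(estimators=models, voting="hard")
ensemble.fit(X_train, y_train)

acc = accuracy_score(y_test, ensemble.predict(X_test))
print(f"ensemble accuracy: {acc:.3f}")
```

The errors of the individual models partially cancel out in the vote, which is exactly the intuition behind bagging and boosting explored below.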
