While working on the Kaggle “House Prices” challenge, I came across a neat technique called “out-of-bag” (OOB) evaluation. It’s a way to estimate a Random Forest’s accuracy without setting aside extra data, similar in spirit to cross-validation.
The idea: a Random Forest is a collection of decision trees, and when each tree in the forest is built, only a portion of the training data is used. This leaves the rest of the training data as a mini test set for that specific tree. The table below illustrates how this works:

| Tree | Trained on (in-bag) | Held out (out-of-bag) |
| ------ | ---------------------------- | ---------------------- |
| Tree 1 | all houses except #3 | #3 |
| Tree 2 | all houses except #2, #4, #6 | #2, #4, #6 |
| Tree 3 | all houses except #1, #5 | #1, #5 |

In this example, the training data has 6 houses, from which a Random Forest of 3 trees is built. Each tree is trained on 6 examples sampled from the training data with replacement, so some houses appear multiple times while others are left out entirely (a short code sketch after the list below shows this sampling in action):
- Tree 1: Uses all houses except house #3, so we can test the tree on house #3.
- Tree 2: Uses all houses except house #2, #4, and #6, so we can test the tree on house #2, #4, and #6.
- Tree 3: Uses all houses except house #1 and #5, so we can test the tree on house #1 and #5.
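To make the sampling concrete, here’s a minimal sketch of how drawing with replacement leaves some rows out-of-bag. This is an illustrative simulation, not Random Forest internals: the row indices are 0-based and the seed is arbitrary, so the exact in-bag/out-of-bag split won’t match the table above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility
n_houses = 6

for tree_id in (1, 2, 3):
    # Draw 6 row indices with replacement, as when building each tree.
    in_bag = rng.choice(n_houses, size=n_houses, replace=True)
    # Rows that were never drawn become that tree's out-of-bag test set.
    oob = sorted(set(range(n_houses)) - set(in_bag))
    print(f"Tree {tree_id}: in-bag rows {sorted(in_bag)}, out-of-bag rows {oob}")
```

On average, each bootstrap sample misses about a third of the rows (the familiar 1/e ≈ 36.8% figure), so every tree ends up with a decent-sized held-out set.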
This way, every house gets predicted by trees that never saw it during training; averaging those held-out predictions per house and scoring them gives us a reliable estimate of how well our model will do in the real world.
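In practice you rarely compute this by hand. Here’s a minimal sketch using scikit-learn, where passing `oob_score=True` makes the forest score itself on each row’s out-of-bag predictions after fitting (the synthetic dataset is just a stand-in for the actual House Prices data):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data as a stand-in for the House Prices training set.
X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=42)

# oob_score=True tells the forest to evaluate each row using only the
# trees that did not see that row during training.
model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=42)
model.fit(X, y)

print(f"OOB R^2: {model.oob_score_:.3f}")  # no separate validation split needed
```

Because the OOB estimate comes for free from data each tree already skipped, you get to keep the full training set for fitting instead of carving out a validation fold.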
Hope this article helped! Any feedback is appreciated!