What’s Wrong With R-Squared (And How to Fix It) | by Samuele Mazzanti | Aug, 2024


R-Squared is one of the most popular metrics for evaluating regression models. It’s taught in virtually every statistics class, and it’s one of the metrics implemented in scikit-learn.

However, some doubts have been raised about the reliability of this metric. In the notes for his course at Carnegie Mellon University, Professor Cosma Shalizi claims that R-Squared is useless.

So, should we completely dismiss R-Squared?

I don’t think so.

I admit that this metric has one major flaw, but I also think we shouldn’t lose sight of the positives. In this article, I will explain what is wrong with R-Squared, and suggest a modification that makes it fully reliable.

To grasp what the problem with R-Squared is, we first need to understand its meaning. And I mean the deeper meaning, not the sloppy definitions that can be found in most resources.
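As a reference point, here is a minimal sketch of the conventional textbook formula, R-Squared = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the targets from their mean. The function name `r_squared` and the numbers are made up for illustration; they are not taken from the article.

```python
def r_squared(y_true, y_pred):
    """Conventional R-Squared: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Sum of squared errors of the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Sum of squared errors of a baseline that always predicts the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical targets and predictions, invented for illustration only
y_true = [100, 200, 300, 400]
y_pred = [110, 190, 310, 390]

score = r_squared(y_true, y_pred)

# A model that always predicts the mean of y_true scores exactly zero
baseline_score = r_squared(y_true, [250, 250, 250, 250])
```

Under this definition, the score compares the model's squared error with that of a baseline that always predicts the mean of the observed targets, so a score of zero means "no better than the mean".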

Let’s start with an example. Suppose we have a predictive model (“model A”) designed to forecast the selling price of a house.

Imagine that our test set consists of four houses. We can visually check the…
