Diving into the F-test for nested models with algorithms, examples and code
When analyzing data, one often needs to compare two regression models to determine which one fits the data best. Often, one model is a simpler version of a more complex model that includes additional parameters; the simpler model is then said to be "nested" within the complex one. However, more parameters do not guarantee that the more complex model is actually better, as they could simply overfit the data.
To determine whether the added complexity is statistically significant, we can use what’s called the F-test for nested models. This statistical technique evaluates whether the reduction in the Residual Sum of Squares (RSS) due to the additional parameters is meaningful or just due to chance.
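Concretely, if the simpler model has residual sum of squares RSS1 with p1 parameters, the fuller model has RSS2 with p2 parameters, and there are n data points, the test statistic is F = [(RSS1 − RSS2)/(p2 − p1)] / [RSS2/(n − p2)], which under the null hypothesis follows an F distribution with (p2 − p1, n − p2) degrees of freedom. As a minimal sketch of this computation (in Python with scipy, rather than the Matlab used later in the article; the function name and argument order are my own):

```python
from scipy import stats

def f_test_nested(rss_simple, p_simple, rss_full, p_full, n):
    """F-test comparing a simple (nested) model against a fuller model.

    rss_*: residual sums of squares; p_*: numbers of fitted parameters;
    n: number of data points. Returns the F statistic and its p-value.
    """
    df_num = p_full - p_simple   # extra parameters in the fuller model
    df_den = n - p_full          # residual degrees of freedom of the fuller model
    F = ((rss_simple - rss_full) / df_num) / (rss_full / df_den)
    p_value = stats.f.sf(F, df_num, df_den)  # right-tail probability
    return F, p_value

# Example: the fuller model (3 parameters) reduces RSS from 120 to 100
# on n = 30 points; is that reduction worth the extra parameter?
F, p = f_test_nested(120.0, 2, 100.0, 3, 30)
```

A small p-value (say, below 0.05) indicates that the RSS reduction achieved by the extra parameters is unlikely to be due to chance alone.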
In this article I explain the F-test for nested models, present a step-by-step algorithm, demonstrate it in pseudocode, and provide Matlab code that you can run right away or re-implement in your favorite system (I chose Matlab because it gave me quick access to statistics and fitting functions that I didn't want to implement myself). Throughout the article we will see the F-test for nested models at work in a couple of settings, including the examples built into the accompanying Matlab code.