Data Scaling 101: Standardization and Min-Max Scaling Explained | by Haden Pelletier | Aug, 2024


When to use MinMaxScaler vs StandardScaler vs something else

Towards Data Science

What is scaling?

When you first load a dataset into your Python script or notebook, and take a look at your numerical features, you’ll likely notice that they are all on different scales.

This means that each column or feature has a different range of values. For example, one feature may have values between 0 and 1, while another ranges from 1,000 to 10,000.

Take the Wine Quality dataset from the UCI Machine Learning Repository (CC BY 4.0 license), for example.

A few features from the UCI Wine Quality dataset. Image by author
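To make the idea concrete, here is a small sketch with illustrative values (not actual rows from the UCI dataset) showing how wine-quality-style columns sit on very different scales:

```python
import pandas as pd

# Illustrative values only -- not actual rows from the UCI Wine Quality dataset.
# The point is that the columns live on very different scales.
df = pd.DataFrame({
    "fixed acidity":        [7.4, 7.8, 11.2, 7.4],
    "volatile acidity":     [0.70, 0.88, 0.28, 0.66],
    "total sulfur dioxide": [34.0, 67.0, 60.0, 40.0],
})

# Compare each column's range
print(df.agg(["min", "max"]))
```

"volatile acidity" stays below 1 while "total sulfur dioxide" reaches into the tens, which is exactly the mismatch scaling addresses.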

Scaling is essentially the process of transforming all the features onto a similar or identical range, such as mapping every value into the interval between 0 and 1.
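Mapping values into [0, 1] is exactly what min-max scaling does, via (x − min) / (max − min). A minimal sketch of that formula, equivalent in effect to scikit-learn's `MinMaxScaler` (the sample values below are made up for illustration):

```python
import numpy as np

def min_max_scale(x: np.ndarray) -> np.ndarray:
    """Rescale each column to [0, 1]: (x - min) / (max - min)."""
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

# Two features on wildly different scales (illustrative values)
X = np.array([[1000.0, 0.2],
              [5500.0, 0.5],
              [10000.0, 1.0]])

X_scaled = min_max_scale(X)
# Each column now spans exactly [0, 1]
```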

When (and why) you need to scale

There are a few reasons why scaling features before fitting/training a machine learning model is important:

  1. Ensures that all features contribute equally to the model. When one feature has a large and…
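Standardization is the other common way to achieve this equal footing. A minimal sketch mirroring what scikit-learn's `StandardScaler` computes, z = (x − mean) / std, with made-up values where one feature would otherwise dominate:

```python
import numpy as np

def standardize(x: np.ndarray) -> np.ndarray:
    """Z-score each column: (x - mean) / std, giving mean 0 and std 1."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

# One feature in the thousands, one below 1 (illustrative values).
# Unscaled, distance-based models would be dominated by the first column.
X = np.array([[1200.0, 0.3],
              [3400.0, 0.7],
              [9800.0, 0.1]])

X_std = standardize(X)
# After standardization both columns have mean 0 and std 1,
# so each contributes comparably to the model.
```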
