Mathematics for Machine Learning: Variance, Covariance, and Covariance Matrices | by Gary Drocella | Feb, 2025

In machine learning, it is common to perform dimensionality reduction on a dataset so that it becomes easier to visualize and reason about. Dimensionality reduction can take a dataset with thousands of features and narrow it down to one or several features. Computing the eigenvalues and eigenvectors of a covariance matrix is an important step in dimensionality reduction, for example in principal component analysis (PCA), which motivates us to learn about variance, covariance, and covariance matrices.

Finding the Mean of a Dataset

Given a dataset of (x, y) points, you can compute the average of each coordinate with the following formulas:
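In standard notation, and assuming the dataset consists of n points (x_1, y_1), ..., (x_n, y_n), the two means are usually written as shown here. The symbols n, x_i, and y_i are notation chosen for this sketch.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```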

Together, the mean of x and the mean of y give the coordinates of a middle point of the data, sometimes called the centroid.
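As a minimal sketch of the mean calculation, the snippet below averages each coordinate of a small dataset. The helper name `mean` and the five sample points are hypothetical, chosen only for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Hypothetical sample data: five (x, y) points stored as two parallel lists.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar = mean(xs)  # 15 / 5
y_bar = mean(ys)  # 20 / 5

print((x_bar, y_bar))  # (3.0, 4.0)
```

The pair (x_bar, y_bar) is the middle point of these five samples.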

Finding the Variance of a Dataset

The variance of a dataset measures how spread out the data is, and it can be computed with the following formula:
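Assuming the population form of the variance, which divides by n, one common way to write it for a feature x with mean \bar{x} is shown here. The sample form divides by n - 1 instead. The choice of the population form is an assumption of this sketch.

```latex
\operatorname{Var}(x) = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2
```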

It is the average squared distance of the data points from the mean.
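The "average squared distance" description can be sketched directly in code. This example assumes the population variance (dividing by n) and cross-checks the result against `statistics.pvariance` from the Python standard library. The function name `variance` and the sample data are hypothetical.

```python
import statistics


def variance(values):
    """Population variance: the mean of squared deviations from the mean."""
    if not values:
        raise ValueError("variance requires at least one value")
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


# Hypothetical sample data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Deviations from the mean (3.0) are -2, -1, 0, 1, 2; their squares sum to 10.
var_x = variance(xs)
print(var_x)  # 2.0

# The standard library agrees on the population form.
print(statistics.pvariance(xs))  # 2.0

# The sample form divides by n - 1 instead, so it is slightly larger.
print(statistics.variance(xs))  # 2.5
```

Which form to use depends on whether the data is treated as the whole population or as a sample drawn from it.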

Finding the Covariance of Two Features of a Dataset

The covariance measures how two features of a dataset vary together. A positive covariance means that the data is trending upwards (one feature tends to increase as the other increases), a negative covariance means that the data is trending downwards (one feature tends to decrease as the other increases), and a…
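As a minimal sketch of the idea, and assuming the population form (dividing by n), the covariance of two features can be computed as the average product of their paired deviations from their means: cov(x, y) = (1/n) Σ (x_i − x̄)(y_i − ȳ). The function names and sample data below are hypothetical. The last step arranges the values into a 2x2 covariance matrix, using the fact that the covariance of a feature with itself is its variance.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance: mean product of paired deviations from the means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical sample data: one upward-trending and one downward-trending feature.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_up = [2.0, 4.0, 5.0, 4.0, 5.0]
ys_down = [5.0, 4.0, 3.0, 2.0, 1.0]

cov_up = covariance(xs, ys_up)      # positive: y tends to rise with x
cov_down = covariance(xs, ys_down)  # negative: y tends to fall as x rises
print(cov_up, cov_down)  # 1.2 -2.0

# 2x2 covariance matrix for (xs, ys_up): variances on the diagonal,
# covariances off the diagonal. The matrix is symmetric.
cov_matrix = [
    [covariance(xs, xs), covariance(xs, ys_up)],
    [covariance(ys_up, xs), covariance(ys_up, ys_up)],
]
print(cov_matrix)
```

A matrix of this form is what an eigenvalue and eigenvector computation for dimensionality reduction would take as input.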
