Missing Data in Time-Series? Machine Learning Techniques (Part 2) | by Sara Nóbrega | Jan, 2025


Employ clustering algorithms to handle missing time-series data

Towards Data Science

(If you haven’t read Part 1 yet, check it out here.)

Missing data in time-series analysis is a recurring problem.

As we explored in Part 1, simple imputation techniques, or even regression-based models such as linear regression and decision trees, can get us a long way.

But what if we need to handle more subtle patterns and capture fine-grained fluctuations in complex time-series data?

In this article, we will explore K-Nearest Neighbors (KNN). One of its main strengths is that it makes few assumptions about the relationships in your data, including nonlinear ones, which makes it a versatile and robust option for missing-data imputation.
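To make the idea concrete before we get to the dataset, here is a minimal sketch of KNN imputation using scikit-learn's `KNNImputer`. The toy series, the `hours` feature, and the choice of `n_neighbors=4` are assumptions made for illustration only, not necessarily the setup used later in this article.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical toy data: hour of day and a smooth, nonlinear production curve.
hours = np.arange(24, dtype=float)
production = 100 + 50 * np.sin(np.pi * hours / 24)

# Knock out a few values to simulate gaps.
observed = production.copy()
missing_idx = [5, 12, 18]
observed[missing_idx] = np.nan

# Each row is (hour, production). For a row with a missing production value,
# KNNImputer measures distance on the features that are present (here, hour),
# picks the nearest rows that do have a production value, and averages them.
X = np.column_stack([hours, observed])
imputer = KNNImputer(n_neighbors=4, weights="distance")
X_filled = imputer.fit_transform(X)
imputed = X_filled[:, 1]

# Compare the imputed values with the true ones we removed.
for i in missing_idx:
    print(f"hour {i:2d}: true={production[i]:.2f}, imputed={imputed[i]:.2f}")
```

Notice that no functional form is fitted here: each imputed value is just a distance-weighted average of nearby observations, which is why the method copes with nonlinear shapes without being told about them.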

We will be using the same mock energy production dataset that you've already seen in Part 1, with 10% of its values removed at random.

You can easily generate this dataset yourself, so you can follow along and apply each technique as we go through the process step by step!
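If you don't have the Part 1 data at hand, the sketch below builds a stand-in with the same general shape: a mock energy production series with 10% of its values removed at random. The date range, the hourly frequency, the sinusoidal signal, and the column names `energy_production` and `energy_with_gaps` are all assumptions for illustration and may differ from the Part 1 dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed so the gaps are reproducible

# Hypothetical stand-in: ten days of hourly readings.
index = pd.date_range("2023-01-01", periods=24 * 10, freq=pd.Timedelta(hours=1))

# Daily cycle plus Gaussian noise as a mock production signal.
hours = index.hour.to_numpy()
energy = 150 + 50 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, len(index))
df = pd.DataFrame({"energy_production": energy}, index=index)

# Remove 10% of the values at randomly chosen positions (no repeats).
n_missing = len(df) // 10
missing_pos = rng.choice(len(df), size=n_missing, replace=False)
df["energy_with_gaps"] = df["energy_production"]
df.iloc[missing_pos, df.columns.get_loc("energy_with_gaps")] = np.nan

print(df["energy_with_gaps"].isna().mean())  # fraction of missing values
```

Keeping the complete `energy_production` column next to the gapped one is a deliberate choice: it lets you score any imputation method against the values that were removed.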
