Learnings from a Machine Learning Engineer — Part 3: The Evaluation | by David Martin | Jan, 2025

Practical insights for a data-driven approach to model optimization

In this third part of my series, I explore the evaluation process, a critical step that leads to a cleaner data set and better model performance. We will see the difference between evaluating a trained model (one not yet in production) and evaluating a deployed model (one making real-world predictions).

In Part 1, I discussed the process of labelling the image data for your image classification project. I showed how to define “good” images and create sub-classes. In Part 2, I went over data sets beyond the usual train-validation-test split, such as benchmark sets, and how to handle synthetic data and duplicate images.

Evaluation of the trained model

As machine learning engineers, we look at accuracy, F1, log loss, and other metrics to decide if a model is ready to move to production. These are all important measures, but in my experience these scores can be deceiving, especially as the number of classes grows.
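To illustrate how an aggregate score can hide a weak class, here is a minimal sketch in plain Python. The function name `per_class_report` and the toy labels are hypothetical and not taken from the original project; the idea is only to compare overall accuracy with per-class precision, recall, and F1.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return overall accuracy plus per-class precision, recall, F1 and support."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    accuracy = sum(tp.values()) / len(y_true)
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return accuracy, report


# Hypothetical toy data: one large class and one small class.
y_true = ["cat"] * 90 + ["lynx"] * 10
y_pred = ["cat"] * 90 + ["cat"] * 8 + ["lynx"] * 2

accuracy, report = per_class_report(y_true, y_pred)
print(f"overall accuracy: {accuracy:.2f}")
for label, scores in report.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

In this made-up example the overall accuracy is 0.92, yet the small class is recalled only 2 times out of 10. With many classes, a few such weak classes can easily disappear inside a healthy-looking average.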

Although it can be time-consuming, I find it very important to manually review the images that the model gets wrong, as well as the…
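A manual review of wrong predictions could start from a simple queue like the one sketched here. This is an assumption about how such a review might be organised, not the author's tooling: the record layout (path, true label, predicted label, confidence), the file names, and the function `review_queue` are all hypothetical. Errors are grouped by (true, predicted) pair, with the largest groups first and the most confident mistakes at the top of each group.

```python
from collections import defaultdict


def review_queue(predictions):
    """Group misclassified images by (true, predicted) label pair.

    `predictions` is an iterable of (path, true_label, pred_label, confidence).
    Returns a list of ((true_label, pred_label), [paths]) with the largest
    error groups first and, inside each group, the most confident errors first.
    """
    groups = defaultdict(list)
    for path, true_label, pred_label, confidence in predictions:
        if true_label != pred_label:
            groups[(true_label, pred_label)].append((confidence, path))

    queue = []
    for pair, items in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        items.sort(reverse=True)  # highest confidence first
        queue.append((pair, [path for _, path in items]))
    return queue


# Hypothetical prediction records for illustration only.
predictions = [
    ("img_001.jpg", "cat", "cat", 0.98),
    ("img_002.jpg", "lynx", "cat", 0.91),
    ("img_003.jpg", "lynx", "cat", 0.55),
    ("img_004.jpg", "cat", "lynx", 0.62),
    ("img_005.jpg", "lynx", "lynx", 0.80),
]

queue = review_queue(predictions)
for (true_label, pred_label), paths in queue:
    print(f"{true_label} predicted as {pred_label}: {paths}")
```

Sorting by confidence is a design choice: a mistake the model is very sure about may point to a labelling problem rather than a model problem, so it is often worth looking at those images first.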
