Even with zero math background
Do you want to become a Data Scientist or machine learning engineer, but you feel intimidated by all the math involved? I get it. I’ve been there.
I dropped out of high school after 10th grade, so I never learned any math beyond trigonometry in school. When I started my journey into Machine Learning, I didn’t even know what a derivative was.
Fast forward to today, and I’m an Applied Scientist at Amazon, and I feel pretty confident in my math skills.
I’ve picked up the necessary math along the way using free resources and self-directed learning. Today I’m going to walk you through some of my favorite books, courses, and YouTube channels that helped me get to where I am today, and I’ll also share some tips on how to study effectively and not waste your time struggling and being bored.
Do You Even Need to Know Math for ML?
First, let’s address a common question: Do you even really need to know the math to work in ML?
The short answer is: it depends on what you want to do.
For research-heavy roles where you’re creating new ML algorithms, yes, you obviously need to know the math. But if you’re asking yourself whether you need to learn math, chances are that’s not the kind of job you’re looking for…
But for practitioners — most of us in the industry — you can often be totally competent without knowing all the underlying details, especially as a beginner.
At this point, libraries like NumPy, scikit-learn, and TensorFlow handle most of the heavy lifting for you. You don’t need to know the math behind gradient descent to deploy a model to production.
If you’re a beginner trying to get into ML, in my opinion it is not strategic to spend a bunch of time memorizing formulas or studying linear algebra — you should be spending that time building things. Train a simple model. Explore your data. Build a pipeline that predicts something fun.
That said, there are moments where knowing the math really helps. Here are a few examples:
Imagine you’re training a model and it’s not converging. If you understand concepts like gradients and how optimization algorithms work, you’ll know whether to adjust your learning rate, try a different optimizer, or tweak your data preprocessing.
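To make the learning rate point concrete, here is a tiny sketch. It assumes a toy loss, f(w) = w², picked only because its gradient (2w) is easy to check by hand, and a hypothetical helper named run_gradient_descent. It is not how any particular library implements training.

```python
def run_gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimize the toy loss f(w) = w**2 with plain gradient descent."""
    w = start
    for _ in range(steps):
        grad = 2 * w                   # derivative of w**2
        w = w - learning_rate * grad   # step against the gradient
    return w

# A small learning rate shrinks w toward the minimum at 0.
small_lr_result = run_gradient_descent(learning_rate=0.1)

# A learning rate that is too large overshoots further on every step.
large_lr_result = run_gradient_descent(learning_rate=1.1)

print("lr=0.1 ends at", small_lr_result)
print("lr=1.1 ends at", large_lr_result)
```

With the small learning rate, w ends up close to zero. With the large one, each step overshoots the minimum and lands further away than where it started, which is one common way a model "doesn’t converge."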
Or, let’s say you’re running a linear regression, and you’re interpreting the coefficients. Without math knowledge, you might miss problems like multicollinearity (features that are highly correlated with each other), which makes those coefficients unreliable. Then you draw incorrect conclusions from the data and cost the company millions and lose your job! Just kidding. Kind of. We do need to be careful when making business decisions from the models we build.
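Here is a rough sketch of one way you might spot multicollinearity with NumPy. The data is synthetic and the variable names are made up for illustration. It uses the correlation between two features and the condition number of the feature matrix as warning signs, which is just one of several possible checks.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

x1 = rng.normal(size=n)
x2_independent = rng.normal(size=n)            # unrelated to x1
x2_collinear = x1 + 0.01 * rng.normal(size=n)  # almost a copy of x1

X_independent = np.column_stack([x1, x2_independent])
X_collinear = np.column_stack([x1, x2_collinear])

# Correlation near 1 means the two features carry nearly the same information.
correlation = np.corrcoef(x1, x2_collinear)[0, 1]

# A large condition number means small changes in the data can swing
# the fitted regression coefficients a lot.
cond_independent = np.linalg.cond(X_independent)
cond_collinear = np.linalg.cond(X_collinear)

print("correlation:", correlation)
print("condition number (independent features):", cond_independent)
print("condition number (collinear features):", cond_collinear)
```

The collinear feature matrix has a much larger condition number than the independent one, which is a hint that its regression coefficients should not be trusted individually.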
So, while you can (and should) get started without deep math knowledge, it’s definitely still reasonable to build your comfort with math over time.
Once you’re hands-on, you’ll start encountering problems that naturally push you to learn more. When you need to debug or explain your results, that’s when the math will start to click, because it’s connected to real problems.
So seriously, don’t let the fear of math stop you from starting. You don’t need to learn it all upfront to make progress. Get your hands dirty with the tools, build your portfolio, and let math grow as a skill alongside your practical knowledge.
What to Learn
Alright, now let’s talk about what to learn when you’re building your math foundation for Machine Learning jobs.
First, linear algebra.
Linear algebra is fundamental for Machine Learning, especially for deep learning. Many models rely on representing data and computations as matrices and vectors. Here’s what to prioritize:
- Matrices and Vectors: Think of matrices as grids of numbers and vectors as lists. Data is often stored this way, and operations like addition, multiplication, and dot products are central to how models process that information.
- Determinants and Inverses: The determinant tells you whether a square matrix can be inverted (it can if the determinant is nonzero). Inverses come up in solving systems of equations and in some optimization problems.
- Eigenvalues and Eigenvectors: These are key to understanding variance in data and are the foundation of techniques like Principal Component Analysis, which helps reduce dimensionality in datasets.
- Lastly, Matrix Decomposition: Methods like Singular Value Decomposition (SVD) are used in recommendation systems, dimensionality reduction, and data compression.
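If it helps to see these in code, the operations in the list above can be sketched with NumPy. The matrix and vector here are arbitrary small examples I picked so the results are easy to verify by hand.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([1.0, 2.0])

# Vectors and matrices: dot product and matrix-vector multiplication.
dot_product = v @ v
A_times_v = A @ v

# Determinant and inverse: a nonzero determinant means A can be inverted.
determinant = np.linalg.det(A)
A_inverse = np.linalg.inv(A)

# Eigenvalues and eigenvectors: directions that A only stretches.
eigenvalues, eigenvectors = np.linalg.eig(A)

# Singular Value Decomposition: A = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(A)
reconstructed = U @ np.diag(S) @ Vt

print("dot product:", dot_product)
print("A @ v:", A_times_v)
print("determinant:", determinant)
print("eigenvalues:", eigenvalues)
print("singular values:", S)
```

Multiplying the three SVD pieces back together recovers the original matrix, which is a nice sanity check when you are first learning what a decomposition means.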
Now we’re on to basic calculus.
Calculus is core to understanding how models learn from data. But, we don’t need to worry about solving complex integrals — it’s just about grasping a few key ideas:
- First, derivatives and gradients: Derivatives measure how things change, and gradients (which are multidimensional derivatives) are what power optimization algorithms like gradient descent. These help models adjust their parameters to minimize error.
- The Chain Rule is central to neural networks. It’s how backpropagation works — which is the process of figuring out how much each weight in the network contributes to the overall error so the model can learn effectively.
- Lastly, optimization basics: Concepts like local vs. global minima, saddle points, and convexity are important to understand why some models get stuck and others find the best solutions.
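As a small illustration of derivatives and the chain rule, here is a sketch that compares a hand-derived derivative with a numerical estimate. The function f(x) = (3x + 1)² is just an example I chose, and the helper names are hypothetical.

```python
def inner(x):
    return 3 * x + 1

def outer(u):
    return u ** 2

def composed(x):
    # f(x) = outer(inner(x)) = (3x + 1)**2
    return outer(inner(x))

def chain_rule_derivative(x):
    # Chain rule: outer'(inner(x)) * inner'(x) = 2 * (3x + 1) * 3
    return 2 * inner(x) * 3

def numerical_derivative(f, x, h=1e-6):
    # Central difference: measure how much f changes around x.
    return (f(x + h) - f(x - h)) / (2 * h)

x = 2.0
analytic = chain_rule_derivative(x)
numeric = numerical_derivative(composed, x)

print("chain rule:", analytic)
print("numerical estimate:", numeric)
```

The two numbers should agree to several decimal places. Backpropagation applies this same chain rule idea across every layer of a network.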
Lastly, statistics and probability.
Statistics and probability are the bread and butter of understanding data. While they’re more associated with data science, there’s definitely a lot of value for ML as well. Here’s what you need to know:
- Distributions: Get familiar with common ones like normal, binomial, and uniform. The normal distribution, in particular, pops up everywhere in data science and ML.
- Variance and covariance: Variance tells you how spread out your data is, while covariance shows how two variables change together. These concepts are really important for feature selection and understanding your data’s structure.
- Bayes’ Theorem: While it has kind of an intimidating name, Bayes’ theorem is a pretty simple but powerful tool for probabilistic reasoning. It’s foundational for algorithms like Naive Bayes — big surprise — which is used for things like spam detection, as well as for Bayesian optimization for hyperparameter tuning.
- You’ll also want to understand Maximum Likelihood Estimation (MLE), which helps estimate model parameters by finding values that maximize the likelihood of your data. It’s a really fundamental concept in algorithms like logistic regression.
- Finally, sampling and conditional probability: Sampling lets you work with subsets of data efficiently, and conditional probability is essential for understanding relationships between events, especially in Bayesian methods.
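To show how simple Bayes’ theorem is in practice, here is a sketch of the spam detection idea. All of the probabilities are made-up numbers for illustration, not measurements from any real dataset.

```python
# Hypothetical numbers, chosen only for illustration.
p_spam = 0.2               # P(spam): prior probability an email is spam
p_word_given_spam = 0.6    # P(word | spam): spam emails containing "free"
p_word_given_ham = 0.05    # P(word | not spam)

p_ham = 1 - p_spam

# Total probability of seeing the word in any email.
p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham

# Bayes' theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam_given_word = p_word_given_spam * p_spam / p_word

print("P(spam | word) =", p_spam_given_word)
```

Under these assumed numbers, seeing the word raises the probability of spam from 0.2 to about 0.75. That update from prior to posterior is the core of Bayesian reasoning.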
Now, this is definitely not exhaustive, but I think it’s a good overview of the common concepts you’ll need to know to do a good job as a data scientist or machine learning engineer.
Next up, I’ll share the best resources to learn these concepts without it being stressful or overwhelming.
Resources
Personally, I would highly recommend starting with a visual and intuitive understanding of the key concepts before you start reading difficult books and trying to solve equations.
For Linear Algebra and Calculus, I cannot speak highly enough about 3Blue1Brown’s Essence of Linear Algebra and Essence of Calculus series. These videos give a solid introduction to what is actually being measured and manipulated when we use these mathematical approaches. More importantly, they show, let’s say, the beauty in it? It’s strange to say that math videos could be inspirational, but these ones are.
For statistics and probability, I am also a huge fan of StatQuest. His videos are clear, engaging, and just a joy to watch. StatQuest has playlists with overviews on core stats and ML concepts.
So, start there. Once you have a visual intuition, you can start working through more structured books or courses.
There are lots of great options here. Let’s go through a few that I personally used to learn:
I completed the Mathematics for Machine Learning Specialization from Imperial College London on Coursera when I was just starting out. The specialization is divided into three courses: Linear Algebra, Multivariate Calculus, and a last one on Principal Component Analysis. The courses are well-structured and include a mix of video lectures, quizzes, and programming assignments in Python. I found the course to be a bit challenging as a beginner, but it was a really good overview and I passed with a bit of effort.
DeepLearning.AI also recently released a Math for ML Specialization on Coursera. This Specialization also has courses on Linear Algebra and Calculus, but instead of PCA the final course focuses on Stats and Probability. I’m personally working through this Specialization right now, and overall I’m finding it to be another really great option. Each module starts with a nice motivation for how the math connects to an applied ML concept, it has coding exercises in Python, and some neat 3D tools to mess around with to get a good visual understanding of the concepts.
If you prefer learning from books, I have some suggestions there too. First up, if you like anime or nerdy stuff, oh boy do I have a recommendation for you.
Did you know they have manga math books?
The Manga Guide to Calculus
The Manga Guide to Linear Algebra
The Manga Guide to Statistics
These are super fun. I can’t say that the instructional quality is world-class or anything, but they are cute and engaging, and they made me not dread reading a math book.
The next level up would be “real” math books. These are some of the best:
The Mathematics for Machine Learning ebook by Deisenroth and colleagues is a great comprehensive resource available for free for personal use. It covers key topics we’ve already discussed like Linear Algebra, Calculus, Probability, and Optimization, with a focus on how these concepts apply to machine learning algorithms. It’s relatively beginner-friendly and is generally regarded as one of the best books for learning this material.
Next, Practical Statistics for Data Scientists is another well-loved resource that includes code examples in Python and R.
How to Study
Now, before we actually start studying, I think it’s important to spend a little bit of time thinking really deeply about why you even want to do this. Personally, I find that if I’m studying just because I feel like I “should,” or because it’s some arbitrary assignment, I get distracted easily and don’t actually retain much.
Instead, I try to connect to a deeper motivation. Personally, right now I have a really basic motivation: I want to earn a lot of money so that I can take care of everyone I love. I have this opportunity to push myself and make sure everyone is safe and cared for, now and in the future. This isn’t to put extra pressure on myself, but actually just a way that works for me to get excited that I have this opportunity to learn and grow and hopefully help others along the way. Your motivation might be totally different, but whatever it is, try to tie this work to a larger goal.
In terms of strategies for optimizing your study time, I have found that one of the most effective methods is writing notes in my own words. Don’t just copy definitions or formulas — take time to summarize concepts as if you were explaining them to someone else — or, to future you. For example, if you’re learning about derivatives, you might write, “A derivative measures how a function changes as its input changes.” This forces you to actively process the material.
Relatedly, when it comes to math formulas, don’t just stare at them. Translate them into plain English, or whatever spoken language you prefer. For instance, take the equation y=mx+b: you might describe m as “the slope that shows how steep the line is,” and b as “the point where the line crosses the y-axis.” So the final translation might be, “The value of y (the output) is determined by taking the slope (m), multiplying it by x (the input), and then adding b (the starting point where the line intersects the y-axis).”
You can even use your notes as a kind of personal blog. Writing short posts about what you’ve learned is a really solid way to clarify your understanding, and teaching others (even if no one reads it) solidifies the material in your own mind. Plus, sharing your posts on Medium or LinkedIn not only potentially helps others but also allows you to build a portfolio showcasing your learning journey.
Also trust me, when it’s interview time you’ll be happy you have these notes! I use my own study notes all the time.
This next piece of advice might not be super fun, but I also recommend not using just one resource. Personally I’ve had a lot of success from taking many different courses, and kind of throwing all my notes together at first. Then, I’ll write a blog post like I was just talking about that summarizes all of my learnings.
There are a couple of advantages to this approach: First, repetition helps you retain things. If I see a concept multiple times, explained from multiple angles, I’m much more likely to actually get what’s going on and remember that for longer than a day. Plus, not only do I see the information presented to me multiple times, I’m writing the concepts out in my own words multiple times, including that final time where I synthesize it all and get it ready to share with others — so I have to be really confident I actually got it by the end.
Finally, once you’ve built that foundation and get to the level of math where you can actually use it for stuff, I really recommend coding concepts from scratch. If you can code gradient descent or logistic regression using just NumPy, you’re off to a really strong start.
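To give you a sense of what "from scratch" might look like, here is a minimal sketch of logistic regression trained with gradient descent in NumPy. The synthetic data, the function names, and the hyperparameters (learning rate 0.1, 2000 steps) are all assumptions I picked for the example. A real project would add things like a train/test split and regularization.

```python
import numpy as np

def sigmoid(z):
    # Squash any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, learning_rate=0.1, steps=2000):
    """Fit weights and bias by gradient descent on the average log loss."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(steps):
        probs = sigmoid(X @ weights + bias)   # predicted probabilities
        error = probs - y                     # gradient of log loss w.r.t. the logits
        grad_w = X.T @ error / n_samples
        grad_b = error.mean()
        weights -= learning_rate * grad_w     # step against the gradient
        bias -= learning_rate * grad_b
    return weights, bias

# Synthetic data: the label is 1 when the two features sum to more than 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

weights, bias = fit_logistic_regression(X, y)
predictions = (sigmoid(X @ weights + bias) >= 0.5).astype(float)
accuracy = (predictions == y).mean()

print("weights:", weights, "bias:", bias)
print("training accuracy:", accuracy)
```

Notice how everything from the earlier sections shows up here: matrix multiplication, gradients, and a likelihood-based loss. Writing it yourself once makes the library version feel a lot less like magic.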
Again, Math (Probably) Won’t Get You a Job
While I know at this point you’re super excited to start learning math, I do want to just circle back to the important fact that if you’re a beginner trying to get your first job, in my opinion math should not be the first thing you prioritize.
It is really unlikely that your math skills are what will get you a job as a data scientist or machine learning engineer.
Instead, prioritize gaining hands-on experience by working on projects and actually building stuff. Employers are far more interested in seeing what you can do with the tools and knowledge you already have than how many formulas you’ve memorized.
As you encounter challenges in your work, you’ll naturally be motivated to learn the math behind the algorithms. Remember, math is a tool to help you succeed, and shouldn’t be a barrier to getting started.
—
If you want more advice on how to break into data science, you can download a free 80+ page e-book on how to get your first data science job (learning resources, project ideas, LinkedIn checklist, and more): https://gratitudedriven.com/
Or, check out my YouTube channel!
Finally, just a heads up, there are affiliate links in this post. So, if you buy something I’ll earn a small commission, at no additional cost to you. Thank you for your support.