7 Data Science Projects to Land a 6 Figure Job



 

I’ve helped review many data science resumes.

Along the way, I’ve realized that the best candidates aren’t the ones who have graduated from top universities or have a Master’s degree.

They are the ones who show interest in the field.

The best data science candidates apply their knowledge of the field in creative ways, showcasing their skills through projects that stand out. For example, one junior data scientist built a stock prediction model for his personal use and went on to publish the project online. Although his model was simple, it showed employers that he was passionate about what he did. He took the skills he had and used them to build something of value.

Another candidate built a PowerBI dashboard that tracked his purchases and displayed a weekly summary of his spending behavior. Again, despite its simplicity, this project demonstrated the applicant’s ability to solve a real, personal problem with data.

I’ve seen these candidates get hired over other qualified applicants simply because of these projects. The projects showed employers that the candidates could build useful things with data, and that they were doing so because they enjoyed it.

In my experience, when it comes to entry-level candidates, employers tend to value creativity and passion over pure technical skills.

While technical skills can be taught, the desire to solve problems that go beyond your 9-5 is a great trait to have and is one that hiring managers love to see. In fact, four years ago, I landed my first data science internship with a large company solely because of the projects I displayed on my resume.

In this article, I’m going to share data science project ideas that will actually help you stand out. These are creative projects that solve problems with data, and I’ve included source code and tutorials to help you replicate them.

 

1. Credit Card Approval Project

 
This project was published on the Internet by a fellow data scientist, and I found it interesting. It is an application that predicts whether a person will get approved for a credit card.

After building the model, the creator also deployed it on Streamlit for others to access:

 

Credit Card Approval
Image by Author

 

All you need to do is answer the questions on the app, and you will be told whether you qualify for a credit card. This project can easily be accessed by other people and solves a real problem, which is why I like it. If you are a junior data scientist, a simple project like this can help you stand out due to its interactivity.

Source Code: Credit Card Approval Prediction on GitHub

 

2. Personality Prediction Model

 
I created this project around 4 years ago, and it’s partly the reason I secured my first data science role.

 

Full Stack Machine Learning Project
Image by Author

 

This project was a Dash application that could be accessed by other people. All you have to do is input a sentence, and the application will predict which Harry Potter character you are, based on your input. I built this because I was obsessed with Harry Potter and personality tests.

During my data science interview, I remember being asked specifically about this project since the hiring manager had tried using the application, and we had a great chat because they liked Harry Potter too.

This fun project that I created in my free time helped me get my first data science role!

Note: Dash is a Python framework from Plotly for building web applications, which makes it a convenient way to wrap a machine-learning model in an interactive app. I recommend it for portfolio projects because it lets you create an entire app that people can interact with, instead of simply showcasing a GitHub repository full of code.

Tutorial and Code: A Full Stack Machine-Learning Project

 

3. Celebrity Facial Recognition App

 
This is another project I created years ago, when I was teaching myself deep learning. I built an application that would let you upload a picture of yourself, and the app would predict which celebrity you looked like, as pictured below:

 

Facial Recognition Project
Image by author

 

This is another fun project that showcased my interest in deep learning. I used Flask to create the web application and HTML, CSS, and JavaScript to build the frontend. As data scientists, we usually excel at writing procedural code using Python. However, I recommend going a step further and taking a tutorial or two in web design. This will help you deploy your project for others to interact with and is likely to capture the attention of potential employers.

I’ve also linked my source code below if you’d like to replicate this project.

Source Code: Facial Recognition App

 

4. Web Scraping Product Reviews

 
Web scraping is one of the most time-effective ways to collect third-party data. Companies often require external data to perform tasks like language modeling and market research.

In fact, when working with a past organization, I scraped data from the Internet and used it to fine-tune a domain-specific LLM. There are a multitude of business use cases for web scraping, and it is very likely that you’ll find yourself doing this as a data scientist. Therefore, I recommend building a web-scraping project and showcasing it in your portfolio.

I have built a project scraping book reviews on Amazon and will leave the code below if this is something you’d like to replicate.
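To give a rough idea of the parsing step in a scraping project, here is a minimal sketch that pulls review text out of an HTML snippet using only Python's standard library. The markup, the `review-text` class name, and the `ReviewParser` helper are all hypothetical; a real review page will use different markup, and many projects use libraries such as requests and BeautifulSoup for this instead.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; skipped so nesting depth stays correct.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ReviewParser(HTMLParser):
    """Collects the text inside every element carrying a given CSS class."""

    def __init__(self, target_class="review-text"):
        super().__init__()
        self.target_class = target_class  # hypothetical class name
        self.depth = 0                    # > 0 while inside a matching element
        self.reviews = []
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1               # nested tag inside a review
        elif self.target_class in classes:
            self.depth = 1                # entering a review element
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:               # leaving the review element
            text = " ".join("".join(self._buffer).split())
            self.reviews.append(text)

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


# Made-up sample page standing in for a downloaded review listing.
SAMPLE_HTML = """
<div class="review">
  <span class="review-text">Great book, <b>loved</b> the ending.</span>
</div>
<div class="review">
  <span class="review-text">Too long for my taste.</span>
</div>
"""

parser = ReviewParser()
parser.feed(SAMPLE_HTML)
reviews = parser.reviews
for review in reviews:
    print(review)
```

Once the reviews are in a plain list like this, they can be written to a CSV file or fed into whatever analysis the project is about.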

Tutorial and Code: Web Scraping Book Reviews on Amazon

 

5. Customer Churn Prediction

 
Customer churn is a phenomenon in which users stop using a company’s services. This might be due to dissatisfaction, affordability, or poor customer service.

A customer churn prediction model allows companies to predict which customers are most likely to leave. Organizations can then put measures in place to prevent these customers from leaving. Customer churn prediction is one of the most popular data science use cases. It is frequently used in product companies (think Netflix, Spotify, and Uber), and will look great on your resume since it adds real business value.
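As a rough sketch of what such a model does, the snippet below fits a tiny logistic regression by gradient descent in plain Python and scores two new customers. The customer rows, the two features (tenure in years and recent support tickets), and the function names are all made up for illustration; a real project would typically use a library such as scikit-learn and actual customer data.

```python
import math

# Made-up toy data: (tenure_years, support_tickets) -> 1 if churned, 0 if stayed.
customers = [
    ((0.1, 4), 1), ((0.2, 5), 1), ((0.3, 3), 1), ((0.5, 4), 1),
    ((2.0, 0), 0), ((2.5, 1), 0), ((1.5, 1), 0), ((3.0, 0), 0),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(rows, lr=0.1, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y        # gradient of log loss w.r.t. z
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def churn_probability(x, weights, bias):
    """Predicted probability that a customer with features x churns."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


weights, bias = train(customers)

# Two hypothetical customers to score.
new_and_unhappy = churn_probability((0.2, 4), weights, bias)
long_time_and_quiet = churn_probability((2.5, 0), weights, bias)
print(f"new, many tickets: {new_and_unhappy:.2f}")
print(f"long tenure, no tickets: {long_time_and_quiet:.2f}")
```

The useful output here is the ranking: customers with the highest predicted probability are the ones a retention team would contact first.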

Tutorial and Code: Customer Churn in Python

 

6. Data Science Job Dashboard

 
This data science job dashboard is something I found online, and is the passion project of a fellow data scientist.

 

Data Science Job Prediction Dashboard
Image by author

 

Looking at the above dashboard, you can easily tell how many open data science positions there are, along with the top skills required in each role.

This is an end-to-end project that solves a real problem that job-seekers have. It answers pressing questions such as “What skills do I need to learn to become a data scientist?” or “Should I learn R or Python?”

I’d recommend going through this project’s GitHub repository and possibly creating something similar.

Source Code: Job Prediction Dashboard

 

7. Uber Fare Prediction

 
I’ve always been a proponent of building projects that solve business problems. And what better way to achieve this than with a dataset from a large tech company?

The Uber Fares dataset has data on over a hundred thousand bookings on the platform. Your task is to predict the fare of each trip using variables like pickup and dropoff location.

I recommend going through the code notebooks for this dataset on Kaggle, as they display unique ways to approach this problem through various algorithms.
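One common first step with pickup and dropoff coordinates is to turn them into a trip distance, since distance tends to be a strong predictor of fare. The sketch below computes the great-circle (haversine) distance in plain Python. The field names and the sample trip are assumptions for illustration, so check them against the actual dataset before reusing this.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical trip record; the field names are assumed, not taken from the dataset.
trip = {
    "pickup_latitude": 40.7580,
    "pickup_longitude": -73.9855,
    "dropoff_latitude": 40.7484,
    "dropoff_longitude": -73.9857,
}

# Derived feature that a fare model could use alongside time of day, etc.
trip["distance_km"] = haversine_km(
    trip["pickup_latitude"], trip["pickup_longitude"],
    trip["dropoff_latitude"], trip["dropoff_longitude"],
)
print(f"{trip['distance_km']:.2f} km")
```

Haversine gives straight-line distance, not driving distance, so it is only an approximation of the route a car would take.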

Dataset: Uber Fares Prediction Dataset
Code: Uber Fares Prediction in PyTorch

 

Next Steps

 
All seven project ideas listed above are distinctive. They aren’t the standard, run-of-the-mill Kaggle projects found on every applicant’s resume. Creating a project like the ones recommended above will help you stand out and improve your chances of getting a data science job.

If you’d like more ideas on building data projects to land a job, I have an in-depth video guide on portfolio projects that you might find helpful.

Also, after creating around 2-3 projects, I recommend building a portfolio website to showcase them. This is the portfolio website that helped me land my first data science internship.

Natassha Selvaraj is a self-taught data scientist with a passion for writing. Natassha writes on everything data science-related, a true master of all data topics. You can connect with her on LinkedIn or check out her YouTube channel.
