Top 5 Career Paths in Data Science and How to Self-Learn for Each



Image by Author | Canva

 

Data science offers almost infinite possibilities for career development. Mind you, these are possibilities. Data science is also an infamously tough field to break into. If you don’t have a degree in computer science, statistics, or a similar field, it gets even tougher.

I’ll try to make it easier for you with this article. I will mention the most important topics and suggest several self-learning resources for data science career aspirants. However, don’t limit yourself. You can find many more YouTube tutorials, books, articles, and courses. Adapt the approach to your preferences, time, and money, but always keep in mind the main skills each role requires.

 

1. Data Analyst

 
In this role, you will analyze data, come up with insights, and help your employer make informed business decisions. You will mostly deal with cleaning, analyzing, and visually presenting data.

You will commonly use Excel, SQL, Python, and business intelligence tools.

 
Learning and career path for data analysts

 

How to Self-Learn

  1. Excel & BI Tools: Data analysts often work in Excel (or Google Sheets) with pivot tables, VLOOKUP, XLOOKUP, INDEX-MATCH, functions for data cleaning and aggregation, Power Query, and macros for automation.
    In addition, learn data validation, conditional formatting, and creating charts. Use Excel Practice Online as a learning resource.
    For BI tools, use Tableau and Power BI. Especially focus on DAX functions in Power BI and advanced dashboarding in Tableau. Learn these tools on Tableau Learning and Microsoft Learn for Power BI.
  2. SQL: Learn to query databases using JOINs, data aggregation, filtering, subqueries, CTEs, and window functions. Practice on platforms such as SQLBolt, Mode Analytics, LearnSQL.com, and StrataScratch.
  3. Python: Focus on pandas and NumPy for data cleaning, manipulation, and calculation. Also, learn Matplotlib or seaborn for data visualization. Additionally, become proficient in exploratory data analysis (EDA) techniques and statistical analysis (SciPy).
    You can find courses on DataCamp or Kaggle, and analytical and visualization interview questions on StrataScratch. I also recommend the Python for Data Analysis book.
    While machine learning is usually outside the scope of data analysis, understanding the basics of ML models is always beneficial. Scikit-learn is the go-to tool here, and its documentation is a great resource for learning.
  4. Projects: Analyse datasets from Kaggle, Google Dataset Search, and Data.gov or solve actual take-home assignments on StrataScratch.
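To make the SQL item above a bit more concrete, here is a minimal sketch that combines a CTE with a window function, run through Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are hypothetical and made up purely for illustration. I'm also assuming your Python ships with SQLite 3.25 or newer, which window functions require.

```python
import sqlite3

def top_order_per_customer(rows):
    """Return each customer's largest order using a CTE and ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer TEXT, order_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    query = """
        WITH ranked AS (
            SELECT customer, order_id, amount,
                   ROW_NUMBER() OVER (
                       PARTITION BY customer
                       ORDER BY amount DESC, order_id
                   ) AS rn
            FROM orders
        )
        SELECT customer, order_id, amount
        FROM ranked
        WHERE rn = 1
        ORDER BY customer
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical sample data: (customer, order_id, amount)
sample_orders = [
    ("alice", 1, 20.0),
    ("alice", 2, 35.5),
    ("bob", 3, 12.0),
    ("bob", 4, 8.0),
]

top_orders = top_order_per_customer(sample_orders)
```

The same "rank within a group, then keep the first row" pattern shows up often in analytics interview questions, so it is worth practicing on real datasets once the syntax feels familiar.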

 

Career Path

After starting as a data analyst, you can become a senior data analyst, an analytics manager, or a data scientist.

 

2. Machine Learning Engineer

 
ML engineers build, deploy, and optimize ML models. To do that, they apply ML algorithms and use deep learning frameworks and cloud-based ML tools. They also focus on data preprocessing, feature engineering, model evaluation, and deployment strategies, e.g., containerisation with Docker and orchestration with Kubernetes.

 
Learning and career path for machine learning engineers

 

How to Self-Learn

  1. Python & ML Libraries: Master scikit-learn, TensorFlow, and PyTorch through courses such as Machine Learning Specialization, HarvardX: Data Science: Machine Learning, and PyTorch for Deep Learning Bootcamp: Zero to Mastery.
    Additionally, learn Hugging Face Transformers for NLP applications and experiment with reinforcement learning frameworks such as Stable-Baselines3.
  2. Mathematics: Expand your linear algebra, probability, and statistics knowledge with Khan Academy or books such as Pattern Recognition and Machine Learning or Mathematics for Machine Learning.
    Important topics also include gradient descent, backpropagation, and convex optimization (read Convex Optimization).
  3. Model Deployment: Learn tools like Flask, FastAPI, AWS, Google Cloud, and Azure. Don’t forget MLOps tools such as MLflow and Kubeflow, as well as model monitoring techniques.
  4. Projects: Implement classification, regression, and DL projects on StrataScratch or using datasets from resources linked earlier.
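Gradient descent, mentioned in the mathematics item above, is easier to grasp once you have written it by hand. Below is a minimal sketch in plain Python that fits a straight line by batch gradient descent on mean squared error. The function name `fit_line`, the learning rate, the epoch count, and the toy data are all my own illustrative choices, not a recommended production setup.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors for the current parameters
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (hypothetical, noise-free)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
```

On this toy data the fitted `w` and `b` should land very close to 2 and 1. Libraries like scikit-learn, TensorFlow, and PyTorch handle this for you, but writing the loop once makes their optimizers far less mysterious.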

 

Career Path

Start as a machine learning engineer, and advance to senior ML engineer, ML architect, or AI specialist. With further expertise, move into AI research, technical leadership, or consulting roles.

 

3. Data Engineer

 
Data engineers ensure data is stored, processed, and available for other users. They work with structured and unstructured data and warehousing solutions to build ETL, ELT, and real-time streaming data pipelines.

 
Data Engineer Career Path

 

How to Self-Learn

These courses will give you solid foundations:

Also, try books such as Big Book of Data Engineering, Fundamentals of Data Engineering, or Data Engineering with Python.

  1. SQL & Databases: You must be proficient with relational databases, i.e., PostgreSQL, MySQL, MS SQL Server, or Oracle. Pay attention to indexing, partitioning, and query optimization.
  2. Python & Spark: You’ll need pandas and PySpark, plus a workflow orchestration tool like Apache Airflow and an event streaming platform like Apache Kafka.
    Be familiar with database replication, distributed computing frameworks (e.g., Apache Spark, Dask, and Ray), and data lake architectures (e.g., AWS S3, Delta Lake, and Apache Iceberg).
  3. Cloud & Big Data Tools: Work with cloud computing and big data in Amazon Redshift, Google BigQuery, or Snowflake. Become familiar with infrastructure as code (IaC) with Terraform and other automation tools for cloud data engineering, e.g., Apache Airflow, AWS Lambda, Google Cloud Composer, Azure Data Factory, dbt, or Kubernetes.
  4. Projects: Work on projects that involve the above skills, e.g., From Web Scraping to Tableau, Realtime Data Streaming, SQL Data Warehouse from Scratch, Airflow Data Pipeline, or Outliers Detection.

Also, use datasets from sources linked in previous sections to create your own projects.
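If you have never built a pipeline before, it may help to see the extract, transform, and load steps of ETL in their smallest form. The sketch below uses only Python's standard library (csv and sqlite3). The `RAW_CSV` data, the `payments` table, and the function names are hypothetical, and a real pipeline would add scheduling, logging, and error handling on top, for example with Apache Airflow.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; in practice this would come from a file or an API
RAW_CSV = """user_id,country,amount
1,us,10.50
2,DE,
3,us,4.25
4,de,7.00
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with a missing amount, fix types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["user_id"]), row["country"].upper(), float(row["amount"]))
        )
    return cleaned

def load(rows, conn):
    """Load: write cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(user_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()

def run_pipeline(text):
    """Run extract -> transform -> load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn

conn = run_pipeline(RAW_CSV)
totals = conn.execute(
    "SELECT country, SUM(amount) FROM payments GROUP BY country ORDER BY country"
).fetchall()
```

Keeping the three steps as separate functions is a deliberate choice: each one can be tested on its own, and each maps naturally onto a task in an orchestration tool later on.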

 

Career Path

You typically start as a junior data engineer or software engineer. You can advance to data engineer, data architect, and cloud data engineer roles. You can also specialise in big data, real-time processing, or cloud infrastructure.

 

4. Data Scientist

 
Data scientists use statistical analysis and ML to extract insights from structured and unstructured data. They engineer features, evaluate models, perform A/B tests, and build automated decision-making systems.

 
Data Scientist Career Path

 

How to Self-Learn

  1. Programming & ML: Python is data scientists’ primary tool, along with libraries like pandas, NumPy, scikit-learn, TensorFlow, and PyTorch.
    Helpful courses are Python for Data Science, AI & Development, TensorFlow Curriculums, and Data Science: Machine Learning.
    Use books such as Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow and Data Science and Machine Learning.
  2. Statistics & Probability: Important concepts are probability distributions, hypothesis testing, Bayesian inference, and statistical significance.
    To learn these, try courses like Statistics for Data Science with Python or Probability & Statistics for Machine Learning & Data Science and books like An Introduction to Statistical Learning, The Elements of Statistical Learning or Data Science and Machine Learning.
  3. Data Visualization: Learn Python libraries such as Matplotlib, seaborn, and Plotly. Also, familiarize yourself with Tableau and Power BI.
  4. Big Data & Cloud Tools: Ensure you’re proficient in big data and cloud tools like Apache Spark, AWS, GCP, or Azure.
  5. Projects: Work on projects involving the abovementioned skills and tools. Many such projects are available on StrataScratch, ProjectPro, and GitHub. You can also use public datasets from sources linked in previous sections.
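Hypothesis testing and A/B tests, both mentioned above, come together in the two-proportion z-test. Here is a minimal sketch using only Python's standard library. The function name and the conversion counts are hypothetical, and I'm assuming samples large enough for the normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical experiment: 200 of 2,000 users convert on variant A,
# 260 of 2,000 convert on variant B
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
```

In practice you would likely reach for SciPy or statsmodels instead, but computing the statistic by hand once makes it clearer what those libraries report.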

 

Career Path

The starting point is often a data analyst job. From there, you can become a data scientist, senior data scientist, principal data scientist, or lead data scientist. That can also further lead to data science consulting or leadership roles like chief data officer.

 

5. AI Researcher

 
AI researchers work on developing new AI algorithms, and they often focus on deep learning, NLP, reinforcement learning, and generative AI. They also improve existing model architectures, training methodologies, and optimization techniques. They frequently collaborate with academic institutions, corporate AI research labs (e.g., Google DeepMind and OpenAI), and industry labs (e.g., Microsoft Research, FAIR, and IBM Research).

 
AI Researcher Career Path

 

How to Self-Learn

  1. Mathematics: Work on linear algebra, calculus, and optimization. Study convex optimization, probability theory, and statistical inference to understand advanced ML concepts. Use the resources linked in the ML Engineer section.
  2. Deep Learning: Learn DL by taking Fast.ai’s and DeepLearning.AI’s specialisations and by reading the Deep Learning or Dive Into Deep Learning books. Explore transformer architectures, generative adversarial networks (GANs), and reinforcement learning frameworks like Stable-Baselines3.
  3. Research Papers: Read and implement research papers on recent breakthroughs in deep learning, meta-learning, and self-supervised learning. Find them on arXiv, Google Research, and OpenAI.
  4. Projects: Contribute to open-source AI projects on GitHub, e.g., TensorFlow, PyTorch, Hugging Face Transformers, or Gymnasium.

 

Career Path

Start as a research assistant or junior researcher, then move to research scientist, AI researcher, or academic roles. Many AI researchers work in top tech companies and research labs.

 

Conclusion

 
No matter your path, don’t expect breaking in to be easy, and don’t fool yourself that it’ll be a walk in the park. However, with the right learning approach, a focus on the essential skills, and an investment of effort (and sometimes a little money), you improve your chances of landing on whichever of these five career paths you prefer.
 
 

Nate Rosidi is a data scientist who also works in product strategy. He’s an adjunct professor teaching analytics and the founder of StrataScratch, a platform helping data scientists prepare for their interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.


