Online Course popular
The Machine Learning Process A-Z

Master the complete machine learning lifecycle: from problem definition to model deployment in production

4.8

862 reviews on
18,002 students already enrolled
  • Institute of Analytics
  • The Association of Data Scientists
  • E-Learning Quality Network
  • European Agency for Higher Education and Accreditation
  • Global Association of Online Trainers and Examiners

Skill level: Advanced

Duration: 6 hours
  • Lessons (6 hours)

CPE credits: 7.5
CPE stands for Continuing Professional Education and represents the mandatory credits a wide range of professionals must earn to maintain their licenses and stay current with regulations and best practices. One CPE credit typically equals 50 minutes of learning. For more details, visit NASBA's official website: www.nasbaregistry.org

Accredited certificate

What you learn

  • Acquire real-world skills to deliver machine learning results.
  • Master all stages of the machine learning lifecycle.
  • Gain hands-on experience in data preprocessing for ML.
  • Improve your ML model’s results with advanced feature engineering.
  • Manage and execute complete ML projects independently.

Topics & tools

  • Machine Learning
  • Model Evaluation
  • Data Preprocessing
  • Machine Learning Process
  • Data Modeling
  • Dealing with Imbalanced Data
  • Cross Validation
  • Feature Engineering
  • Exploratory Data Analysis
  • Machine and Deep Learning
  • Python

Course OVERVIEW

Description

CPE Credits: 7.5
Field of Study: Information Technology
Delivery Method: QAS Self Study
Data science education focuses too much on the algorithm itself. In reality, the same four lines of code can be applied to a variety of problems. The heavy lifting in an ML project is the end-to-end process. Jeff Li and Ken Jee walk you step by step through this process so you can take your next project from start to finish.

The Machine Learning Process A-Z course gives you a deep understanding of what machine learning really is and helps you understand when you should and shouldn’t use this powerful tool. Jeff and Ken break down the different types of problems you can encounter and how machine learning is used in specific domains.

In the second part of the course, you will learn the entire modeling process. Jeff and Ken show you how to get real results and make the ML model work for others, not just yourself. You will learn how to perform essential steps like data preprocessing, including how to deal with null values and outliers. Next, you’ll see how to explore your data to frame your analysis, along with visualization techniques that help you see the relationships in your data. After that, the course covers feature engineering, one of the most important steps for improving your model’s results. That leads to cross-validation and how to handle the bias-variance trade-off in your analysis. Finally, the instructors touch briefly on the model tuning process and on how to productionize and document your work.

Prerequisites

  • Basic understanding of machine learning concepts.

Curriculum

145 lessons, 1 exam
  • 1. Course Introduction
    21 min
    Machine learning has multiple benefits. The first is consistency: an algorithm can evaluate hundreds of variables, find relationships too subtle for a person to spot, and make the same prediction every time. Humans, on the other hand, can become overwhelmed by the amount of information and make erratic predictions. When dealing with high volumes of data, a human may reach different conclusions from the same dataset without realizing it.
    Introduction Free
    ML Process Course - GitHub repository Free
    Meet your instructors Free
    How to use this course Free
    Additional resources Free
    Environment setup Free
    Setting up Colab notebooks Free
    Setting up notebooks locally Free
    Setting up your flashcards - Explainer video Free
    Setting up your flashcards - Link Free
    Why Machine Learning Free
    Why learn the ML process? Free
  • 2. Intro to Machine Learning
    15 min
    Learn when you should and shouldn’t use machine learning. We teach you at what point in your organization’s lifecycle to implement ML and what real-life problems you can solve. We also cover the types of ML problems and show you the difference between product analytics and data products.
    Real-world ML examples Free
    Applying ML Free
    When is ML needed? Free
    When do we use ML? Free
    When to avoid using ML? Free
    Real-life ML process Free
    Supervised learning vs. Unsupervised learning Free
    Regression vs. Classification Free
    ML for Product analytics Free
    ML for Data products Free
  • 3. The Modeling Process
    8 min
    The most important step in machine learning is not any fancy technique—it’s properly defining the problem. An incorrect definition can result in months—or even years—of wasted resources. That is why, before we can leverage the power of ML, we need a reason to use it. Whether it’s a directive from a manager, a pain point from a stakeholder, or a gap in the business, you need to properly frame the problem first and foremost.
    Problem framing
    Understanding the problem
    Defining the problem
    Business impact
    Mapping out solutions
    Understanding the data
    Checklist
  • 4. Data Collection
    8 min
    Most projects won’t give you the data in a neat and tidy .csv file like they do in class or on Kaggle. If you work at a big company, it is likely that you will have plenty of data readily available to you. However, you’ll still need to pull it yourself. In this section, we will talk about data collection techniques from most structured to least structured.
    Data collection overview
    Databases
    APIs
    Online data capture
    Web scraping
    Survey
  • 5. Data Preprocessing
    35 min
    After defining our problem, we need to preprocess our data. Keep in mind that different models have different requirements. For example, linear regression cannot handle null values, whereas some tree-based models can. Outliers can also significantly skew the results of some models but not others: linear models are sensitive to outliers, whereas tree-based models are much less so. Depending on the model, we may therefore want to remove or adjust outliers to get the best performance. After all, we want our data to be of the highest quality before we start building our models.
    Intro
    Dealing with null values
    Types of null values
    Missing values - Coding examples
    Missing values - Coding portion
    Strategies for handling null values
    Dealing with outliers
    Outliers - Coding portion
    Implications of outliers
    Detecting outliers
    Causes of outliers
    Treating outliers
    Outliers - Coding examples
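To make the null-value and outlier ideas in this section concrete, here is a minimal sketch that uses only Python’s standard library. The sample `ages` list, the median fill, and the 1.5 × IQR rule are our own illustrative assumptions, not material taken from the course notebooks.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical column with two missing entries and one suspicious value.
ages = [22, 25, None, 27, 24, 26, 95, None, 23]

clean = impute_median(ages)      # None -> 25 (median of the observed ages)
outliers = iqr_outliers(clean)   # values far outside the interquartile range

print(clean)
print(outliers)                  # [95]
```

Whether a flagged value should be removed, capped, or kept depends on what caused it, so a rule like this is best treated as a way to find candidates for review.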
  • 6. Exploratory Data Analysis for ML
    52 min
    Exploratory data analysis is an important prerequisite to the modeling process. With EDA, we get a feel for the data and check the assumptions behind candidate models, so we can choose the best model and get the most accurate results. We perform this step to understand the shape of our data and the relationships between the variables.
    Overview
    Why you need EDA
    How to approach EDA
    Distributions and single variable plots
    Relationships and multi variable plots
    Scatterplots
    Correlation
    Correlation matrix
    Bar charts
    Line charts
    Pivot table
    How to generate useful insights
    Basic EDA - Coding examples
    Basic EDA - Coding portion (1 of 2)
    Basic EDA - Coding portion (2 of 2)
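As a small illustration of the correlation and correlation-matrix lessons listed above, the sketch below computes Pearson correlations by hand. The column names and numbers are made up for the example, and in practice a data-frame library would usually do this in one call.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict holding the correlation of every column with every other."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical dataset: three numeric columns of equal length.
data = {
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 61, 70, 74],
    "hours_slept": [9, 8, 8, 6, 5],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Values close to 1 or -1 point to strong linear relationships worth plotting; values near 0 only rule out a linear relationship, not every relationship.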
  • 7. Feature Engineering
    44 min
    The feature engineering step will likely yield the highest accuracy gains in our modeling process, even more than the choice of algorithm itself. The core principle is to find variables that have predictive power. No matter how complex a feature is or how fancy the technique behind it, if it does not add predictive power, it does nothing for our model. Remember: complexity doesn’t create value. Creative and domain-specific features create value.
    Intro
    Categorical features
    Categorical features - Coding examples
    Categorical features - Coding portion
    One-hot encoding
    Ordinal encoding
    Frequency encoding
    Target encoding
    Probability ratio encoding
    Weight of evidence encoding
    Binning
    Feature engineering for continuous variables
    Scaling
    Normalization
    MinMax scaling
    Z-score normalization
    Robust scaler
    Types of transformations
    Logarithmic transformations
    Exponential transformations
    Square root transformations
    Box-Cox transformations
    Arithmetic interactions
    Binning
    Creative features
    Continuous feature scaling - Coding examples
    Continuous feature scaling - Coding portion
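The sketch below shows three of the techniques named in this section: one-hot encoding, MinMax scaling, and z-score normalization. It is a dependency-free illustration with made-up `colors` and `incomes` values; the function names are our own and are not the course’s code.

```python
from statistics import mean, pstdev


def one_hot(values):
    """Encode each category as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on 0 and scale them to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical categorical and continuous features.
colors = ["red", "green", "blue", "green"]
incomes = [20, 40, 60, 80, 100]

encoded, categories = one_hot(colors)
scaled = min_max_scale(incomes)
standardized = z_score(incomes)

print(categories)   # ['blue', 'green', 'red']
print(encoded)      # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
print(scaled)       # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In a real project the minimum, maximum, mean, and standard deviation should be computed on the training data only and then reused on validation and test data.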
  • 8. Cross Validation
    33 min
    Cross-validation allows us to test our model on data that the model has not been trained on. If it performs well on this unseen data, we can be more confident that our predictions will generalize well.
    Cross validation
    Train-test-split
    Overfitting and underfitting
    Bias/Variance tradeoff
    Bias
    Variance
    K-fold cross validation
    Leave one out cross validation
    Time series cross validation
    Monte carlo cross validation
    Cross validation - Coding portion
    Cross validation - Coding examples
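To show what k-fold cross-validation does mechanically, here is a minimal index splitter written with the standard library. The function name `k_fold_indices` and the fixed `seed` are our own assumptions for the example; ML libraries provide equivalent utilities.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]     # k roughly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print("train:", sorted(train_idx), "test:", sorted(test_idx))
```

Each sample lands in the test fold exactly once, so every score is measured on data the model did not see during that round of training. The shuffle is not appropriate for time series, where the order of observations has to be respected.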
  • 9. Feature Selection
    18 min
    Feature selection is the process of reducing the model’s complexity by keeping only the highest-signal features in the dataset. Typically, there are three classes of feature selection techniques. Wrapper methods search over subsets of features, training the model on each subset and measuring its performance. They often give the best accuracy but are the most computationally expensive of the three. Filter methods score features using univariate statistics. These tend to be faster and easier to check since they don’t rely on cross-validation. Lastly, embedded methods are built into the ML algorithm itself. We cover these in the companion course, Machine Learning Algorithms.
    Feature selection
    Feature selection - Coding examples
    Feature selection - Coding portion
    Wrapper methods
    Filter methods
    Embedded methods
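As one simple example of a filter method, the sketch below drops features whose variance falls below a threshold. The feature names, values, and threshold are hypothetical, and variance is only one of several univariate statistics that could be used here.

```python
from statistics import pvariance


def variance_filter(features, threshold=0.0):
    """Keep the names of features whose population variance exceeds the threshold."""
    return [name for name, column in features.items() if pvariance(column) > threshold]


# Hypothetical feature table: one informative column, one constant, one nearly constant.
features = {
    "age": [23, 35, 41, 29, 52],
    "country_code": [1, 1, 1, 1, 1],
    "visits": [3, 3, 4, 3, 3],
}

print(variance_filter(features))                  # ['age', 'visits']
kept = variance_filter(features, threshold=0.2)
print(kept)                                       # ['age']
```

No model is trained at any point, which is what makes filter methods cheap. Variance depends on each feature’s scale, so a single threshold is only meaningful when the features are on comparable scales.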
  • 10. Dealing with Imbalanced Data
    26 min
    How do we deal with imbalance in our data? This is where oversampling, undersampling, and other techniques become useful. We show you how to perform these to optimize your model’s performance.
    Intro
    Random undersampling
    Random oversampling
    Synthetic minority oversampling
    Borderline SMOTE
    Safe-level SMOTE
    Adaptive synthetic oversampling
    Dealing with imbalanced data - Coding examples
    Dealing with imbalanced data - Coding portion
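Random oversampling, the simplest technique in this section, can be sketched in a few lines. The toy `rows` and `labels` below and the helper name `random_oversample` are assumptions made for illustration; dedicated libraries implement this and the SMOTE variants listed above.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_rows.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_rows, out_labels


# Hypothetical training set: five negatives, one positive.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
labels = [0, 0, 0, 0, 0, 1]

new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))   # Counter({0: 5, 1: 5})
```

Resampling should be applied to the training split only. If duplicated rows leak into the validation or test data, the evaluation scores will look better than they really are.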
  • 11. Modeling
    47 min
    In this section of the course, we’ll focus on ML modeling techniques such as the baseline model, model selection, hyperparameter tuning, and ensembling.
    Modeling
    Creating a model baseline
    Model selection
    Parameter tuning
    Ensemble models
    The ML modelling process basics - Coding examples
    The ML modelling process basics - Coding portion
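A model baseline gives every later model a number to beat. The sketch below uses a majority-class baseline on made-up churn labels; the data and function names are hypothetical, and the course may define its baseline differently.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a predict function that always outputs the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda rows: [majority for _ in rows]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels; the baseline ignores the feature rows entirely.
train_labels = ["stay", "stay", "churn", "stay", "stay"]
test_rows = [[1], [2], [3], [4]]
test_labels = ["stay", "churn", "stay", "stay"]

predict = majority_baseline(train_labels)
baseline_accuracy = accuracy(test_labels, predict(test_rows))
print(baseline_accuracy)   # 0.75
```

If a tuned model cannot clearly beat this kind of score, the extra complexity is not paying for itself.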
  • 12. Model evaluation
    53 min
    We can’t measure the value of our model without proper evaluation criteria. Earlier in this course, we touched on two types of Supervised ML problems—regression and classification. Since these problems use different target variables, we use different metrics to evaluate the quality of the model.
    Model evaluation
    Classification evaluation
    Classification metrics - Coding examples
    Precision and Recall
    Accuracy, Precision, Recall - Coding portion
    ROC curves
    ROC AUC
    F1-Score
    PR-AUC
    Classification metrics - Coding portion
    Log loss
    Regression
    Regression metrics - Coding examples
    R-squared / Adjusted R-squared
    Mean absolute error
    Root mean squared error
    Regression notebook - Coding portion
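To show how the classification and regression metrics in this section differ, here is a minimal sketch that computes precision, recall, F1, mean absolute error, and root mean squared error by hand. All labels and predictions are invented for the example.

```python
from math import sqrt


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large errors more heavily than MAE."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical classification results: 3 true positives, 1 false positive, 1 false negative.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(labels, predicted)
print(precision, recall, f1)   # 0.75 0.75 0.75

# Hypothetical regression results.
actual = [3.0, 5.0, 2.0, 7.0]
forecast = [2.0, 5.0, 4.0, 8.0]
print(mae(actual, forecast))   # 1.0
print(rmse(actual, forecast))
```

Here RMSE comes out larger than MAE because the single error of 2 is squared before averaging, which is why the two metrics can rank models differently.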
  • 13. Productionization
    7 min
    The goal of our work is to make it useful to other people. How do we do that? Through something called productionization. It is a little broader than simply deploying the model into your product; rather it is the delivery of our model to our end user in whatever form.
    Saving your models
    Types of production outputs
    Model maintenance
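Saving a model is usually the first step toward putting it in someone else’s hands. The sketch below serializes a stand-in model with Python’s `pickle` module and loads it back. The dictionary of coefficients is a hypothetical placeholder for a trained model, and the course may use a different serialization format.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical "trained model": an intercept plus one coefficient per feature.
model = {"intercept": 1.5, "coefficients": [0.4, -0.2]}


def predict(model, row):
    """Linear prediction from the stored intercept and coefficients."""
    return model["intercept"] + sum(c * x for c, x in zip(model["coefficients"], row))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    with path.open("wb") as f:
        pickle.dump(model, f)          # save the model to disk
    with path.open("rb") as f:
        restored = pickle.load(f)      # load it back, as a serving process would

# The restored model should behave exactly like the original.
print(predict(restored, [1.0, 2.0]) == predict(model, [1.0, 2.0]))   # True
```

Only unpickle files from sources you trust, because loading a pickle can execute arbitrary code.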
  • 14. Conclusion
    6 min
    As the course goes through different steps of the modeling process, you’re probably wondering, “How am I supposed to remember each step?” The good news is that building an ML model typically follows a very similar process each time: (1) Problem Framing/Business Understanding; (2) Data Cleaning/Preparation; (3) Exploratory Data Analysis; (4) Feature Engineering; (5) Cross-Validation; (6) Modeling; (7) Evaluation.
    Conclusion
  • 15. Course exam
    40 min
    Course exam

Free lessons

  • 1.1 Introduction (3 min)
  • 1.2 ML Process Course - GitHub repository (1 min)
  • 1.3 Meet your instructors (1 min)
  • 1.4 How to use this course (1 min)
  • 1.5 Additional resources (1 min)
  • 1.6 Environment setup (1 min)

9 in 10 of our graduates landed a new AI & data job after enrollment.

9 in 10 people walk away career-ready with practical data and AI skills.

94% of AI and data science graduates successfully change or advance their careers.

ACCREDITED certificates

Craft a resume and LinkedIn profile you’re proud of—featuring certificates recognized by leading global institutions.

Earn CPE-accredited credentials that showcase your dedication, growth, and essential skills—the qualities employers value most.

Certificates are included with the Self-study learning plan.

How it WORKS

Lessons

Learn through short, simple lessons—no prior experience in AI or data science needed.

Exercises

Reinforce your learning with mini recaps, hands-on coding, flashcards, fill-in-the-blank activities, and other engaging exercises.

Projects

Tackle real-world AI and data science projects—just like those faced by industry professionals every day.

Practice exams

Track your progress and solidify your knowledge with regular practice exams.

AI mock interviews

Prep for interviews with real-world tasks, popular questions, and real-time feedback.
