The Machine Learning Process A-Z

Name: The Machine Learning Process A-Z Course
Price: 36 USD

with Ken Jee and Jeff Li

4.8/5

(999)

Master the complete machine learning lifecycle: from problem definition to model deployment in production

6 hours of content, 15,464 students

What you get:

6 hours of content
27 Downloadable resources
World-class instructor
Closed captions
Q&A support
Future course updates
Course exam
Certificate of achievement

$99.00

Lifetime access

What You Learn

Acquire the real-world skills needed to turn machine learning into actual results
Master each stage of the machine learning lifecycle – from data collection and preprocessing to model deployment
Gain hands-on data preprocessing experience to prepare your data for machine learning
Improve your ML model’s results through advanced feature engineering techniques
Be able to independently manage and execute a complete ML project from start to finish
Differentiate your data scientist profile by learning invaluable practical skills only experienced data scientists can teach

Top Choice of Leading Companies Worldwide

Industry leaders and professionals globally rely on this top-rated course to enhance their skills.

Course Description

Data science education focuses too much on the algorithm itself. In reality, the same four lines of code can be applied to a variety of problems. The heavy lift of an ML model is the end-to-end process. Jeff Li and Ken Jee walk you step-by-step through this process, so you can successfully take your next project from start to finish. You will learn everything you need to know to set your projects up for success.

The Machine Learning Process A-Z course gives you a deep understanding of what machine learning really is. It helps you understand when you should and shouldn’t use this powerful tool. Jeff and Ken break down the specifics of the different problems you can encounter and how machine learning is used in specific domains.

In the second part of the course, you will learn the entire modeling process. Jeff and Ken show you how to pull real results and make the ML model work for others, not just yourself. You will learn how to perform essential steps like data preprocessing, including how to deal with null values and outliers. Next, you’ll see how to explore your data to frame your analysis, along with some of the visualization techniques that can help you see the relationships in your data. After that, the course covers feature engineering, one of the most important steps for improving your model’s results. That leads to cross-validation and how to handle the bias-variance trade-off in your analysis. Finally, the instructors touch briefly on the model tuning process and on how to productionize and document your work.

Learn for Free

1.1 Introduction

3 min

1.2 ML Process Course - GitHub repository

1 min

1.3 Meet your instructors

1 min

1.4 How to use this course

1 min

1.5 Additional resources

1 min

1.6 Environment setup

1 min

Curriculum

1. Course Introduction

12 Lessons 21 Min

Machine learning has multiple benefits. The first is consistency: an algorithm can evaluate hundreds of variables, find indiscernible relationships, and make the same prediction every time. Humans, on the other hand, can become overwhelmed by the amount of information and make erratic predictions. When dealing with high volumes of data, a human may come to different conclusions about predicted outcomes from the same dataset without realizing it.

Introduction
3 min
ML Process Course - GitHub repository
1 min
Meet your instructors
1 min
How to use this course
1 min
Additional resources
1 min
Environment setup
1 min
Setting up Colab notebooks
1 min
Setting up notebooks locally
2 min
Setting up your flashcards - Explainer video
7 min
Setting up your flashcards - Link
1 min
Why Machine Learning
1 min
Why learn the ML process?
1 min
2. Intro to Machine Learning

10 Lessons 15 Min

Learn when you should and shouldn’t use machine learning. We teach you when in your organization’s lifecycle to implement ML and what real-life problems you can solve. We also cover the types of ML problems and show you the difference between product analytics and data products.

Real-world ML examples
1 min
Applying ML
1 min
When is ML needed?
2 min
When do we use ML?
2 min
When to avoid using ML?
1 min
Real-life ML process
3 min
Supervised learning vs. Unsupervised learning
2 min
Regression vs. Classification
1 min
ML for Product analytics
1 min
ML for Data products
1 min
3. The Modeling Process

7 Lessons 8 Min

The most important step in machine learning is not any fancy technique—it’s properly defining the problem. An incorrect definition can result in months—or even years—of wasted resources. That is why, before we can leverage the power of ML, we need a reason to use it. Whether it’s a directive from a manager, a pain point from a stakeholder, or a gap in the business, you need to properly frame the problem first and foremost.

Problem framing
1 min
Understanding the problem
1 min
Defining the problem
1 min
Business impact
2 min
Mapping out solutions
1 min
Understanding the data
1 min
Checklist
1 min
4. Data Collection

6 Lessons 8 Min

Most projects won’t give you the data in a neat and tidy .csv file like they do in class or on Kaggle. If you work at a big company, it is likely that you will have plenty of data readily available to you. However, you’ll still need to pull it yourself. In this section, we will talk about data collection techniques from most structured to least structured.

Data collection overview
1 min
Databases
2 min
APIs
2 min
Online data capture
1 min
Web scraping
1 min
Survey
1 min
5. Data Preprocessing

13 Lessons 35 Min

After defining our problem, we need to preprocess our data. Keep in mind that different models have different requirements. For example, linear regression cannot handle null values, whereas some tree-based models can. Outliers can also significantly skew the results of some models but not others: linear models are sensitive to outliers, whereas tree-based models are much less so. In that case, we may want to remove or adjust outliers to get the best performance from our models. After all, we want our data to be of the highest quality before we start building models.
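
As a rough illustration of the two preprocessing steps named above, the sketch below fills missing values with the median and flags outliers with the common 1.5 × IQR rule. It is a minimal sketch using only the Python standard library, not the course's own notebook code; the function names and the toy `ages` column are assumptions made for this example.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outlier_flags(values, k=1.5):
    """Flag points that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical toy column with one missing value and one extreme value.
ages = [23, 25, None, 24, 26, 22, 95]
filled = impute_median(ages)      # the None becomes 24.5
flags = iqr_outlier_flags(filled)  # only the value 95 is flagged
```

Whether a flagged point should be removed, capped, or kept depends on its cause and on how sensitive the chosen model is to outliers.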

Intro
2 min
Dealing with null values
1 min
Types of null values
4 min
Missing values - Coding examples
1 min
Missing values - Coding portion
7 min
Strategies for handling null values
4 min
Dealing with outliers
1 min
Outliers - Coding portion
8 min
Implications of outliers
1 min
Detecting outliers
1 min
Causes of outliers
1 min
Treating outliers
3 min
Outliers - Coding examples
1 min
6. Exploratory Data Analysis for ML

15 Lessons 52 Min

Exploratory data analysis is an important prerequisite to the modeling process. With EDA, we check which modeling assumptions our data meets so we can choose a suitable model and get more accurate results. The reason we perform this step is to understand the shape of our data and the relationships between the variables.

Overview
1 min
Why you need EDA
2 min
How to approach EDA
1 min
Distributions and single variable plots
3 min
Relationships and multi variable plots
1 min
Scatterplots
1 min
Correlation
1 min
Correlation matrix
1 min
Bar charts
1 min
Line charts
1 min
Pivot table
1 min
How to generate useful insights
1 min
Basic EDA - Coding examples
1 min
Basic EDA - Coding portion (1 of 2)
10 min
Basic EDA - Coding portion (2 of 2)
26 min
7. Feature Engineering

27 Lessons 44 Min

The feature engineering step will likely yield the highest accuracy gains in our modeling process, even more than the selection of the algorithm itself. The core principle is to find variables that have predictive power. Regardless of how complex a feature is or how fancy the technique behind it, if it does not add predictive power, it does nothing for our model. Remember: complexity doesn’t create value. Creative and domain-specific features create value.
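
To make a few of the techniques in this section concrete, here is a minimal standard-library sketch of one-hot encoding, frequency encoding, and MinMax scaling. The helper names and toy columns are assumptions for illustration only; in practice a library such as pandas or scikit-learn would usually do this work.

```python
from collections import Counter


def one_hot(values):
    """Return (categories, rows) with one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy columns.
colors = ["red", "blue", "red", "green"]
prices = [10.0, 20.0, 15.0, 30.0]

categories, encoded = one_hot(colors)  # columns: blue, green, red
freqs = frequency_encode(colors)       # [0.5, 0.25, 0.5, 0.25]
scaled = min_max_scale(prices)         # [0.0, 0.5, 0.25, 1.0]
```

None of these transformations adds value by itself; each is only worth keeping if the resulting feature improves validation performance.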

Intro
1 min
Categorical features
1 min
Categorical features - Coding examples
1 min
Categorical features - Coding portion
10 min
One-hot encoding
1 min
Ordinal encoding
2 min
Frequency encoding
1 min
Target encoding
2 min
Probability ratio encoding
2 min
Weight of evidence encoding
2 min
Binning
1 min
Feature engineering for continuous variables
1 min
Scaling
1 min
Normalization
1 min
MinMax scaling
1 min
Z-score normalization
1 min
Robust scaler
1 min
Types of transformations
2 min
Logarithmic transformations
2 min
Exponential transformations
1 min
Square root transformations
1 min
Box-Cox transformations
1 min
Arithmetic interactions
1 min
Binning
1 min
Creative features
1 min
Continuous feature scaling - Coding examples
1 min
Continuous feature scaling - Coding portion
3 min
8. Cross Validation

12 Lessons 33 Min

Cross-validation allows us to test our model on data that the model has not been trained on. If it performs well on this unseen data, we can be more confident that our predictions will generalize well.
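
The idea can be sketched with a small K-fold splitter: the data is cut into k folds, and each fold serves once as the unseen test set while the rest is used for training. This is a minimal illustration under simplifying assumptions (no shuffling, indices only); the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    # A model would be fit on train_idx and scored on test_idx here.
    print(len(train_idx), "train /", len(test_idx), "test")
```

Averaging the k test scores gives a more stable estimate than a single train-test split. For time series, the splits would need to respect time order rather than use arbitrary folds.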

Cross validation
1 min
Train-test-split
2 min
Overfitting and underfitting
2 min
Bias/Variance tradeoff
1 min
Bias
1 min
Variance
2 min
K-fold cross validation
2 min
Leave one out cross validation
1 min
Time series cross validation
1 min
Monte carlo cross validation
1 min
Cross validation - Coding portion
18 min
Cross validation - Coding examples
1 min
9. Feature Selection

6 Lessons 18 Min

Feature selection is the process of reducing the model’s complexity by selecting only the highest-signal features in the dataset. Typically, there are three classes of feature selection techniques. Wrapper methods search through subsets of features and measure model performance for each one. They tend to give the best accuracy but are the most computationally expensive of the three. Next, we have filter methods, which score features using univariate statistics. These tend to be faster and easier to check since they don’t rely on cross-validation. Lastly, embedded methods are baked into the ML algorithm. We cover these in the companion course, Machine Learning Algorithms.
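
As one possible example of a filter method, the sketch below scores each feature by the absolute Pearson correlation with the target and keeps the top k. The choice of Pearson correlation as the univariate statistic, the function names, and the toy data are all assumptions for illustration; other statistics (chi-squared, mutual information) follow the same pattern.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def top_k_features(features, target, k):
    """Rank features by |correlation with target| and keep the top k names."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: "size" tracks the target, "noise" only weakly.
features = {
    "size": [1, 2, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "constant": [7, 7, 7, 7, 7],
}
target = [10, 20, 30, 40, 50]
selected = top_k_features(features, target, k=2)
```

Because each feature is scored independently, this approach is fast but cannot detect features that are only useful in combination.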

Feature selection
2 min
Feature selection - Coding examples
1 min
Feature selection - Coding portion
11 min
Wrapper methods
2 min
Filter methods
1 min
Embedded methods
1 min
10. Dealing with Imbalanced Data

9 Lessons 26 Min

How do we deal with imbalance in our data? This is where oversampling, undersampling, and other techniques become useful. We show you how to perform these to optimize your model’s performance.
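
Random oversampling, the simplest of these techniques, can be sketched as follows: minority-class rows are duplicated at random until every class is as large as the biggest one. This is a minimal standard-library illustration; the function name, the `seed` parameter, and the toy dataset are assumptions, and SMOTE-style methods generate synthetic rows instead of copies.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target_size = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target_size - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy dataset: six negatives, two positives.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Resampling should be applied to the training split only; resampling before the split leaks duplicated rows into the test data and inflates the scores.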

Intro
2 min
Random undersampling
1 min
Random oversampling
1 min
Synthetic minority oversampling
1 min
Borderline SMOTE
1 min
Safe-level SMOTE
1 min
Adaptive synthetic oversampling
2 min
Dealing with imbalanced data - Coding examples
1 min
Dealing with imbalanced data - Coding portion
16 min
11. Modeling

7 Lessons 47 Min

In this section of the course, we’ll focus on ML modeling techniques such as the baseline model, model selection, hyperparameter tuning, and ensembling.
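
A baseline gives every later model a score to beat. One common choice for classification, shown here as an assumption rather than the course's own baseline, is to always predict the most frequent training label. The names and toy labels below are hypothetical.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda rows: [majority for _ in rows]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels.
train_labels = [0, 0, 0, 1, 0, 1]
test_rows = [[1.0], [2.0], [3.0], [4.0]]
test_labels = [0, 1, 0, 0]

predict = majority_class_baseline(train_labels)
baseline_accuracy = accuracy(test_labels, predict(test_rows))  # 0.75
```

If a tuned model cannot clearly outperform this number, the extra complexity is probably not paying for itself.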

Modeling
1 min
Creating a model baseline
1 min
Model selection
2 min
Parameter tuning
6 min
Ensemble models
3 min
The ML modelling process basics - Coding examples
1 min
The ML modelling process basics - Coding portion
33 min
12. Model evaluation

17 Lessons 53 Min

We can’t measure the value of our model without proper evaluation criteria. Earlier in this course, we touched on two types of supervised ML problems: regression and classification. Since these problems use different types of target variables, we use different metrics to evaluate the quality of the model.
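
The contrast between the two families of metrics can be seen in a short sketch: precision, recall, and F1 are computed from counts of correct and incorrect class predictions, while MAE and RMSE are computed from numeric errors. The function names and toy predictions are assumptions for illustration.

```python
import math


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 score for a single positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mae_rmse(y_true, y_pred):
    """Mean absolute error and root mean squared error."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Hypothetical toy predictions: 2 true positives, 1 false positive, 1 false negative.
precision, recall, f1 = precision_recall_f1([1, 0, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0])

# Hypothetical regression errors of 1, 0, -2 and 0.
mae, rmse = mae_rmse([3.0, 5.0, 2.0, 7.0], [2.0, 5.0, 4.0, 7.0])
```

RMSE squares each error before averaging, so the single error of 2 weighs more heavily in RMSE than in MAE.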

Model evaluation
1 min
Classification evaluation
2 min
Classification metrics - Coding examples
1 min
Precision and Recall
4 min
Accuracy, Precision, Recall - Coding portion
10 min
ROC curves
6 min
ROC AUC
4 min
F1-Score
2 min
PR-AUC
2 min
Classification metrics - Coding portion
4 min
Log loss
2 min
Regression
1 min
Regression metrics - Coding examples
1 min
R-squared / Adjusted R-squared
3 min
Mean absolute error
1 min
Root mean squared error
2 min
Regression notebook - Coding portion
7 min
13. Productionization

3 Lessons 7 Min

The goal of our work is to make it useful to other people. How do we do that? Through something called productionization. It is a little broader than simply deploying the model into your product; rather it is the delivery of our model to our end user in whatever form.
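
Whatever form the delivery takes, it usually starts with saving the trained model so another process can load it. The sketch below uses Python's built-in `pickle` module as one possible approach; the course may use a different tool. The "model" here is a hypothetical stand-in (a dictionary of fitted line parameters), and the file name `model.pkl` is an assumption.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical "trained model": parameters of a fitted line y = slope * x + intercept.
model = {"slope": 2.0, "intercept": 1.0}


def predict(params, xs):
    """Apply the stored line parameters to new inputs."""
    return [params["slope"] * x + params["intercept"] for x in xs]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"  # hypothetical file name
    with path.open("wb") as f:
        pickle.dump(model, f)       # save the model to disk
    with path.open("rb") as f:
        restored = pickle.load(f)   # load it back, as a serving process would

restored_predictions = predict(restored, [0.0, 1.0, 2.0])  # [1.0, 3.0, 5.0]
```

Pickle files can execute arbitrary code when loaded, so only unpickle files from a trusted source.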

Saving your models
2 min
Types of production outputs
4 min
Model maintenance
1 min
14. Conclusion

1 Lesson 6 Min

As the course goes through different steps of the modeling process, you’re probably wondering, “How am I supposed to remember each step?” The good news is that building an ML model typically follows a very similar process each time: (1) Problem Framing/Business Understanding; (2) Data Cleaning/Preparation; (3) Exploratory Data Analysis; (4) Feature Engineering; (5) Cross-Validation; (6) Modeling; (7) Evaluation

Conclusion
6 min

Topics

Machine Learning
Model Evaluation
Data Preprocessing
Machine Learning Process
Data Modeling
Dealing with Imbalanced Data
Cross Validation
Feature Engineering
Exploratory Data Analysis
Machine and Deep Learning

Course Requirements

You need to complete an introduction to Python before taking this course
Basic skills in statistics, probability, and linear algebra are required
It is highly recommended to take the Machine Learning in Python course first
You will need to install the Anaconda package, which includes Jupyter Notebook

Who Should Take This Course?

Level of difficulty: Advanced

Aspiring data scientists and ML engineers
Existing data scientists and ML engineers who want to boost their skills and learn from world-class experts

Exams and Certification

A 365 Data Science Course Certificate is an excellent addition to your LinkedIn profile—demonstrating your expertise and willingness to go the extra mile to accomplish your goals.

Meet Your Instructor

Ken Jee

Content Creator on

4 Courses

2659 Reviews

40803 Students

Ken has held data science positions in companies of all sizes - from startups to Fortune 100 organizations. He is a Senior Data Scientist who is very passionate about content creation. Thanks to his friendly delivery style and willingness to share knowledge, Ken Jee is the perfect role model for anyone who wants to start a career in data science. We strongly recommend watching some of Ken’s most popular YouTube videos, such as “How I would Learn Data Science (If I Had to Start Over)”. If you are into learning and sharing your progress with others, you can also check out Ken’s #66DaysOfData hashtag on LinkedIn.