The Machine Learning Process A-Z

with Ken Jee and Jeff Li
4.8/5 (918)

Master the complete machine learning lifecycle: from problem definition to model deployment in production

6 hours of content, 14,004 students

What you get:

  • 6 hours of content
  • 27 Downloadable resources
  • World-class instructor
  • Closed captions
  • Q&A support
  • Future course updates
  • Course exam
  • Certificate of achievement

What You Learn

  • Acquire the real-world skills needed to turn machine learning into actual results
  • Master each stage of the machine learning lifecycle – from data collection and preprocessing to model deployment
  • Gain hands-on data preprocessing experience and learn to prepare your data for machine learning
  • Improve your ML model’s results through advanced feature engineering techniques
  • Be able to independently manage and execute a complete ML project from start to finish
  • Differentiate your data scientist profile by learning invaluable practical skills only experienced data scientists can teach

Top Choice of Leading Companies Worldwide

Industry leaders and professionals globally rely on this top-rated course to enhance their skills.

Course Description

Data science education focuses too much on the algorithm itself. In reality, the same four lines of code can be applied to a variety of problems. The heavy lift of an ML model is the end-to-end process. Jeff Li and Ken Jee walk you step-by-step through this process, so you can successfully take your next project from start to finish. You will learn everything you need to know to set your projects up for success.

The Machine Learning Process A-Z course gives you a deep understanding of what machine learning really is. It helps you understand when you should and shouldn’t use this powerful tool. Jeff and Ken break down the specifics of the different problems you can encounter and how machine learning is used in specific domains.

In the second part of the course, you will learn the entire modeling process. Jeff and Ken show you how to pull real results and make the ML model work for others, not just yourself. You will learn how to perform essential steps like data preprocessing, including how to deal with null values and outliers. Next, you’ll see how to explore your data to frame your analysis, along with visualization techniques that help you see the relationships in your data. After that, the course covers feature engineering, one of the most important steps for improving your model’s results. That leads to cross-validation and how to handle the bias-variance trade-off in your analysis. Finally, the instructors touch briefly on the model tuning process, how to productionize your work, and how to document it.

Learn for Free

1.1 Introduction

3 min

1.2 ML Process Course - GitHub repository

1 min

1.3 Meet your instructors

1 min

1.4 How to use this course

1 min

1.5 Additional resources

1 min

1.6 Environment setup

1 min

Curriculum

  • 1. Course Introduction
    12 Lessons 21 Min

    Machine learning has multiple benefits. The first is consistency: an algorithm can evaluate hundreds of variables, find relationships a person would miss, and make the same prediction every time. Humans, on the other hand, can become overwhelmed by the amount of information and make erratic predictions. When dealing with high volumes of data, a person may reach different conclusions from the same dataset without realizing it.

    Introduction
    3 min
    ML Process Course - GitHub repository Read now
    1 min
    Meet your instructors
    1 min
    How to use this course
    1 min
    Additional resources
    1 min
    Environment setup
    1 min
    Setting up Colab notebooks
    1 min
    Setting up notebooks locally
    2 min
    Setting up your flashcards - Explainer video
    7 min
    Setting up your flashcards - Link Read now
    1 min
    Why Machine Learning
    1 min
    Why learn the ML process?
    1 min
  • 2. Intro to Machine Learning
    10 Lessons 15 Min

    Learn when you should and shouldn’t use machine learning. We teach you when in your organization’s lifecycle to implement ML and what real-life problems you can solve. We also cover the types of ML problems and show you the difference between product analytics and data products.

    Real-world ML examples
    1 min
    Applying ML
    1 min
    When is ML needed?
    2 min
    When do we use ML?
    2 min
    When to avoid using ML?
    1 min
    Real-life ML process
    3 min
    Supervised learning vs. Unsupervised learning
    2 min
    Regression vs. Classification
    1 min
    ML for Product analytics
    1 min
    ML for Data products
    1 min
  • 3. The Modeling Process
    7 Lessons 8 Min

    The most important step in machine learning is not any fancy technique—it’s properly defining the problem. An incorrect definition can result in months—or even years—of wasted resources. That is why, before we can leverage the power of ML, we need a reason to use it. Whether it’s a directive from a manager, a pain point from a stakeholder, or a gap in the business, you need to properly frame the problem first and foremost.

    Problem framing
    1 min
    Understanding the problem
    1 min
    Defining the problem
    1 min
    Business impact
    2 min
    Mapping out solutions
    1 min
    Understanding the data
    1 min
    Checklist
    1 min
  • 4. Data Collection
    6 Lessons 8 Min

    Most projects won’t give you the data in a neat and tidy .csv file like they do in class or on Kaggle. If you work at a big company, it is likely that you will have plenty of data readily available to you. However, you’ll still need to pull it yourself. In this section, we will talk about data collection techniques from most structured to least structured.

    Data collection overview
    1 min
    Databases
    2 min
    APIs
    2 min
    Online data capture
    1 min
    Web scraping
    1 min
    Survey
    1 min
  • 5. Data Preprocessing
    13 Lessons 35 Min

    After defining our problem, we need to preprocess our data. Keep in mind that different models have different requirements. For example, linear regression cannot handle null values, whereas some tree-based models can. Additionally, deviant data points can significantly skew the results of some models but not others: linear models are sensitive to outliers, whereas tree-based models are much less so. For outlier-sensitive models, we may want to remove or adjust outliers to get the best performance. After all, we want to make sure our data is of the highest quality before we start building our models.

    Intro
    2 min
    Dealing with null values
    1 min
    Types of null values
    4 min
    Missing values - Coding examples Read now
    1 min
    Missing values - Coding portion
    7 min
    Strategies for handling null values
    4 min
    Dealing with outliers
    1 min
    Outliers - Coding portion
    8 min
    Implications of outliers
    1 min
    Detecting outliers
    1 min
    Causes of outliers
    1 min
    Treating outliers
    3 min
    Outliers - Coding examples Read now
    1 min
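As a rough illustration of the null-value and outlier handling described in this section, here is a minimal, dependency-free Python sketch. It is not taken from the course notebooks: the `ages` column, the helper names `impute_median` and `iqr_outlier_mask`, and the choice of median imputation with a 1.5 × IQR rule are all assumptions made for the example.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outlier_mask(values, k=1.5):
    """Return True for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [not (low <= v <= high) for v in values]


# Hypothetical column with two missing entries and one extreme value.
ages = [24, 25, None, 31, 28, None, 27, 95]

filled = impute_median(ages)
mask = iqr_outlier_mask(filled)
cleaned = [v for v, is_outlier in zip(filled, mask) if not is_outlier]

print("filled: ", filled)
print("cleaned:", cleaned)
```

Whether a flagged value should be dropped, capped, or kept depends on what caused it and on how sensitive the chosen model is to outliers.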
  • 6. Exploratory Data Analysis for ML
    15 Lessons 52 Min

    Exploratory data analysis is an important prerequisite to the modeling process. With EDA, we get a feel for which modeling assumptions our data satisfies, so we can choose a suitable model and get more accurate results. We perform this step to understand the shape of our data and the relationships between the variables.

    Overview
    1 min
    Why you need EDA
    2 min
    How to approach EDA
    1 min
    Distributions and single variable plots
    3 min
    Relationships and multi variable plots
    1 min
    Scatterplots
    1 min
    Correlation
    1 min
    Correlation matrix
    1 min
    Bar charts
    1 min
    Line charts
    1 min
    Pivot table
    1 min
    How to generate useful insights
    1 min
    Basic EDA - Coding examples Read now
    1 min
    Basic EDA - Coding portion (1 of 2)
    10 min
    Basic EDA - Coding portion (2 of 2)
    26 min
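To make the single-variable summary and pivot table ideas from this section concrete, here is a small sketch in plain Python. The housing `rows`, the column names, and the `pivot_mean` helper are hypothetical and chosen only for illustration; in practice a dataframe library would typically do this work.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical listings used only for this example.
rows = [
    {"city": "Austin", "type": "condo", "price": 250},
    {"city": "Austin", "type": "house", "price": 400},
    {"city": "Boston", "type": "condo", "price": 450},
    {"city": "Boston", "type": "house", "price": 700},
    {"city": "Boston", "type": "house", "price": 900},
]

# Single-variable summary: a mean above the median hints at right skew.
prices = [r["price"] for r in rows]
summary = {
    "min": min(prices),
    "median": median(prices),
    "mean": mean(prices),
    "max": max(prices),
}


def pivot_mean(rows, index, columns, values):
    """Average `values` for every (index, columns) combination."""
    cells = defaultdict(list)
    for r in rows:
        cells[(r[index], r[columns])].append(r[values])
    return {key: mean(vals) for key, vals in cells.items()}


table = pivot_mean(rows, "city", "type", "price")

print(summary)
for (city, kind), avg in sorted(table.items()):
    print(f"{city:7s} {kind:6s} {avg}")
```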
  • 7. Feature Engineering
    27 Lessons 44 Min

    The feature engineering step will likely yield the highest accuracy gains in our modeling process, even more than the selection of the algorithm itself. The core principle is to find variables that have predictive power. Regardless of how complex a feature is or how fancy the technique behind it, if it does not add predictive power, it does nothing for our model. Remember: complexity doesn’t create value. Creative and domain-specific features create value.

    Intro
    1 min
    Categorical features
    1 min
    Categorical features - Coding examples Read now
    1 min
    Categorical features - Coding portion
    10 min
    One-hot encoding
    1 min
    Ordinal encoding
    2 min
    Frequency encoding
    1 min
    Target encoding
    2 min
    Probability ratio encoding
    2 min
    Weight of evidence encoding
    2 min
    Binning
    1 min
    Feature engineering for continuous variables
    1 min
    Scaling
    1 min
    Normalization
    1 min
    MinMax scaling
    1 min
    Z-score normalization
    1 min
    Robust scaler
    1 min
    Types of transformations
    2 min
    Logarithmic transformations
    2 min
    Exponential transformations
    1 min
    Square root transformations
    1 min
    Box-Cox transformations
    1 min
    Arithmetic interactions
    1 min
    Binning
    1 min
    Creative features
    1 min
    Continuous feature scaling - Coding examples Read now
    1 min
    Continuous feature scaling - Coding portion
    3 min
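A few of the encodings and scalings listed in this section can be sketched in plain Python as follows. The `colors` and `incomes` columns and all function names are assumptions made for the example, and the z-score here uses the population standard deviation; the course notebooks may do this differently.

```python
from collections import Counter
from statistics import mean, pstdev


def one_hot(values):
    """One indicator column per category, in sorted category order."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


def frequency_encode(values):
    """Replace each category with its relative frequency."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]


def min_max_scale(values):
    """Rescale to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical columns.
colors = ["red", "blue", "red", "green"]
incomes = [30.0, 50.0, 70.0, 90.0]

encoded, categories = one_hot(colors)
freq = frequency_encode(colors)
scaled = min_max_scale(incomes)
standardized = z_score(incomes)

print(categories, encoded)
print(freq)
print(scaled)
print(standardized)
```

Scaling parameters such as the minimum, maximum, mean, and standard deviation are normally computed on the training data only and then reused on new data.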
  • 8. Cross Validation
    12 Lessons 33 Min

    Cross-validation allows us to test our model on data that the model has not been trained on. If it performs well on this unseen data, we can be more confident that our predictions will generalize well.

    Cross validation
    1 min
    Train-test-split
    2 min
    Overfitting and underfitting
    2 min
    Bias/Variance tradeoff
    1 min
    Bias
    1 min
    Variance
    2 min
    K-fold cross validation
    2 min
    Leave one out cross validation
    1 min
    Time series cross validation
    1 min
    Monte carlo cross validation
    1 min
    Cross validation - Coding portion
    18 min
    Cross validation - Coding examples Read now
    1 min
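The idea behind K-fold cross-validation can be sketched with a short index generator. This is a minimal illustration using only the standard library; the function name `k_fold_indices`, the shuffling step, and the fixed seed are assumptions, not the course's implementation.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=10, k=5))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, "train:", sorted(train_idx), "test:", sorted(test_idx))
```

Each sample lands in the test set exactly once, so averaging a metric over the folds uses every observation for evaluation while never scoring a model on data it was trained on. For time series data, shuffling would leak future information, which is why a separate time series scheme is covered in this section.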
  • 9. Feature Selection
    6 Lessons 18 Min

    Feature selection is the process of reducing the model’s complexity by selecting only the highest-signal features in the dataset. Typically, there are three classes of feature selection techniques. Wrapper methods search through subsets of features and measure model performance on each. They tend to give the best accuracy but are the most computationally expensive of the three. Next, filter methods score features using univariate statistics. These tend to be faster and easier to check since they don’t rely on cross-validation. Lastly, embedded methods are built into the ML algorithm itself. We cover these in the companion course, Machine Learning Algorithms.

    Feature selection
    2 min
    Feature selection - Coding examples Read now
    1 min
    Feature selection - Coding portion
    11 min
    Wrapper methods
    2 min
    Filter methods
    1 min
    Embedded methods
    1 min
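A filter method can be sketched by scoring each feature with a univariate statistic and keeping the top scorers. The example below uses absolute Pearson correlation with the target as that statistic, which is one common choice among several; the feature names, the data, and the `select_top_k` helper are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def select_top_k(features, target, k):
    """Rank features by |correlation with target| and keep the best k."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical housing features and target.
features = {
    "sqft": [800, 1000, 1200, 1500, 2000],
    "age": [40, 35, 20, 15, 5],
    "door_number": [7, 3, 9, 1, 5],
}
price = [160, 210, 235, 310, 390]

selected, scores = select_top_k(features, price, k=2)
print(selected)
print({name: round(score, 3) for name, score in scores.items()})
```

Because each feature is scored on its own, no model is trained and no cross-validation is needed, which is what makes filter methods fast. The trade-off is that interactions between features are ignored.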
  • 10. Dealing with Imbalanced Data
    9 Lessons 26 Min

    How do we deal with imbalance in our data? This is where oversampling, undersampling, and other techniques become useful. We show you how to perform these to optimize your model’s performance.

    Intro
    2 min
    Random undersampling
    1 min
    Random oversampling
    1 min
    Synthetic minority oversampling
    1 min
    Borderline SMOTE
    1 min
    Safe-level SMOTE
    1 min
    Adaptive synthetic oversampling
    2 min
    Dealing with imbalanced data - Coding examples Read now
    1 min
    Dealing with imbalanced data - Coding portion
    16 min
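Random oversampling and undersampling, the two simplest techniques in this section, can be sketched as below. This is an illustrative standard-library version with hypothetical data and function names; the SMOTE variants listed above generate synthetic points instead of copying existing ones and are not shown here.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


# Hypothetical dataset: 8 negatives, 2 positives.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
print(Counter(y_over), Counter(y_under))
```

Resampling is normally applied to the training split only, so that the test data keeps the class balance the model will face in practice.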
  • 11. Modeling
    7 Lessons 47 Min

    In this section of the course, we’ll focus on ML modeling techniques such as the baseline model, model selection, hyperparameter tuning, and ensembling.

    Modeling
    1 min
    Creating a model baseline
    1 min
    Model selection
    2 min
    Parameter tuning
    6 min
    Ensemble models
    3 min
    The ML modelling process basics - Coding examples Read now
    1 min
    The ML modelling process basics - Coding portion
    33 min
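The value of a baseline can be shown with a tiny example: a model is only worth keeping if it beats something trivial on held-out data. The sketch below compares a predict-the-mean baseline with a one-variable least-squares line. The data, the train/test split, and the function names are assumptions made for illustration.

```python
from statistics import mean


def fit_mean_baseline(y_train):
    """Baseline: always predict the training mean."""
    mu = mean(y_train)
    return lambda x: mu


def fit_simple_linear(x_train, y_train):
    """Least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x_train), mean(y_train)
    slope = sum((x - mx) * (y - my) for x, y in zip(x_train, y_train)) / sum(
        (x - mx) ** 2 for x in x_train
    )
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def mean_absolute_error(model, xs, ys):
    """Average absolute difference between predictions and targets."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


# Hypothetical data, split into training and held-out test portions.
x_train, y_train = [1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]
x_test, y_test = [7, 8], [14.0, 16.1]

baseline = fit_mean_baseline(y_train)
linear = fit_simple_linear(x_train, y_train)

baseline_mae = mean_absolute_error(baseline, x_test, y_test)
linear_mae = mean_absolute_error(linear, x_test, y_test)
print(f"baseline MAE: {baseline_mae:.3f}, linear MAE: {linear_mae:.3f}")
```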
  • 12. Model evaluation
    17 Lessons 53 Min

    We can’t measure the value of our model without proper evaluation criteria. Earlier in this course, we touched on two types of Supervised ML problems—regression and classification. Since these problems use different target variables, we use different metrics to evaluate the quality of the model.

    Model evaluation
    1 min
    Classification evaluation
    2 min
    Classification metrics - Coding examples Read now
    1 min
    Precision and Recall
    4 min
    Accuracy, Precision, Recall - Coding portion
    10 min
    ROC curves
    6 min
    ROC AUC
    4 min
    F1-Score
    2 min
    PR-AUC
    2 min
    Classification metrics - Coding portion
    4 min
    Log loss
    2 min
    Regression
    1 min
    Regression metrics - Coding examples Read now
    1 min
    R-squared / Adjusted R-squared
    3 min
    Mean absolute error
    1 min
    Root mean squared error
    2 min
    Regression notebook - Coding portion
    7 min
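Several of the metrics listed in this section follow directly from their definitions. The sketch below computes accuracy, precision, recall, and F1 for a binary classifier, and MAE, RMSE, and R-squared for a regressor. The labels, predictions, and function names are hypothetical; library implementations handle more edge cases.

```python
from math import sqrt


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """MAE, RMSE, and R-squared from residuals."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": sqrt(ss_res / n),
        "r2": 1 - ss_res / ss_tot,
    }


# Hypothetical labels and predictions.
clf = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(clf)
print(reg)
```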
  • 13. Productionization
    3 Lessons 7 Min

    The goal of our work is to make it useful to other people. How do we do that? Through something called productionization. It is a little broader than simply deploying the model into your product; rather it is the delivery of our model to our end user in whatever form.

    Saving your models
    2 min
    Types of production outputs
    4 min
    Model maintenance
    1 min
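Saving a model means persisting whatever is needed to reproduce its predictions later. As a simple sketch, the example below stores the parameters of a hypothetical linear model as JSON and reloads them. The parameter names, file name, and helper functions are assumptions; fitted objects from ML libraries are more commonly serialized with tools such as `pickle` or `joblib`.

```python
import json
import os
import tempfile

# Hypothetical fitted parameters plus a version tag for maintenance.
model_params = {"model": "simple_linear", "slope": 2.0, "intercept": 0.5, "version": 1}


def save_model(params, path):
    """Write model parameters to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f)


def load_model(path):
    """Read model parameters back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def predict(params, x):
    """Apply the stored linear model to a single input."""
    return params["intercept"] + params["slope"] * x


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    save_model(model_params, path)
    restored = load_model(path)

print(predict(restored, 3.0))
```

Storing a version alongside the parameters makes it easier to tell which model produced a given prediction once the model is retrained.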
  • 14. Conclusion
    1 Lesson 6 Min

    As the course goes through different steps of the modeling process, you’re probably wondering, “How am I supposed to remember each step?” The good news is that building an ML model typically follows a very similar process each time: (1) Problem Framing/Business Understanding; (2) Data Cleaning/Preparation; (3) Exploratory Data Analysis; (4) Feature Engineering; (5) Cross-Validation; (6) Modeling; (7) Evaluation

    Conclusion
    6 min

Topics

  • Machine learning
  • Model evaluation
  • Data preprocessing
  • Machine learning process
  • Data modeling
  • Dealing with imbalanced data
  • Cross validation
  • Feature engineering
  • Exploratory data analysis

Tools & Technologies

Python

Course Requirements

  • You need to complete an introduction to Python before taking this course
  • Basic skills in statistics, probability, and linear algebra are required
  • It is highly recommended to take the Machine Learning in Python course first
  • You will need to install the Anaconda distribution, which includes Jupyter Notebook

Who Should Take This Course?

Level of difficulty: Advanced

  • Aspiring data scientists and ML engineers
  • Existing data scientists and ML engineers who want to boost their skills and learn from world-class experts

Exams and Certification

A 365 Data Science Course Certificate is an excellent addition to your LinkedIn profile—demonstrating your expertise and willingness to go the extra mile to accomplish your goals.

Meet Your Instructor

Ken Jee

Content Creator on

4 Courses

2482 Reviews

37746 Students

Ken has held data science positions at companies of all sizes, from startups to Fortune 100 organizations. He is a Senior Data Scientist who is very passionate about content creation. Thanks to his friendly delivery style and willingness to share knowledge, Ken Jee is the perfect role model for anyone who wants to start a career in data science. We strongly recommend watching some of Ken’s most popular YouTube videos, such as “How I would Learn Data Science (If I Had to Start Over)”. If you are into learning and sharing your progress with others, you can also check out Ken’s #66DaysOfData hashtag on LinkedIn.

365 Data Science Is Featured at

Our top-rated courses are trusted by businesses worldwide.