Introducing you to the exciting topic of machine learning with the K-nearest neighbors algorithm using Python’s scikit-learn library.
Practical knowledge of various machine learning algorithms is essential for machine learning enthusiasts and experts alike. In this course, we focus extensively on one of the most intuitive and easy-to-implement ML algorithms out there – K-nearest neighbors, or KNN for short. Step by step, we will first lay the foundations and expand your mathematical toolbox. Then, you will progress to coding and using Python’s scikit-learn library to solve a randomly generated classification problem. Finally, you will apply KNN to a couple of regression tasks. In other words – you will learn all the subtleties that should be considered when applying the KNN algorithm in your future practice.
Aiming to upgrade your machine learning skills? This course will help you do exactly that.
In this introductory section, we motivate the usage of a K-nearest neighbors classifier and give two intuitive examples. Following that is a short math refresher, where we discuss the various ways in which the distance between two points in space can be defined. This will later serve us well when building our KNN model in Python.

Lessons: Motivation; Math Prerequisites: Distance Metrics
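As a small taste of the distance-metrics refresher, here is a sketch (not taken from the course materials) of the two most common ways to measure the distance between two points, using NumPy; the point coordinates are illustrative:

```python
import numpy as np

# Two illustrative points in 2D space
p = np.array([1.0, 2.0])
q = np.array([4.0, 6.0])

# Euclidean distance: the straight-line distance,
# i.e. the square root of the sum of squared coordinate differences
euclidean = np.sqrt(np.sum((p - q) ** 2))

# Manhattan distance: the sum of absolute coordinate differences,
# as if moving along a city grid
manhattan = np.sum(np.abs(p - q))

print(euclidean)  # 5.0
print(manhattan)  # 7.0
```

Which metric you choose changes which neighbors count as "nearest," and therefore can change the KNN prediction itself.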
In advance of the hands-on part of the course, this section guides you through the installation process of the relevant Python packages.

Lessons: Setting up the Environment; Installing the Relevant Packages
To apply your skills in practice, you will first learn how to generate a random set of points, distribute them into three classes, and place them on the coordinate system. We will then use this dataset to train and test a KNN classification algorithm with the help of Python’s scikit-learn library. We will look at some edge cases that can arise during the classification process and discover how to handle them. Next, we will guide you through the process of building the so-called decision regions, which are a great way of visualizing the performance of your model. Finally, we will find out how to choose the best model parameters using a technique called ‘grid search’.

Lessons: Random Dataset: Generating the Dataset; Random Dataset: Visualizing the Dataset; Random Dataset: Classification; Random Dataset: How to Break a Tie; Random Dataset: Decision Regions; Random Dataset: Choosing the Best K-value; Random Dataset: Grid Search; Random Dataset: Model Performance
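The workflow described above can be sketched in a few lines of scikit-learn. This is an illustrative outline, not the course’s actual code; the dataset parameters and the grid-search range for K are assumptions:

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Generate a random dataset with three classes (illustrative parameters)
X, y = make_blobs(n_samples=300, centers=3, random_state=42)

# Hold out a test set to evaluate model performance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Grid search over K: try several neighbor counts with cross-validation
grid = GridSearchCV(KNeighborsClassifier(),
                    param_grid={"n_neighbors": range(1, 21)},
                    cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)              # the K-value that cross-validated best
print(grid.score(X_test, y_test))     # accuracy on the held-out test set
```

The same fitted classifier can then be evaluated on a dense grid of points to draw the decision regions the section mentions.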
Continuing the practical part of the course, we will dive into solving regression tasks using the K-nearest neighbors method. Similar to what we did in the previous section, we will tackle this problem by generating two random datasets. One represents a linear problem, while the other is non-linear. We will apply a linear (parametric) model and a KNN (non-parametric) model to both datasets and determine which one performs better.

Lessons: Theory with a Practical Example; KNN vs Linear Regression: A Linear Problem; KNN vs Linear Regression: A Non-linear Problem
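To illustrate the parametric-vs-non-parametric comparison, here is a sketch (not the course’s dataset) pitting linear regression against KNN regression on a non-linear problem, a noisy sine wave:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# A non-linear problem: a noisy sine wave (illustrative data)
X = np.sort(rng.uniform(0, 2 * np.pi, 200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

# Parametric model: fits a single straight line to all the data
linear = LinearRegression().fit(X, y)

# Non-parametric model: predicts by averaging the 5 nearest neighbors
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)

print(linear.score(X, y))  # R^2 of the straight-line fit
print(knn.score(X, y))     # R^2 of the KNN fit
```

On data like this, the straight line cannot follow the curve, while KNN adapts to the local shape of the data; on a genuinely linear problem, the comparison tends to flip.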
In this final section of the course, the pros and cons of the KNN algorithm are discussed at length. We will study this method’s limitations, together with its strengths.

Lessons: Pros and Cons
with Hristina Hristova