Data Preprocessing with NumPy

Name: Data Preprocessing with NumPy Course
Price: 36 USD

with Viktor Mehandzhiyski

4.8/5

(1,077)

Master Python’s key NumPy package: Apply essential techniques for efficient data preprocessing and analysis

8 hours of content 24037 students

What you get:

8 hours of content
55 Interactive exercises
30 Coding exercises
48 Downloadable resources
World-class instructor
Closed captions
Q&A support
Future course updates
Course exam
Certificate of achievement

$99.00

Lifetime access

What You Learn

Add the popular NumPy library to your data analysis skillset to enhance your capabilities
Learn how to install and import Python packages
Gain proficiency in using NumPy’s ndarray for slicing and dimensionality reduction, optimizing data for analysis
Explore and master different ways to clean and preprocess data in NumPy
Solve real-world data preprocessing problems with NumPy
Elevate your career with advanced NumPy skills, making your resume stand out to recruiters and hiring managers

Top Choice of Leading Companies Worldwide

Industry leaders and professionals globally rely on this top-rated course to enhance their skills.

Course Description

This course is designed to show you how to work with one of Python’s fundamental packages – NumPy. You will learn what a “package” is and see how to install, upgrade and import it. By the time you finish the course, you’ll be comfortable with NumPy’s ndarray class, know how to slice its instances and reduce their dimensions, and be able to find what you need in the documentation quickly. Furthermore, you’ll be ready to take advantage of NumPy’s various built-in functions and methods, which we’ll use to generate random and non-random data, import and export data to and from Python, find statistical values for a dataset, and clean and preprocess ndarrays.

Learn for Free

1.1 Course Introduction

5 min

1.2 The NumPy Package and Its Applications

4 min

1.3 Installing and Upgrading NumPy

2 min

1.5 What is an array?

3 min

1.8 Using The NumPy Documentation

5 min

1.10 Frequently Asked Questions

1 min

Curriculum

1. Introduction to NumPy

6 Lessons 20 Min

This introductory section presents the NumPy package and its applications. You’ll learn how to install and upgrade NumPy, before quickly learning about its most important assets – “arrays”. We’ll also go over how to use the documentation - an extremely useful component for our work later on in the course.
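
As a rough illustration of the setup steps this section covers, the sketch below imports NumPy, checks the installed version and creates a first array. The variable name is chosen here for illustration, and the install commands are shown only as comments because they run in a terminal rather than in Python.

```python
# Installation and upgrades happen in a terminal, not inside Python, e.g.:
#   pip install numpy
#   pip install --upgrade numpy
# (with Anaconda, `conda install numpy` is the usual route)

import numpy as np  # "np" is the conventional alias

# Confirm which version is installed
print(np.__version__)

# Create a first array from a Python list (illustrative name)
first_array = np.array([1, 2, 3])
print(type(first_array))

# The built-in help system shows a function's documentation, e.g.:
#   help(np.array)
```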

Course Introduction
5 min
The NumPy Package and Its Applications
4 min
Installing and Upgrading NumPy
2 min
What is an array?
3 min
Using The NumPy Documentation
5 min
Frequently Asked Questions
1 min
2. Why do we use NumPy?

3 Lessons 20 Min

This section follows NumPy’s role in the development of Python and takes a closer look at ndarrays. We discuss what makes them so useful and compare them to a similar-looking data structure – Python lists.
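
One way to see the difference between arrays and lists is to apply the same operators to both. The following is a minimal sketch with made-up values: operators on a list repeat or concatenate it, while on an ndarray they act element by element.

```python
import numpy as np

py_list = [1, 2, 3]
np_array = np.array([1, 2, 3])

# Multiplying a list repeats it; multiplying an array scales each element
repeated_list = py_list * 2     # [1, 2, 3, 1, 2, 3]
scaled_array = np_array * 2     # array([2, 4, 6])

# Adding lists concatenates them; adding arrays sums matching elements
joined_list = py_list + [10, 20, 30]               # [1, 2, 3, 10, 20, 30]
summed_array = np_array + np.array([10, 20, 30])   # array([11, 22, 33])

print(repeated_list, scaled_array)
print(joined_list, summed_array)
```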

History of NumPy
3 min
Ndarrays
10 min
Arrays vs Lists
7 min
3. NumPy Fundamentals

6 Lessons 29 Min

Here, we focus on the basic NumPy syntax. You’ll learn about “indexing” and the different ways of assigning values to an array. This section also explains the elementwise properties of arrays, as we go over the different types of data we can store in them. In addition, we’ll take a look at some of the most important characteristics and properties of NumPy functions.
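
The basics listed in this section might look like the sketch below. The array names and values are illustrative and not taken from the course materials.

```python
import numpy as np

array_a = np.array([[1, 2, 3],
                    [4, 5, 6]])

# Indexing: a whole row, then a single element (row 1, column 2)
first_row = array_a[0]
single_value = array_a[1, 2]

# Assigning values: one element, then an entire column
array_b = array_a.copy()
array_b[0, 0] = 9
array_b[:, 1] = 0

# Elementwise behaviour: the operation is applied to every element
times_ten = array_a * 10

# Data types: request a dtype explicitly when creating an array
as_float = np.array([1, 2, 3], dtype=np.float64)

# Many NumPy functions accept an axis argument; axis=0 works down the columns
column_sums = np.sum(array_a, axis=0)

print(array_b)
print(column_sums)
```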

Indexing
6 min
Assigning Values
4 min
Elementwise Properties
4 min
Types of Data Supported by NumPy
6 min
Characteristics of NumPy Functions - Part 1
5 min
Characteristics of NumPy Functions - Part 2
4 min
4. Working with Arrays

4 Lessons 27 Min

This section explores the concept of slicing and how its many variations can be applied to ndarrays. You’ll grasp what “dimensions” are when it comes to arrays and learn how the “squeeze” function and method work.
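
The slicing variations named in this section's lessons, together with np.squeeze(), can be sketched on a small made-up matrix. The names and values below are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# Basic slicing: first two rows, columns 1 onward
basic = matrix[:2, 1:]

# Stepwise slicing: every second row and every second column
stepwise = matrix[::2, ::2]

# Conditional slicing: a boolean mask keeps only the even values (result is 1-D)
evens = matrix[matrix % 2 == 0]

# Slicing with 1:2 keeps the column as a 2-D array of shape (3, 1);
# squeeze removes the length-1 dimension, leaving shape (3,)
column_2d = matrix[:, 1:2]
column_1d = np.squeeze(column_2d)

print(column_2d.shape, column_1d.shape)
```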

Basic Slicing
10 min
Stepwise Slicing
5 min
Conditional Slicing
5 min
Dimensions and the Squeeze Function
7 min
5. Generating Data with NumPy

7 Lessons 32 Min

This part of the course explains how to generate arrays of random and non-random data. We begin by creating “empty” arrays, as well as basic arrays of 1s and 0s, before moving on to random generators. Then, we introduce NumPy’s capabilities of generating pseudo-random data pulled from a probability distribution. The section concludes with the applications of generating pseudo-random data.
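
A short sketch of the generators described here follows. It assumes the np.random.default_rng() interface; the course may use a different way of constructing generators, and the seed value is arbitrary.

```python
import numpy as np

# Non-random data
zeros = np.zeros((2, 3))
ones = np.ones((2, 3), dtype=int)
empty = np.empty((2, 3))             # allocated but uninitialised: contents are arbitrary
zeros_like_ones = np.zeros_like(ones)  # same shape and dtype as `ones`
sequence = np.arange(0, 10, 2)       # start, stop (exclusive), step

# Pseudo-random data: the same seed reproduces the same draws
rng_a = np.random.default_rng(seed=365)
rng_b = np.random.default_rng(seed=365)
draws_a = rng_a.integers(0, 10, size=5)
draws_b = rng_b.integers(0, 10, size=5)

# Values drawn from a probability distribution (normal, mean 0, std 1)
normal_sample = rng_a.normal(loc=0.0, scale=1.0, size=(2, 3))

print(sequence)
print(draws_a, draws_b)
```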

Arrays of 0s and 1s
6 min
"_like" functions in NumPy
3 min
A Non-Random Sequence of Numbers
5 min
Random Generators and Seeds
5 min
Basic Random Functions in NumPy
4 min
Probability Distributions in NumPy
5 min
Applications of Random Data in NumPy
4 min
6. Importing and Saving Data with NumPy

6 Lessons 39 Min

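In this section of the course, we focus on importing and exporting (also known as saving) data with the NumPy package. We discuss the differences between “np.loadtxt()” and “np.genfromtxt()” and their applications. We’ll examine NumPy’s capabilities to partially clean datasets as we import them. Later in the section, you’ll learn why you need to import a file as a specific datatype and how choosing the wrong one can affect your results. We continue with saving ndarrays to external files, where you’ll discover what .npy and .npz files are and when (and how) to export arrays in those formats. Finally, we show a more conventional approach: saving arrays as text files.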
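
The import and export functions named in this section's lessons can be tried on a tiny file. The sketch below writes a hypothetical CSV with one missing field into a temporary directory; the file names and values are assumptions made for illustration.

```python
import os
import tempfile
import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file: the second data row has an empty "amount" field
    csv_path = os.path.join(tmp_dir, "sample.csv")
    with open(csv_path, "w") as f:
        f.write("id,amount\n1,100.5\n2,\n3,250.0\n")

    # np.genfromtxt() tolerates the missing field and stores it as NaN
    data = np.genfromtxt(csv_path, delimiter=",", skip_header=1)

    # np.loadtxt() is stricter and is expected to fail on the empty field
    try:
        np.loadtxt(csv_path, delimiter=",", skiprows=1)
        loadtxt_ok = True
    except ValueError:
        loadtxt_ok = False

    # .npy stores a single array in NumPy's binary format
    npy_path = os.path.join(tmp_dir, "data.npy")
    np.save(npy_path, data)
    restored = np.load(npy_path)

    # .npz bundles several named arrays into one file
    npz_path = os.path.join(tmp_dir, "arrays.npz")
    np.savez(npz_path, ids=data[:, 0], amounts=data[:, 1])
    with np.load(npz_path) as archive:
        ids = archive["ids"]

    # Plain text export, readable by other tools
    txt_path = os.path.join(tmp_dir, "data.csv")
    np.savetxt(txt_path, data, delimiter=",", fmt="%.2f")
    with open(txt_path) as f:
        first_line = f.readline().strip()

print(data)
print(first_line)
```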

np.loadtxt() vs np.genfromtxt()
11 min
Simple Cleaning when Importing
7 min
String vs Object vs Numbers
7 min
np.save()
5 min
np.savez()
5 min
np.savetxt()
4 min
7. Statistics with NumPy

8 Lessons 42 Min

This section revolves around NumPy’s capabilities to compute important characteristics or statistics from an array. These include minimal and maximal values, various forms of averages, covariances and correlations, as well as histograms. You’ll also learn about the NaN-equivalent functions and how to use them.
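
A few of the statistical functions listed in this section's lessons, applied to a small made-up array, might look like this. The values are illustrative only.

```python
import numpy as np

values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

smallest = np.min(values)               # over the whole array
column_max = np.max(values, axis=0)     # one maximum per column
row_means = np.mean(values, axis=1)     # one mean per row
median = np.median(values)
variance = np.var(values)               # population variance by default (ddof=0)

# Covariance and correlation between the two rows
covariance = np.cov(values[0], values[1])
correlation = np.corrcoef(values[0], values[1])

# Histogram: counts per bin and the bin edges
counts, bin_edges = np.histogram(values, bins=3)

# A single NaN makes np.mean() return NaN; the nan-equivalent version ignores it
with_nan = np.array([1.0, np.nan, 3.0])
plain_mean = np.mean(with_nan)
nan_mean = np.nanmean(with_nan)

print(row_means, median, variance)
print(counts, bin_edges)
print(plain_mean, nan_mean)
```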

Using Statistical Functions in NumPy
8 min
Minimal and Maximal Values in NumPy
6 min
Statistical Order Functions in NumPy
6 min
Averages and Variance in NumPy
4 min
Covariance and Correlation in NumPy
3 min
Histograms in NumPy (Part 1)
8 min
Histograms in NumPy (Part 2)
4 min
NAN Equivalent Functions in NumPy
3 min
8. Data Manipulation with NumPy

13 Lessons 95 Min

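In this part of the NumPy course, we explore ways to clean and preprocess data in NumPy. You’ll understand how to find and fill missing values, reshape an array, delete excess data, as well as sort, shuffle and cast ndarrays. The section also explains what argument functions are and why they are so useful, and introduces ways to combine arrays by stacking and concatenating them. Finally, you’ll discover how to extract the unique values of an array and why this can be important for your analysis.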
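
Several of the manipulation steps listed in this section's lessons can be chained on one small array. The sketch below uses made-up values and fills missing entries with the column mean, which is only one possible substitution strategy.

```python
import numpy as np

data = np.array([[3.0, np.nan, 1.0],
                 [2.0, 5.0, np.nan]])

# Checking for missing values
missing_mask = np.isnan(data)
n_missing = missing_mask.sum()

# Substituting missing values with the mean of each column (illustrative choice)
column_means = np.nanmean(data, axis=0)
filled = np.where(missing_mask, column_means, data)

# Reshaping, removing, sorting
reshaped = filled.reshape(3, 2)
without_first_column = np.delete(filled, 0, axis=1)
sorted_rows = np.sort(filled, axis=1)

# Argument functions return positions rather than values
order = np.argsort(filled[0])
positions = np.argwhere(filled > 4)

# Casting, stacking, concatenating, unique values
as_int = filled.astype(np.int32)
stacked = np.vstack((filled[0], filled[1]))
joined = np.concatenate((filled[0], filled[1]))
unique_values, unique_counts = np.unique(filled, return_counts=True)

print(filled)
print(unique_values, unique_counts)
```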

Checking for Missing Values in Ndarrays
9 min
Substituting Missing Values in Ndarrays
8 min
Reshaping Ndarrays
7 min
Removing Values from Ndarrays
4 min
Sorting Ndarrays
10 min
Argument Sort in NumPy
6 min
Argument Where in NumPy
11 min
Shuffling Ndarrays
7 min
Casting Ndarrays
6 min
Stripping Values from Ndarrays
5 min
Stacking Ndarrays
11 min
Concatenating Ndarrays
6 min
Finding Unique Values in Ndarrays
5 min
9. A Loan Data Practical Example with NumPy

15 Lessons 88 Min

In this final section, we apply the techniques from the course to a loan dataset. We set up the example by importing the data, checking for incomplete data, splitting the dataset and creating checkpoints. We then work through the text columns one by one (issue date, loan status and term, grade and sub grade, verification status and URL, state address) and convert the strings. Next, we handle the numeric columns by substituting filler values and converting amounts from USD to EUR using exchange rates. Finally, we complete the dataset.
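
The kinds of steps named in this section's lessons could be sketched as below. Everything here is hypothetical: the status labels, the amounts, the filler value and the exchange rates are invented for illustration and are not taken from the course's loan dataset.

```python
import numpy as np

# Hypothetical text column: map each loan status to 1 (good) or 0 (bad)
loan_status = np.array(["Fully Paid", "Charged Off", "Current"])
bad_statuses = np.array(["Charged Off", "Default"])   # assumed labels
status_numeric = np.where(np.isin(loan_status, bad_statuses), 0, 1)

# Hypothetical numeric column with one missing amount
funded_usd = np.array([1000.0, np.nan, 400.0])
filler_value = 0.0                                    # assumed filler
funded_usd_filled = np.where(np.isnan(funded_usd), filler_value, funded_usd)

# Hypothetical exchange rates (USD per 1 EUR), one per loan
usd_per_eur = np.array([1.10, 1.25, 1.00])
funded_eur = funded_usd_filled / usd_per_eur

# Combine the processed columns into one numeric array
processed = np.column_stack((status_numeric, funded_usd_filled, funded_eur))

print(processed)
```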

Setting Up: Introduction to the Practical Example
5 min
Setting Up: Importing the Data Set
4 min
Setting Up: Checking for Incomplete Data
5 min
Setting Up: Splitting the Dataset
5 min
Setting Up: Creating Checkpoints
3 min
Manipulating Text Data: Issue Date
5 min
Manipulating Text Data: Loan Status and Term
7 min
Manipulating Text Data: Grade and Sub Grade
9 min
Manipulating Text Data: Verification Status & URL
5 min
Manipulating Text Data: State Address
6 min
Manipulating Text Data: Converting Strings and Creating a Checkpoint
3 min
Manipulating Numeric Data: Substitute Filler Values
8 min
Manipulating Numeric Data: Currency Change – The Exchange Rate
7 min
Manipulating Numeric Data: Currency Change - From USD to EUR
8 min
Completing the Dataset:
8 min

Topics

Python, Programming, Data Analysis, Data Processing, NumPy, Data Preprocessing

Course Requirements

Highly recommended to take the Intro to Python course first
You will need to install the Anaconda package, which includes Jupyter Notebook

Who Should Take This Course?

Level of difficulty: Intermediate

Aspiring data analysts, data scientists, data engineers, AI engineers
Graduate students who need Python and NumPy for their studies

Exams and Certification

A 365 Data Science Course Certificate is an excellent addition to your LinkedIn profile—demonstrating your expertise and willingness to go the extra mile to accomplish your goals.

Meet Your Instructor

Viktor Mehandzhiyski

Data Scientist at

3 Courses

3136 Reviews

67935 Students

A Hamilton College graduate, Viktor has a strong analytics background, focusing on the fields of Statistics, Econometrics, Financial Time-Series Econometrics, and Behavioral Economics. Viktor’s coding experience is rather diverse – from working with C, C++, and Python through to the more math/econ-oriented MATLAB and STATA. He has been fascinated by coding algorithms since the age of 11 and describes himself as a “Bachelor of Science and overall cool guy”. We couldn’t agree more. Some of Viktor’s personal achievements include developing a model for forecasting transfer prices of soccer players across Europe’s top divisions and Stock Market Indexes analysis on the effects of contagion on the effectiveness of international portfolio diversification.