Data Preprocessing with NumPy

with Viktor Mehandzhiyski
4.8/5
(904)

Master Python’s key NumPy package: Apply essential techniques for efficient data preprocessing and analysis

8 hours of content, 20715 students
What you get:

  • 8 hours of content
  • 57 Interactive exercises
  • 48 Downloadable resources
  • World-class instructor
  • Closed captions
  • Q&A support
  • Future course updates
  • Course exam
  • Certificate of achievement

What You Learn

  • Add the popular NumPy library to your data analysis skillset to enhance your capabilities
  • Learn how to install and import Python packages
  • Gain proficiency in using NumPy’s ndarray for slicing and dimensionality reduction, optimizing data for analysis
  • Explore and master different ways to clean and preprocess data in NumPy
  • Solve real-world data preprocessing problems with NumPy
  • Elevate your career with advanced NumPy skills, making your resume stand out to recruiters and hiring managers

Top Choice of Leading Companies Worldwide

Industry leaders and professionals globally rely on this top-rated course to enhance their skills.

Course Description

This course is designed to show you how to work with one of Python’s fundamental packages – NumPy. You will learn what a “package” is and see how to install, upgrade, and import it. By the time you finish the course, you’ll be comfortable with NumPy’s ndarray class, know how to slice and reduce the dimensions of its instances, and be able to quickly look things up in the documentation. Furthermore, you’ll be ready to take advantage of NumPy’s various built-in functions and methods, which we’ll use to generate random and non-random data, import and export data to and from Python, find statistical values for a dataset, and clean and preprocess ndarrays.
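
As a rough illustration of the ndarray operations mentioned above (slicing, dimension reduction, and basic statistics), here is a minimal sketch. The array values and variable names are made up for illustration and are not taken from the course materials.

```python
import numpy as np

# Hypothetical 3x4 array; the values are illustrative only.
data = np.arange(12).reshape(3, 4)

# Basic slicing: first two rows, last two columns.
block = data[:2, 2:]

# Stepwise slicing: every other column.
every_other = data[:, ::2]

# Conditional slicing: keep elements greater than 6 (returns a 1-D array).
large = data[data > 6]

# Slicing with a range keeps the dimension; squeeze drops length-1 axes.
row_2d = data[1:2, :]        # shape (1, 4)
row_1d = np.squeeze(row_2d)  # shape (4,)

# A simple statistic along an axis: the mean of each column.
col_means = data.mean(axis=0)
```

Note that `data[1:2, :]` keeps two dimensions while `data[1, :]` would not; `np.squeeze` is one way to drop the leftover length-1 axis after the fact.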

Learn for Free

1.1 Course Introduction
5 min

1.2 The NumPy Package and Its Applications
4 min

1.3 Installing and Upgrading NumPy
2 min

1.5 What is an array?
3 min

1.8 Using The NumPy Documentation
5 min

1.10 Frequently Asked Questions
1 min

Curriculum

  • 1. Introduction to NumPy
    6 Lessons 20 Min

    This introductory section presents the NumPy package and its applications. You’ll learn how to install and upgrade NumPy, before quickly learning about its most important assets – “arrays”. We’ll also go over how to use the documentation - an extremely useful component for our work later on in the course.

    Course Introduction
    5 min
    The NumPy Package and Its Applications
    4 min
    Installing and Upgrading NumPy
    2 min
    What is an array?
    3 min
    Using The NumPy Documentation
    5 min
    Frequently Asked Questions
    1 min
  • 2. Why do we use NumPy?
    3 Lessons 20 Min

    This section follows NumPy’s role in the development of Python and takes a closer look at ndarrays. We discuss what makes them so useful and compare them to another similar-looking data structure – Python lists.

    History of NumPy
    3 min
    Ndarrays
    10 min
    Arrays vs Lists
    7 min
  • 3. NumPy Fundamentals
    6 Lessons 29 Min

    Here, we focus on the basic NumPy syntax. You’ll learn about “indexing” and the different ways of assigning values to an array. This section also explains the elementwise properties of arrays, as we go over the different types of data we can store in them. In addition, we’ll take a look at some of the most important characteristics and properties of NumPy functions.

    Indexing
    6 min
    Assigning Values
    4 min
    Elementwise Properties
    4 min
    Types of Data Supported by NumPy
    6 min
    Characteristics of NumPy Functions - Part 1
    5 min
    Characteristics of NumPy Functions - Part 2
    4 min
  • 4. Working with Arrays
    4 Lessons 27 Min

    This section explores the concept of slicing and how its many variations can be applied to ndarrays. You’ll grasp what “dimensions” are when it comes to arrays and learn how the “squeeze” function and method work.

    Basic Slicing
    10 min
    Stepwise Slicing
    5 min
    Conditional Slicing
    5 min
    Dimensions and the Squeeze Function
    7 min
  • 5. Generating Data with NumPy
    7 Lessons 32 Min

    This part of the course explains how to generate arrays of random and non-random data. We begin by creating “empty” arrays, as well as basic arrays of 1s and 0s, before moving on to random generators. Then, we introduce NumPy’s capabilities of generating pseudo-random data pulled from a probability distribution. The section concludes with the applications of generating pseudo-random data.

    Arrays of 0s and 1s
    6 min
    "_like" functions in NumPy
    3 min
    A Non-Random Sequence of Numbers
    5 min
    Random Generators and Seeds
    5 min
    Basic Random Functions in NumPy
    4 min
    Probability Distributions in NumPy
    5 min
    Applications of Random Data in NumPy
    4 min
  • 6. Importing and Saving Data with NumPy
    6 Lessons 39 Min

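    In this section of the course, we focus on importing and exporting (saving) data with the NumPy package. We discuss the differences between “np.loadtxt()” and “np.genfromtxt()” and their applications. We’ll examine NumPy’s capabilities to partially clean datasets as we import them. Later in the section, you’ll learn why you need to import a file as a specific datatype and how choosing the incorrect one can affect your results. We continue with the topic of saving ndarrays to external files, where you’ll discover what .npy and .npz files are and when (and how) to export arrays in those formats. Finally, we provide you with a more conventional approach and showcase how to save arrays as text files.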

    np.loadtxt() vs np.genfromtxt()
    11 min
    Simple Cleaning when Importing
    7 min
    String vs Object vs Numbers
    7 min
    np.save()
    5 min
    np.savez()
    5 min
    np.savetxt()
    4 min
  • 7. Statistics with NumPy
    8 Lessons 42 Min

    This section revolves around NumPy’s capabilities to compute important characteristics or statistics from an array. These include minimal and maximal values, various forms of averages, covariances, and correlations, as well as histograms. In addition, you’ll learn about NaN-equivalent functions and how to use them.

    Using Statistical Functions in NumPy
    8 min
    Minimal and Maximal Values in NumPy
    6 min
    Statistical Order Functions in NumPy
    6 min
    Averages and Variance in NumPy
    4 min
    Covariance and Correlation in NumPy
    3 min
    Histograms in NumPy (Part 1)
    8 min
    Histograms in NumPy (Part 2)
    4 min
    NAN Equivalent Functions in NumPy
    3 min
  • 8. Data Manipulation with NumPy
    13 Lessons 95 Min

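    In this part of the NumPy course, we explore ways to clean and preprocess data in NumPy. You’ll understand how to find and fill missing values, reshape an array, and delete excess data, as well as sort, shuffle, and cast ndarrays. The section also explains what argument functions are and why they are so useful, and introduces ways to combine arrays by stacking and concatenating them. Finally, you’ll discover how to extract the unique values of an array and why this can be important for your analysis.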

    Checking for Missing Values in Ndarrays
    9 min
    Substituting Missing Values in Ndarrays
    8 min
    Reshaping Ndarrays
    7 min
    Removing Values from Ndarrays
    4 min
    Sorting Ndarrays
    10 min
    Argument Sort in NumPy
    6 min
    Argument Where in NumPy
    11 min
    Shuffling Ndarrays
    7 min
    Casting Ndarrays
    6 min
    Stripping Values from Ndarrays
    5 min
    Stacking Ndarrays
    11 min
    Concatenating Ndarrays
    6 min
    Finding Unique Values in Ndarrays
    5 min
  • 9. A Loan Data Practical Example with NumPy
    15 Lessons 88 Min

    In this practical example, we apply the techniques from the earlier sections to a loan data set. We set up by importing the data, checking for incomplete data, splitting the dataset, and creating checkpoints. We then manipulate the text data (issue date, loan status and term, grade and sub grade, verification status and URL, and state address) and the numeric data, substituting filler values and converting currency from USD to EUR, before completing the dataset.

    Setting Up: Introduction to the Practical Example
    5 min
    Setting Up: Importing the Data Set
    4 min
    Setting Up: Checking for Incomplete Data
    5 min
    Setting Up: Splitting the Dataset
    5 min
    Setting Up: Creating Checkpoints
    3 min
    Manipulating Text Data: Issue Date
    5 min
    Manipulating Text Data: Loan Status and Term
    7 min
    Manipulating Text Data: Grade and Sub Grade
    9 min
    Manipulating Text Data: Verification Status & URL
    5 min
    Manipulating Text Data: State Address
    6 min
    Manipulating Text Data: Converting Strings and Creating a Checkpoint
    3 min
    Manipulating Numeric Data: Substitute Filler Values
    8 min
    Manipulating Numeric Data: Currency Change – The Exchange Rate
    7 min
    Manipulating Numeric Data: Currency Change - From USD to EUR
    8 min
    Completing the Dataset:
    8 min
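
The import, cleaning, and export steps outlined in the curriculum above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the CSV text, the column names, and the choice to fill gaps with the column mean are hypothetical and do not come from the course's loan data set.

```python
import io
import numpy as np

# Hypothetical CSV content standing in for a real file; the second data
# row has a missing value in the second column.
raw = io.StringIO("amount,rate\n1000,5.5\n2000,\n1500,7.0\n1000,5.5\n")

# np.genfromtxt turns missing entries into NaN instead of raising an error.
loans = np.genfromtxt(raw, delimiter=",", skip_header=1)

# Check for missing values.
missing_count = np.isnan(loans).sum()

# Substitute each missing value with its column mean (NaNs ignored).
col_means = np.nanmean(loans, axis=0)
filled = np.where(np.isnan(loans), col_means, loans)

# Argument sort: the row order that sorts by the first column.
order = np.argsort(filled[:, 0])
sorted_loans = filled[order]

# Unique rows of the cleaned array.
unique_rows = np.unique(filled, axis=0)

# Export as text (to an in-memory buffer here rather than a file on disk).
out = io.StringIO()
np.savetxt(out, sorted_loans, delimiter=",", fmt="%.2f")
```

Filling with the column mean is only one possible substitution strategy; a fixed filler value or a column minimum or maximum may suit a given column better.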

Topics

Python, Programming, Data analysis, Data processing, NumPy

Tools & Technologies

python

Course Requirements

  • Highly recommended to take the Intro to Python course first
  • You will need to install the Anaconda package, which includes Jupyter Notebook

Who Should Take This Course?

Level of difficulty: Intermediate

  • Aspiring data analysts, data scientists, data engineers, AI engineers
  • Graduate students who need Python and NumPy for their studies

Exams and Certification

A 365 Data Science Course Certificate is an excellent addition to your LinkedIn profile—demonstrating your expertise and willingness to go the extra mile to accomplish your goals.

Meet Your Instructor

Viktor Mehandzhiyski

Data Scientist at

3 Courses

2562 Reviews

59130 Students

A Hamilton College graduate, Viktor has a strong analytics background, focusing on the fields of Statistics, Econometrics, Financial Time-Series Econometrics, and Behavioral Economics. Viktor’s coding experience is rather diverse – from working with C, C++, and Python through to the more math/econ-oriented MATLAB and STATA. He has been fascinated by coding algorithms since the age of 11 and describes himself as a “Bachelor of Science and overall cool guy”. We couldn’t agree more. Some of Viktor’s personal achievements include developing a model for forecasting transfer prices of soccer players across Europe’s top divisions and analyzing stock market indexes to study how contagion affects the effectiveness of international portfolio diversification.
