# Data Preprocessing with NumPy

68 lessons, 2 quizzes, 8 assignments

## Course description

This course is designed to show you how to work with one of Python’s fundamental packages – NumPy. You will learn what a “package” is and see how to install, upgrade and import it. By the time you finish the course, you’ll be comfortable with NumPy’s ndarray class, how to slice and reduce the dimensions of its instances, and how to quickly refer to the documentation. You’ll also be ready to take advantage of NumPy’s built-in functions and methods, which we’ll use to generate random and non-random data, import and export data to and from Python, find statistical values for a dataset, and clean and preprocess ndarrays.

## Introduction to NumPy

This introductory section presents the NumPy package and its applications. You’ll learn how to install and upgrade NumPy before getting a quick look at its most important asset, the array. We’ll also go over how to use the documentation, a resource we’ll rely on throughout the rest of the course.
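As a quick check once NumPy is installed, a minimal sketch like the one below imports the package under its conventional `np` alias, prints the installed version and builds a first array. The values and the variable name `first_array` are illustrative only.

```python
import numpy as np  # "np" is the conventional alias

# The installed version is exposed as a string attribute.
print(np.__version__)

# Build a one-dimensional array from a Python list (illustrative values).
first_array = np.array([1, 2, 3])
print(type(first_array))   # the ndarray class
print(first_array.shape)   # (3,)

# help(np.array) would open the built-in documentation for a function.
```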

## Why do we use NumPy?

This section follows NumPy’s role in the development of Python and takes a closer look at ndarrays. We discuss what makes them so useful and compare them to a similar-looking data structure – Python lists.
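The difference between the two structures shows up as soon as an operator is applied to them. The sketch below uses made-up values to contrast how lists and ndarrays respond to `+` and `*`, and how an array settles on a single datatype.

```python
import numpy as np

py_list = [1, 2, 3]
nd_array = np.array([1, 2, 3])

# "+" concatenates lists but adds arrays element by element.
print(py_list + py_list)    # [1, 2, 3, 1, 2, 3]
print(nd_array + nd_array)  # [2 4 6]

# "*" repeats a list but scales every element of an array.
print(py_list * 2)          # [1, 2, 3, 1, 2, 3]
print(nd_array * 2)         # [2 4 6]

# A list can mix types; an array stores a single dtype.
mixed_list = [1, 2.5, 3]
as_array = np.array(mixed_list)
print(as_array.dtype)       # float64
```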

## NumPy Fundamentals

Here, we focus on the basic NumPy syntax. You’ll learn about “indexing” and the different ways of assigning values to an array. This section also explains the elementwise properties of arrays, as we go over the different types of data we can store in them. In addition, we’ll take a look at some of the most important characteristics and properties of NumPy functions.
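A short sketch of these basics, using a small made-up array named `grid`: indexing single elements, assigning to an element, a row and a column, elementwise arithmetic, and choosing a datatype at creation time.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

# Indexing: row first, then column (both zero-based).
print(grid[0, 1])    # 2
print(grid[-1, -1])  # 6

# Assigning to a single element, a whole row, and a whole column.
grid[0, 0] = 10
grid[1] = [7, 8, 9]
grid[:, 2] = 0
print(grid)
# [[10  2  0]
#  [ 7  8  0]]

# Arithmetic is applied element by element.
print(grid + 1)
# [[11  3  1]
#  [ 8  9  1]]

# The dtype can be set explicitly when the array is created.
as_float = np.array([1, 2, 3], dtype=np.float64)
as_text = np.array([1, 2, 3], dtype=str)
print(as_float)  # [1. 2. 3.]
print(as_text)   # ['1' '2' '3']
```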

## Working with Arrays

This section explores the concept of slicing and how its many variations can be applied to ndarrays. You’ll grasp what “dimensions” are when it comes to arrays and learn how the “reduce” function and method work.
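The sketch below illustrates slicing and its effect on dimensions with a made-up 3×4 array. Which tool the course means by “reduce” is not stated here, so the last two parts are assumptions: `np.squeeze` (available as a function and as a method) and the ufunc method `np.add.reduce` are shown as two plausible candidates.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# Basic slices use start:stop:step, and stop is excluded.
print(matrix[:2, 1:3])
# [[2 3]
#  [6 7]]
print(matrix[::2, ::-1])   # every other row, columns reversed
# [[ 4  3  2  1]
#  [12 11 10  9]]

# An integer index drops a dimension; a slice of length one keeps it.
print(matrix[0, :].shape)    # (4,)
print(matrix[0:1, :].shape)  # (1, 4)

# Assumption: np.squeeze removes axes of length one,
# both as a function and as a method.
row = matrix[0:1, :]
print(np.squeeze(row).shape)  # (4,)
print(row.squeeze().shape)    # (4,)

# Assumption: "reduce" could also mean the ufunc method
# np.add.reduce, which collapses an axis by summing along it.
print(np.add.reduce(matrix, axis=0))  # [15 18 21 24]
```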

## Generating Data with NumPy

This part of the course explains how to generate arrays of random and non-random data. We begin by creating “empty” arrays, as well as basic arrays of 1s and 0s, before moving on to random generators. Then, we introduce NumPy’s ability to generate pseudo-random data drawn from a probability distribution. The section concludes with applications of pseudo-random data.
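A minimal sketch of these generators follows. The shapes, the fill value and the seed `365` are arbitrary choices made for illustration, and `np.random.default_rng` is used here as one way to create a random generator, not necessarily the one used in the course.

```python
import numpy as np

# np.empty allocates memory without initialising it, so its
# contents are arbitrary until values are assigned.
placeholder = np.empty(shape=(2, 3))

zeros = np.zeros(shape=(2, 3))
ones = np.ones(shape=(2, 3))
filled = np.full(shape=(2, 3), fill_value=7)

# Non-random sequences: evenly stepped integers.
steps = np.arange(0, 10, 2)   # [0 2 4 6 8]

# Pseudo-random data from a seeded generator (the seed is arbitrary).
rng = np.random.default_rng(seed=365)
uniform_draws = rng.random(size=5)          # floats in [0, 1)
dice_rolls = rng.integers(1, 7, size=10)    # integers from 1 to 6

# Draws from a probability distribution, here a standard normal.
normal_draws = rng.normal(loc=0.0, scale=1.0, size=(2, 3))
```

Reusing the same seed reproduces the same draws, which is what makes the data pseudo-random rather than truly random.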

## Importing and Saving Data with NumPy

In this section of the course, we focus on importing and exporting (also known as saving) data with the NumPy package. We discuss the differences between “np.loadtxt()” and “np.genfromtxt()” and their applications. We’ll examine NumPy’s ability to partially clean datasets as we import them. Later in the section, you’ll learn why you need to import a file as a specific datatype and how choosing the wrong one can affect your results. We continue with saving ndarrays to external files, where you’ll discover what .npy and .npz files are and when (and how) to export arrays in those formats. Finally, we cover a more conventional approach and show how to save arrays as text files.

## Statistics with NumPy

This section revolves around NumPy’s ability to compute important characteristics, or statistics, of an array. These include minimum and maximum values, various forms of averages, covariances and correlations, as well as histograms. You’ll also learn about the nan-equivalent functions and how to use them.

## Data Manipulation with NumPy

In this part of the NumPy course, we explore ways to clean and preprocess data in NumPy. You’ll learn how to find and fill missing values, reshape an array, delete excess data, and sort, shuffle and cast ndarrays. The section also explains what argument functions are and why they are so useful, and introduces ways of combining arrays by stacking and concatenating them. Finally, you’ll discover how to extract the unique values of an array and why this can be important for your analysis.

## A Loan Data Practical Example with NumPy

The final section works through a practical example on loan data with NumPy.
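Several of the steps named in the later sections (importing with `np.genfromtxt()`, nan-aware statistics, filling missing values, sorting with an argument function, casting, stacking, extracting unique values and saving to the .npy format) can be sketched on a tiny loan-style table. The records, the column layout and the `;` delimiter below are hypothetical, made up for illustration, and are not taken from the course dataset.

```python
import io

import numpy as np

# Hypothetical loan records; empty fields stand for missing values.
# Columns: id; amount; interest rate (%); term (months).
raw_text = """id;amount;rate;term
1;1000;13.5;36
2;2500;;60
3;1800;9.9;36
4;;11.2;60
"""

# Import: np.genfromtxt reads empty fields as nan.
loans = np.genfromtxt(io.StringIO(raw_text), delimiter=";", skip_header=1)
print(np.isnan(loans).sum())          # 2

# Statistics that ignore nan, computed per column.
column_means = np.nanmean(loans, axis=0)
print(np.nanmin(loans[:, 1]), np.nanmax(loans[:, 1]))   # 1000.0 2500.0

# Fill each missing value with the mean of its column.
cleaned = np.where(np.isnan(loans), column_means, loans)

# Argument functions return positions rather than values.
order = np.argsort(cleaned[:, 1])     # row order by ascending amount
sorted_loans = cleaned[order]

# Cast a column to integers and extract unique values.
loan_ids = cleaned[:, 0].astype(np.int64)
print(np.unique(cleaned[:, 3]))       # [36. 60.]

# Stack two arrays with the same number of columns.
stacked = np.vstack((sorted_loans[:2], sorted_loans[2:]))

# Save and reload in the .npy format; the in-memory buffer stands in
# for a file on disk.
buffer = io.BytesIO()
np.save(buffer, cleaned)
buffer.seek(0)
restored = np.load(buffer)
```

Filling with the column mean is only one possible choice; a sentinel value or the column median would follow the same `np.where` pattern.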