Web Scraping and API Fundamentals in Python
Web Scraping and API Fundamentals in Python offers an introduction to the techniques of data extraction from the web. In this course, you will learn how to use one of the most powerful tools on the Internet: APIs. We will also discuss in depth how to obtain information directly from websites using the BeautifulSoup Python package. There will be a short HTML crash course for those not familiar with it. Finally, we will introduce the Requests-HTML package in order to extract dynamically generated JavaScript content.
Section 1
Course Introduction
In this first section, we will discuss what the course covers and why you need to learn Web Scraping, and share some notes on the ethics of scraping.
Section 2
Setting Up the Environment
In this part of the course, we will explain how to set up Python 3 and then load up Jupyter. We'll also show you what the Anaconda Prompt is and how we use it to download and import new modules.
Section 3
Working with APIs
Here we will introduce what APIs are and how to use them. In order to do that, we will discuss the popular data exchange format JSON, as well as HTTP requests and the Python library used to submit them, "requests". At the end of the section, we will show you how to deal with an API that requires registration.
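As a rough illustration of the JSON side of this section, the sketch below decodes a payload of the kind an API might return. The payload, its field names, and the helper `parse_rates` are hypothetical and not taken from the course. It uses only the standard `json` module so that it runs offline.

```python
import json

# Hypothetical payload, standing in for the body of an API response.
raw_body = '{"base": "USD", "rates": {"EUR": 0.5, "GBP": 0.25}}'


def parse_rates(body):
    """Decode a JSON string and return its 'rates' mapping (assumed field name)."""
    data = json.loads(body)  # a JSON object becomes a Python dict
    return data["rates"]


rates = parse_rates(raw_body)
for currency, rate in sorted(rates.items()):
    print(currency, rate)
```

Against a live endpoint, the "requests" library can do the decoding step itself: calling `.json()` on a response object returns the same kind of Python structure.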
Section 4
HTML Overview
Web Scraping relies on extracting information from the source code of webpages. Thus, a general understanding of HTML is required. This section is a short crash course for those who are not familiar with HTML. It is meant as an intuitive look into the basics, not a comprehensive guide.
Section 5
Web Scraping with Beautiful Soup
After familiarizing ourselves with HTML, we are ready to delve into Web Scraping itself. We will now introduce the "Beautiful Soup" package and explore its capabilities.
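To give a flavor of what working with Beautiful Soup looks like, here is a minimal sketch that parses a small, made-up HTML snippet. The markup, class names, and links are assumptions for illustration only, and the example uses Python's built-in `html.parser` backend so no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, standing in for the source of a downloaded page.
html_doc = """
<html><body>
  <h1 id="title">Top Movies</h1>
  <ul>
    <li class="movie"><a href="/m/one">Movie One</a></li>
    <li class="movie"><a href="/m/two">Movie Two</a></li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Look up a single element by tag name and id attribute.
heading = soup.find("h1", id="title").get_text(strip=True)

# Select every link inside a list item with class "movie" via a CSS selector.
movies = [(a.get_text(strip=True), a["href"]) for a in soup.select("li.movie a")]

print(heading)
for title, link in movies:
    print(title, link)
```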
Section 6
Practical Project: Scraping Rotten Tomatoes
Now that we've seen what Beautiful Soup can do, we will devote this section to practicing our newly formed skills. We are going to obtain information about movies from a "Rotten Tomatoes" rank list.
Section 7
Scraping HTML Tables
In this short section, we will discuss an easy way to scrape HTML tables.
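The outline does not say which approach the course uses for tables, so the following is only one possible sketch: walking the rows of a made-up table with Beautiful Soup. The table contents and the helper `table_to_rows` are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical table markup, standing in for part of a scraped page.
table_html = """<table>
<tr><th>Rank</th><th>Title</th></tr>
<tr><td>1</td><td>Movie One</td></tr>
<tr><td>2</td><td>Movie Two</td></tr>
</table>"""


def table_to_rows(html):
    """Return the table as a list of rows, each a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # Header cells (th) and data cells (td) are collected alike.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


rows = table_to_rows(table_html)
for row in rows:
    print(row)
```

The first row holds the header cells, so the result can be split into column names and data before further processing.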
Section 8
Common Roadblocks when Scraping
Although we have done a decent amount of scraping so far in the course, the difficulties involved depend heavily on the website we choose, as different websites present their own specific problems. Thus, in this section, we will discuss the most common problems you will have to deal with and give you solutions and workarounds.
Section 9
The Requests-HTML Package
Here, we will introduce another Web Scraping package, "Requests-HTML". We are doing so because it has one big advantage over Beautiful Soup: the ability to execute JavaScript. This allows us to extract dynamically generated content, which is exactly what we will do.
Advanced Specialization
This course is part of Module 4 of the 365 Data Science Program. The complete training consists of four modules, each building upon your knowledge from the previous one. Module 4 is focused on developing a specialized, industry-relevant skill set, and students are encouraged to complete Modules 1, 2, and 3 before they start this part of the training. Here, you will learn how to perform Credit Risk Modeling for banks, Customer Analytics for retail or other commercial companies, and Time Series Analysis for finance and stock data.
Why Choose the 365 Data Science Program?
Practice
Real-life projects and data. Work through them on your own computer as you would in the office.
Q&A Hub
Our expert instructors are happy to help. Post a question and get a personal answer from one of our instructors.
Certificates
Earn a verifiable certificate after each completed course. Celebrate your successes and share your progress with your professional network!