Web Scraping and API Fundamentals in Python

Name: Web Scraping and API Fundamentals in Python Course
Price: 36 USD

with Andrew Treadway and Nikola Pulev

4.8/5 (532 reviews)

Master web data extraction with Python: Harness Web Scraping and APIs

4 hours of content, 13416 students

What you get:

4 hours of content
47 Interactive exercises
58 Downloadable resources
World-class instructor
Closed captions
Q&A support
Future course updates
Course exam
Certificate of achievement

$99.00

Lifetime access

What You Learn

Develop a solid foundation in API theory
Master web scraping fundamentals using Beautiful Soup to directly obtain website data
Get a comprehensive overview of HTML and how it can be used to develop web scraping tools
Acquire key technical skills in data extraction that are especially relevant in the age of AI
Become familiar with common issues and roadblocks when scraping web data
Scrape third-party data ethically and in compliance with legal standards

Top Choice of Leading Companies Worldwide

Industry leaders and professionals globally rely on this top-rated course to enhance their skills.

Course Description

Web Scraping and API Fundamentals in Python offers an introduction to the techniques of data extraction from the web. In this course, you will learn how to use one of the most powerful tools on the Internet – APIs. We will also discuss in depth how to obtain information directly from websites using the Beautiful Soup Python package. There will be a short HTML crash course for those not familiar with it. Finally, we will introduce the Requests-HTML package in order to extract dynamically generated JavaScript content.

Learn for Free

1.1 What does the course cover

4 min

1.2 What is Web Scraping

3 min

1.4 Ethics of Scraping

3 min

2.1 Setting up the Environment

1 min

2.2 Installing the Necessary Packages

1 min

3.1 API overview

3 min

Curriculum

1. Introduction to the course

3 Lessons 10 Min

In this first section, we will discuss what the course covers and why you need to learn web scraping, and give you some notes on the ethics of scraping.

What does the course cover
4 min
What is Web Scraping
3 min
Ethics of Scraping
3 min

2. Setting Up the Environment

2 Lessons 2 Min

In this part of the course, we will explain how to set up Python 3 and launch Jupyter. We’ll also show you what the Anaconda Prompt is and how we use it to download and install new packages.

Setting up the Environment
1 min
Installing the Necessary Packages
1 min

3. Working with APIs

15 Lessons 47 Min

Here we will introduce what APIs are and how to use them. To do that, we will discuss the popular data exchange format JSON, as well as HTTP requests and ‘requests’, the Python library used to submit them. At the end of the section, we will show you how to deal with an API that requires registration.
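As a rough illustration of the ideas in this section, the sketch below shows how parameters end up in a GET request URL and how a JSON response becomes a Python dictionary that a simple currency converter can use. It is a minimal offline sketch using only the standard library rather than the ‘requests’ package the course teaches; the endpoint `https://api.example.com/latest`, the parameter names, and the sample response are all made up for the example.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only to show how query parameters end up in the URL.
BASE_URL = "https://api.example.com/latest"


def build_get_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return f"{base_url}?{urlencode(params)}"


def convert(amount, target, payload):
    """Convert an amount from the payload's base currency to the target currency."""
    return round(amount * payload["rates"][target], 2)


# Made-up response body standing in for what a real API would send back.
sample_response = '{"base": "EUR", "rates": {"USD": 1.25, "GBP": 0.8}}'
payload = json.loads(sample_response)  # JSON text -> Python dict

url = build_get_url(BASE_URL, {"base": "EUR", "symbols": "USD,GBP"})
print(url)
print(convert(100, "USD", payload))
```

With the ‘requests’ library, the same two steps would typically be `requests.get(BASE_URL, params=...)` followed by `response.json()`.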

API overview
3 min
HTTP requests: GET and POST requests
3 min
JSON: preferred data exchange format for APIs
2 min
Exchange rates API: GETting a JSON response
5 min
Incorporating parameters in a GET request
3 min
Additional API functionalities
5 min
Creating a simple currency converter
5 min
iTunes API
5 min
Homework
1 min
Homework - 2
1 min
iTunes API: Structuring and exporting the data
2 min
GitHub API: Pagination
4 min
EDAMAM API: Initial setup and registration
3 min
EDAMAM API: Sending a POST request
4 min
Downloading files with Requests
1 min

4. HTML overview

8 Lessons 38 Min

Web Scraping relies on extracting information from the source code of webpages. Thus, a general understanding of HTML is required. This section is a short crash course for those who are not familiar with HTML. It is meant as an intuitive look into the basics, not a comprehensive guide.
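To make the idea of tags and attributes concrete, here is a small sketch that walks through a made-up page and lists every opening tag with its attributes. It uses Python's built-in `html.parser` module purely for illustration; the page content and the `TagLister` class are invented for the example and are not taken from the course.

```python
from html.parser import HTMLParser

# A made-up page, small enough to read at a glance.
PAGE = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1 id="main-title">Top movies</h1>
    <a href="/movie/1" class="movie">First movie</a>
    <a href="/movie/2" class="movie">Second movie</a>
  </body>
</html>
"""


class TagLister(HTMLParser):
    """Record every opening tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.tags.append((tag, dict(attrs)))


lister = TagLister()
lister.feed(PAGE)
for tag, attrs in lister.tags:
    print(tag, attrs)
```

The printed list mirrors the nesting order of the document: the `html` tag first, then `head` and `title`, then `body` and the elements inside it.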

What is HTML?
3 min
Structure of HTML
3 min
Syntax of HTML. Tags
6 min
Tag attributes
6 min
Popular tags
6 min
CSS and JavaScript
6 min
Character encoding
6 min
XHTML and code style
2 min

5. Web Scraping with Beautiful Soup

11 Lessons 50 Min

After familiarizing ourselves with HTML, we are ready to delve into web scraping itself. We will now introduce the “Beautiful Soup” package and explore its capabilities.
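A minimal sketch of what searching the HTML tree and extracting text and links with Beautiful Soup can look like is shown below. It assumes the `beautifulsoup4` package is installed and parses an invented HTML snippet instead of a live website; the tag names, classes, and movie titles are placeholders, not the pages used in the course.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page.
HTML_DOC = """
<html><body>
  <h1>Top movies</h1>
  <ul id="ranking">
    <li class="movie"><a href="/m/first">First Movie</a> <span class="year">(1999)</span></li>
    <li class="movie"><a href="/m/second">Second Movie</a> <span class="year">(2005)</span></li>
  </ul>
</body></html>
"""

# "html.parser" is Python's built-in parser, so no extra parser is needed.
soup = BeautifulSoup(HTML_DOC, "html.parser")

heading = soup.find("h1").get_text()

movies = []
for item in soup.find_all("li", class_="movie"):  # search by tag and attribute
    link = item.find("a")                          # nested tag inside each item
    year = item.find("span", class_="year").get_text(strip=True)
    movies.append({
        "title": link.get_text(strip=True),
        "url": link["href"],                       # attribute access
        "year": int(year.strip("()")),
    })

print(heading)
print(movies)
```

In a real scraper, the string passed to `BeautifulSoup` would come from the body of an HTTP response rather than a literal.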

Introduction to the Beautiful Soup package
2 min
Workflow of Web Scraping
6 min
Setting up your first scraper
3 min
Searching and navigating the HTML tree
7 min
Searching the HTML tree by attributes
4 min
Extracting data from the HTML tree
3 min
Extracting text from an HTML tag
5 min
Practical example: dealing with links
6 min
Homework BeautifulSoup Section 1
1 min
Extracting data from nested HTML tags
5 min
Scraping multiple pages automatically
8 min

6. Practical project: Scraping Rotten Tomatoes

7 Lessons 27 Min

Now that we’ve seen what Beautiful Soup can do, we will devote this section to practicing our newly acquired skills. We are going to obtain information about movies from a ‘Rotten Tomatoes’ rank list.

Setting up your scraper
4 min
Extracting the title and year of each movie
7 min
Homework BeautifulSoup Section 2 - Score
1 min
Extracting the rest of the information
6 min
Homework BeautifulSoup Section 2 - Rest
1 min
Dealing with the cast of the movies
5 min
Storing and exporting the data in a structured form
3 min

7. Scraping HTML tables

2 Lessons 7 Min

In this short section, we will discuss an easy way to scrape HTML tables.
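One way this can look, assuming the approach is pandas' `read_html` function (the lesson title only says "with the help of Pandas"), is sketched below. The table content is invented, and the code assumes pandas is installed together with an HTML parser it can use, such as lxml.

```python
from io import StringIO

import pandas as pd

# Made-up table standing in for one found on a real page.
TABLE_HTML = """
<table>
  <thead><tr><th>Game</th><th>Players</th></tr></thead>
  <tbody>
    <tr><td>Game A</td><td>1200</td></tr>
    <tr><td>Game B</td><td>950</td></tr>
  </tbody>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(StringIO(TABLE_HTML))
games = tables[0]

print(games)
print(games["Players"].sum())
```

Because every table on a page comes back as its own DataFrame, the usual next step is to pick the right one by position or by inspecting its columns.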

Scraping HTML tables with the help of Pandas
5 min
Scraping Steam
2 min

8. Common roadblocks when scraping

1 Lesson 13 Min

Although we have done a decent amount of scraping so far in the course, scraping depends heavily on the website we choose, and different websites present specific problems. In this section, we will discuss the most common problems you will have to deal with and give you solutions and workarounds.
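This page does not list the specific roadblocks the lesson covers. As one illustrative assumption, a common difficulty is a server that temporarily refuses connections, and a possible workaround is to wait and retry with a growing delay. The sketch below shows that idea; `fetch_with_retries` and `flaky_fetch` are hypothetical helpers written for this example, and the download function is simulated so the code runs offline.

```python
import time


def fetch_with_retries(fetch, url, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); on a connection error, wait and retry with a doubled delay."""
    delay = base_delay
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except ConnectionError:
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            sleep(delay)
            delay *= 2


# Stand-in for a real download function: fails twice, then succeeds.
calls = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise ConnectionError("temporarily refused")
    return "<html>ok</html>"


waits = []  # record the delays instead of actually sleeping
page = fetch_with_retries(flaky_fetch, "https://example.com/list", sleep=waits.append)
print(page, waits)
```

Passing the `sleep` function in as a parameter keeps the example fast to run; in real use the default `time.sleep` would pause between attempts.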

Common roadblocks when Web Scraping
13 min

9. The requests-html package

6 Lessons 26 Min

Here, we will introduce another web scraping package, ‘Requests-HTML’. It has one big advantage over Beautiful Soup: the ability to execute JavaScript. This allows us to extract dynamically generated content, which is exactly what we will do.

Introduction to the requests-html package
2 min
Exploring the capabilities of requests-html for Web Scraping
5 min
Searching for text
3 min
CSS selectors
9 min
Scraping SoundCloud
1 min
Scraping JavaScript
6 min

Topics

Python, Theory, Programming, Jupyter, APIs, Web Scraping, HTML, Beautifulsoup, Industry Specialization

Course Requirements

You need to complete an introduction to Python before taking this course
You will need to install the Anaconda package, which includes Jupyter Notebook

Who Should Take This Course?

Level of difficulty: Beginner

Aspiring data scientists, data analysts, data engineers, AI engineers
Current data scientists, data analysts, data engineers who want to learn how to extract data from the web

Exams and Certification

A 365 Data Science Course Certificate is an excellent addition to your LinkedIn profile, demonstrating your expertise and willingness to go the extra mile to accomplish your goals.

Meet Your Instructor

Andrew Treadway

Senior Machine Learning Engineer at

1 Course

532 Reviews

13416 Students

Andrew is an experienced data scientist at one of the FAANG companies. Previously, he worked as a data scientist and a statistical programmer/analyst for a few prominent corporations. While advancing in his career, Andrew has utilized Python, R, and SQL, among other tools, to drive business decisions, automate the renewal process, and build machine learning models that assess risk and streamline inefficiencies. He also managed consultants, mentored junior data scientists, and conducted internal training sessions on Python, R, and other technical topics. In his free time, Andrew enjoys leading meetup groups on bioinformatics and open-source programming, as much as he likes writing blog posts for TheAutomatic.net. His articles appear on R-bloggers, Interactive Brokers, and R Weekly.