Web Scraping and API Fundamentals in Python

Master web data extraction with Python: Harness Web Scraping and APIs

4 hours of content, 11811 students

What you get:

  • 4 hours of content
  • 47 Interactive exercises
  • 47 Downloadable resources
  • World-class instructor
  • Closed captions
  • Q&A support
  • Future course updates
  • Course exam
  • Certificate of achievement

What You Learn

  • Develop a solid foundation in API theory
  • Master web scraping fundamentals using Beautiful Soup to directly obtain website data
  • Get a comprehensive overview of HTML and how it can be used to develop web scraping tools
  • Acquire key technical skills in data extraction that are especially relevant in the age of AI
  • Become familiar with common issues and roadblocks when scraping web data
  • Scrape third-party data ethically and in compliance with legal standards

Top Choice of Leading Companies Worldwide

Industry leaders and professionals globally rely on this top-rated course to enhance their skills.

Course Description

Web Scraping and API Fundamentals in Python offers an introduction to the techniques of data extraction from the web. In this course, you will learn how to use one of the most powerful tools on the Internet – APIs. We will also discuss in depth how to obtain information directly from websites using the Beautiful Soup Python package. There will be a short HTML crash course for those not familiar with it. Finally, we will introduce the Requests-HTML package in order to extract dynamically generated JavaScript content.
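
As a rough illustration of the API side of the course, the sketch below builds a GET URL with query parameters and parses a JSON response body. It uses only the standard library so it runs without a network connection. The endpoint `https://api.example.com/latest`, the parameter names, and the response layout are hypothetical placeholders, not the API used in the lessons.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://api.example.com/latest"


def build_url(base_url, params):
    """Append URL-encoded query parameters, as a GET request with parameters does."""
    return f"{base_url}?{urlencode(params)}"


def convert(amount, target, rates):
    """Convert an amount in the base currency using a {currency_code: rate} mapping."""
    return round(amount * rates[target], 2)


# A canned JSON body standing in for a live API response (assumed layout).
sample_body = '{"base": "EUR", "rates": {"USD": 1.1, "GBP": 0.85}}'
data = json.loads(sample_body)

url = build_url(BASE_URL, {"base": "EUR", "symbols": "USD,GBP"})
usd = convert(100, "USD", data["rates"])

print(url)
print(usd)
```

With the `requests` package installed, a call such as `requests.get(BASE_URL, params={...}).json()` covers the URL building, the request, and the JSON parsing in one step.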

Learn for Free

  • 1.1 What does the course cover (4 min)
  • 1.2 What is Web Scraping (3 min)
  • 1.3 Ethics of Scraping (3 min)
  • 2.1 Setting up the Environment (1 min)
  • 2.2 Installing the Necessary Packages (1 min)
  • 3.1 API overview (3 min)

Curriculum

  • 1. Introduction to the course
    3 Lessons 10 Min

    In this first section, we will discuss what the course covers, explain why you need to learn Web Scraping, and give you some notes on the ethics of scraping.

    What does the course cover
    4 min
    What is Web Scraping
    3 min
    Ethics of Scraping
    3 min
  • 2. Setting Up the Environment
    2 Lessons 2 Min

    In this part of the course, we will explain how to set up Python 3 and launch Jupyter. We’ll also show you what the Anaconda Prompt is and how to use it to download and install new modules.

    Setting up the Environment
    1 min
    Installing the Necessary Packages
    1 min
  • 3. Working with APIs
    15 Lessons 47 Min

    Here we will introduce what APIs are and how to use them. To do that, we will discuss the popular data exchange format JSON, as well as HTTP requests and the Python library used to submit them – ‘requests’. At the end of the section, we will show you how to deal with an API that requires registration.

    API overview
    3 min
    HTTP requests: GET and POST requests
    3 min
    JSON: preferred data exchange format for APIs
    2 min
    Exchange rates API: GETting a JSON response
    5 min
    Incorporating parameters in a GET request
    3 min
    Additional API functionalities
    5 min
    Creating a simple currency converter
    5 min
    iTunes API
    5 min
    Homework
    1 min
    Homework - 2
    1 min
    iTunes API: Structuring and exporting the data
    2 min
    GitHub API: Pagination
    4 min
    EDAMAM API: Initial setup and registration
    3 min
    EDAMAM API: Sending a POST request
    4 min
    Downloading files with Requests
    1 min
  • 4. HTML overview
    8 Lessons 38 Min

    Web Scraping relies on extracting information from the source code of webpages, so a general understanding of HTML is required. This section is a short crash course for those who are not familiar with HTML. It is meant as an intuitive look into the basics, not a comprehensive guide.

    What is HTML?
    3 min
    Structure of HTML
    3 min
    Syntax of HTML. Tags
    6 min
    Tag attributes
    6 min
    Popular tags
    6 min
    CSS and JavaScript
    6 min
    Character encoding
    6 min
    XHTML and code style
    2 min
  • 5. Web Scraping with Beautiful Soup
    11 Lessons 50 Min

    After familiarizing ourselves with HTML, we are ready to delve into Web Scraping itself. We will now introduce the “Beautiful Soup” package and explore its capabilities.

    Introduction to the Beautiful Soup package
    2 min
    Workflow of Web Scraping
    6 min
    Setting up your first scraper
    3 min
    Searching and navigating the HTML tree
    7 min
    Searching the HTML tree by attributes
    4 min
    Extracting data from the HTML tree
    3 min
    Extracting text from an HTML tag
    5 min
    Practical example: dealing with links
    6 min
    Homework BeautifulSoup Section 1
    1 min
    Extracting data from nested HTML tags
    5 min
    Scraping multiple pages automatically
    8 min
  • 6. Practical project: Scraping Rotten Tomatoes
    7 Lessons 27 Min

    Now that we’ve seen what Beautiful Soup can do, we will devote this section to practicing our newly acquired skills. We are going to obtain information about movies from a ‘Rotten Tomatoes’ ranking list.

    Setting up your scraper
    4 min
    Extracting the title and year of each movie
    7 min
    Homework BeautifulSoup Section 2 - Score
    1 min
    Extracting the rest of the information
    6 min
    Dealing with the cast of the movies
    5 min
    Homework BeautifulSoup Section 2 - Rest
    1 min
    Storing and exporting the data in a structured form
    3 min
  • 7. Scraping HTML tables
    2 Lessons 7 Min

    In this short section, we will discuss an easy way to scrape HTML tables.

    Scraping Steam
    2 min
    Scraping HTML tables with the help of Pandas
    5 min
  • 8. Common roadblocks when scraping
    1 Lesson 13 Min

    Although we have done a decent amount of scraping so far in the course, how it goes depends very much on the website we choose, and different websites present specific problems. In this section, we will discuss the most common problems you will have to deal with and give you solutions and workarounds.

    Common roadblocks when Web Scraping
    13 min
  • 9. The requests-html package
    6 Lessons 26 Min

    Here, we will introduce another Web Scraping package – ‘Requests-HTML’. It has one big advantage over Beautiful Soup: the ability to execute JavaScript. This allows us to extract dynamically generated content, which is exactly what we will do.

    Introduction to the requests-html package
    2 min
    Exploring the capabilities of requests-html for Web Scraping
    5 min
    Searching for text
    3 min
    CSS selectors
    9 min
    Scraping JavaScript
    6 min
    Scraping SoundCloud
    1 min
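
To make the scraping sections of the curriculum above a little more concrete, here is a minimal sketch of searching an HTML tree for links and extracting their text and `href` attribute. It deliberately uses the standard-library `html.parser` module as a stand-in so that it runs without extra packages; it is not the Beautiful Soup code from the lessons. The class name `LinkCollector` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> tag that has an href attribute."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Keep text only while inside a link; nested tags such as <b> are included.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Made-up markup standing in for a downloaded page.
sample_html = """
<html><body>
  <h1>Movies</h1>
  <ul>
    <li><a href="/m/one">First <b>Movie</b></a></li>
    <li><a href="/m/two">Second Movie</a></li>
  </ul>
</body></html>
"""

parser = LinkCollector()
parser.feed(sample_html)
links = parser.links

for text, href in links:
    print(text, href)
```

With Beautiful Soup installed, the same search is typically much shorter: `BeautifulSoup(sample_html, "html.parser").find_all("a")` returns the link tags, and each tag exposes its text through `get_text()` and its attributes through `tag["href"]`.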

Topics

Python, Theory, Programming, Jupyter, APIs, Web Scraping, HTML, Beautiful Soup

Tools & Technologies

Python

Course Requirements

  • You need to complete an introduction to Python before taking this course
  • You will need to install the Anaconda package, which includes Jupyter Notebook

Who Should Take This Course?

Level of difficulty: Beginner

  • Aspiring data scientists, data analysts, data engineers, AI engineers
  • Current data scientists, data analysts, data engineers who want to learn how to extract data from the web

Exams and Certification

A 365 Data Science Course Certificate is an excellent addition to your LinkedIn profile—demonstrating your expertise and willingness to go the extra mile to accomplish your goals.

Meet Your Instructor

Andrew Treadway

Senior Machine Learning Engineer at

1 Course

420 Reviews

11811 Students

Andrew is an experienced data scientist at one of the FAANG companies. Previously, he worked as a data scientist and a statistical programmer/analyst for a few prominent corporations. While advancing in his career, Andrew has utilized Python, R, and SQL, among other tools, to drive business decisions, automate the renewal process, and build machine learning models that assess risk and streamline inefficiencies. He also managed consultants, mentored junior data scientists, and conducted internal training sessions on Python, R, and other technical topics. In his free time, Andrew enjoys leading meetup groups on bioinformatics and open-source programming, as much as he likes writing blog posts for TheAutomatic.net. His articles appear on R-bloggers, Interactive Brokers, and R Weekly.

365 Data Science Is Featured at

Our top-rated courses are trusted by businesses worldwide.