Tracking User Engagement with SQL, Excel, and Python Project

Comparing and Analyzing Student Engagement Between Q2 2021 and Q2 2022

Level: Advanced

With Hristina Hristova

Type: Career Track project

Duration: 25 Hours

Case Description

Background: Throughout this Tracking User Engagement with SQL, Excel, and Python project, you’ll work with a real dataset drawn from our company’s records. Your task is to analyze whether the new additions to the platform (new courses, exams, and career tracks) have increased student engagement.

You are given the following information:

  • Holder (student ID) and issuance date of certificates issued in Q2 2022
  • Student ID and registration date of students registered between January 1, 2020 and June 30, 2022
  • Student ID, product type, purchase date, and refund date (if applicable) of purchases made between January 1, 2020 and June 30, 2022
  • Student watching (student ID), time watched, and date of courses watched in Q2 2021 and Q2 2022

We have, of course, restricted the dataset volume and made sure to protect our customers’ privacy.

Hypothesis: The first half of 2022 was expected to be profitable for the company. The reason was an anticipated increase in student engagement following the release of several new features on the company’s website at the end of 2021. These include enrolling in career tracks and testing your knowledge through practice, course, and career track exams. We have also expanded our course library, both to increase user engagement and to grow the platform’s audience as more topics are covered. By comparing different metrics, we can measure the effectiveness of these new features and the overall engagement of our users.
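
As a rough illustration of what "comparing metrics" between the two periods can look like, the sketch below computes the relative change in average minutes watched. The numbers and the `percent_change` helper are hypothetical and serve only as an example; they are not taken from the project dataset, and the project itself may rely on different metrics and tools.

```python
from statistics import mean


def percent_change(before, after):
    """Relative change in the mean from `before` to `after`, in percent."""
    base = mean(before)
    return (mean(after) - base) / base * 100


# Hypothetical minutes watched per student (made-up values, not project data).
q2_2021 = [10.0, 20.0, 30.0]
q2_2022 = [15.0, 25.0, 35.0]

change = percent_change(q2_2021, q2_2022)
print(f"Mean minutes watched changed by {change:.1f}%")
```

A positive value would be consistent with the hypothesis, but a difference in means alone does not show that it is statistically significant; that question is what the hypothesis-testing part of the project addresses.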

Guidelines: Every data scientist has their preferred methodology, and two data scientists solving the same task may obtain the same result using different tools. Throughout this project, therefore, analyzing the data correctly and extracting meaningful results matters more than the approach you choose. Nevertheless, we provide optional guidance using the tools taught in the courses from the Data Science career track.

Project requirements

For this Tracking User Engagement with SQL, Excel, and Python project, you’ll be working with MySQL Workbench 8.0 (or later), Excel 2007 (or later), and Python 3. For Python, you’ll need to have the following libraries installed:

  • pandas
  • matplotlib
  • statsmodels
  • scikit-learn
  • seaborn (optional)
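
One way to confirm that these libraries are available before starting is a short check like the sketch below. The `missing_modules` helper is a hypothetical name introduced here for illustration; the only non-obvious detail is that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Top-level import names for the libraries listed above.
# Note: scikit-learn is imported as "sklearn".
REQUIRED = ["pandas", "matplotlib", "statsmodels", "sklearn"]
OPTIONAL = ["seaborn"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("Missing required libraries:", missing_modules(REQUIRED) or "none")
    print("Missing optional libraries:", missing_modules(OPTIONAL) or "none")
```

Anything reported as missing can typically be installed with pip or conda under its package name (for example, `scikit-learn` rather than `sklearn`).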

Project files

The file is an SQL database containing information on student purchases, activity, and certificate issuance.

Project content

  • 1 Project file
  • Guided and unguided instructions
  • Part 1: Data Preparation with SQL – Creating a View
  • Part 2: Data Preparation with SQL – Splitting Into Periods
  • Part 3: Data Preparation with SQL – Certificates Issued
  • Part 4: Data Preprocessing with Python – Removing Outliers
  • Part 5: Data Analysis with Excel – Hypothesis Testing
  • Part 6: Data Analysis with Excel – Correlation Coefficients
  • Part 7: Dependencies and Probabilities
  • Part 8: Data Prediction with Python
  • Quiz

Topics covered

  • Machine Learning
  • Relational Databases
  • Programming
  • Data Analysis
  • Mathematics
  • Data Preprocessing

