Last answered:

26 Jun 2025

Posted on:

25 Jun 2025

0

Project Question: Calculating Free to Paid Conversion Rate - Subquery

Hey team,

I'm working on the subquery for this one. Could you help me without telling me the answer please? It says I'm supposed to end up with 20,255 in my result set. But once I filter to only the students that have watched BEFORE purchasing, my set is way smaller: 2,286.

Any thoughts on what I'm doing wrong?

Here's what I wrote:

SELECT
    si.student_id,
    si.date_registered,
    MIN(se.date_watched) as first_date_watched,
    MIN(sp.date_purchased) as first_date_purchased,
    DATEDIFF(si.date_registered, MIN(se.date_watched)) as date_diff_reg_watch,
    DATEDIFF(MIN(se.date_watched), MIN(sp.date_purchased)) as date_diff_watch_purch
FROM
    student_info si
JOIN
    student_engagement se ON si.student_id = se.student_id
LEFT JOIN
    student_purchases sp ON si.student_id = sp.student_id
GROUP BY si.student_id
HAVING
    MIN(se.date_watched) <= MIN(sp.date_purchased);
 
2 answers ( 0 marked as helpful)
Posted on:

26 Jun 2025

0
This is the part that really throws me. I think I must be interpretting the objective wrong:

The resulting set you retrieve should include the student IDs of students entering the diagram’s shaded region. Additionally, your objective is to determine the conversion rate of students who have already watched a lecture. Therefore, filter your result dataset so that the date of first-time engagement comes before (or is equal to) the date of first-time purchase.

Sanity check: The number of records in the resulting set should be 20,255.

When I count the number of distinct students who have made a purchase in the students_purchased table, the result is 3,138. So if I attempt to filter my result dataset so that the date of first time engagement comes before the date of first time purchase, it implies that someone MUST have made a purchase to be included in that set. But since there are only 3,138 students who made a purchase, at max I can only have 3,138 results in that dataset.

So how do I end up with 20,255 records in result dataset as mentioned in the couse objective above? I don't understand. I don't want to be counting duplicate students who have made a purchase do I? I don't think that makes much sense for the objective of calculating a free to paid conversion rate. Wouldn't I only want to count the 3,138 distinct students who actually made a purchase?

Thanks for helping me understand where I'm going wrong.

-Justin
Posted on:

26 Jun 2025

0
Okay I figured it out. I needed to keep the rows in first_date_purchased if the value is NULL. I was filtering out those NULL values previously. If I keep the NULL values I get 20,255 results.

I'm not sure why I would want to keep the NULL values in first_date_purchased if I'm only concerned with calculating the conversion rate from engaged to purchased, but at least I got the number that i'm supossed to get.

Submit an answer