Contrary to what many business stakeholders believe, a data scientist isn’t a magician who can use AI to fix any problem. Their primary purpose isn’t to pull data from the database either.
So, what does a data scientist do?
There are plenty of tips online on becoming a data scientist, including the required skills and education and the expected salaries in the field. But there isn’t nearly as much content that describes what data scientists do on the job.
And the easiest way to understand that is with an example. So, we walk you through the daily tasks of a data scientist working at a fictional audiobook company.
Read on for the full case study.
What Does a Data Scientist Actually Do: Table of Contents
- Data Scientist Job Description
- What Does a Data Scientist Do: Case Study
- Role Overlap
- What Does a Data Scientist Do: Next Steps
Data Scientist Job Description
Before we dive into the example, we briefly explain what data science is used for in business settings and what skills are required to succeed in the field.
The main purpose of data scientists is to create business value through informed decision-making and effective product management. The role encompasses different activities, many of which we describe throughout the example below.
Therefore—besides solid statistics knowledge and an ability to use coding languages, such as Python, R, and SQL—data scientists need business and communication skills to be successful in their work.
But let’s get to the point. What do data scientists do on the job?
What Does a Data Scientist Do: Case Study
Audimax is a company that operates an audiobook platform. Its business model offers non-paying users a free trial with access to up to 30 minutes of content per day.
Ken, a digital marketing manager, approaches Greta, one of Audimax’s data scientists, with a request to build a machine learning model that helps convert more free users to paid subscribers.
Translating Business Goals Into a Data Strategy
To become a data scientist, you must learn how to translate business goals into a clearly defined data strategy.
Ken’s request is not a clearly defined problem, but data scientists often find themselves in situations like this. Luckily, Greta knows what to do.
There are many possible ways to improve the free-to-paid conversion rate of a company. She tries to find the right approach by talking to employees from several departments. This is where her communication skills come into play.
In meetings with the product manager and marketing and finance teams, Greta’s knowledge in these domains allows her to ask the right questions and reason with the teams. After forming a clear understanding of the matter, Greta and Ken reformulate the problem.
Ken’s benchmark data showed that the company underperforms its peers. Audimax’s free-to-paid conversion rate needs to grow from 5% to the industry standard of 20%.
Now, that’s a clearly defined data-driven goal. Great! What does a data scientist do next?
Finding Simpler Solutions
Data science work takes a lot of time, so experienced specialists usually consider new requests with healthy skepticism. Greta explained to Ken that it isn’t a great idea to jump straight into building a sophisticated machine learning model without first asking whether it is necessary.
It’s always preferable to consider more straightforward solutions first, especially if they yield similar results. To determine that, a data scientist must know the business well.
Greta analyzed the competition and noticed that other companies were doing time-sensitive price promotions. She found this to be the most straightforward yet promising tactic and decided to test it first.
She hypothesized that offering the product to free users at 50% off during the first five days after registering would help Audimax grow its free-to-paid conversion rate from 5% to 20%, which would double the firm’s monthly revenue. She asked the development team to implement an A/B test.
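The arithmetic behind the "double the revenue" expectation can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not stated in the article: the number of free users stays the same, and every converted user pays the same discounted price. The function name `relative_revenue` is hypothetical.

```python
def relative_revenue(base_rate, new_rate, discount):
    """Subscription revenue relative to the baseline.

    Assumes a constant number of free users and a single price that is
    reduced by `discount` for everyone who converts.
    """
    conversion_multiplier = new_rate / base_rate  # how many more users pay
    price_multiplier = 1 - discount               # how much each one pays
    return conversion_multiplier * price_multiplier


# Four times as many paying users (5% -> 20%), each paying half price.
print(relative_revenue(base_rate=0.05, new_rate=0.20, discount=0.50))
```

Under these assumptions the two effects multiply: four times the paying users at half the price gives about twice the baseline revenue.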
A/B testing is an integral part of the data scientist job description. It is an experimentation method that compares the results of two groups to provide evidence on which version performs better. In this case, it meant splitting the firm’s new free users into two cohorts: one that would see the promotion and one that would not.
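One common way to split users into two cohorts is to hash each user ID, so that the same user always lands in the same group. The article does not say how Audimax assigned users, so the sketch below is an assumption; the function name `assign_cohort`, the experiment label, and the ID format are all hypothetical.

```python
import hashlib


def assign_cohort(user_id, experiment="promo_test"):
    """Deterministically assign a user to 'promotion' or 'control'.

    Hashing the experiment name together with the user ID gives a stable,
    roughly 50/50 split without storing any assignment table.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "promotion" if int(digest, 16) % 2 == 0 else "control"


# The same ID always maps to the same cohort.
for uid in ["user-1", "user-2", "user-3"]:
    print(uid, assign_cohort(uid))
```

Including the experiment name in the hash means a later test would reshuffle users rather than reuse the same split.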
While working on the A/B test, Greta had to consider issues such as sample size, test duration, and statistical power. The results showed that the time-sensitive promotion produced a statistically significant effect.
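To make the sample-size and power question concrete, here is a minimal sketch of the standard normal-approximation formula for comparing two proportions. The significance level (5%), the power (80%), and the function name `sample_size_per_group` are assumptions chosen for illustration; the article does not report the values Greta used.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate users needed per cohort to detect p1 vs p2 (two-sided)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Baseline 5% conversion versus the hoped-for 20%.
print(sample_size_per_group(0.05, 0.20))
# A smaller lift is harder to detect and needs more users per cohort.
print(sample_size_per_group(0.05, 0.10))
```

Test duration then follows from the required sample size divided by the number of new free users who register per day.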
Revenue in the cohort that saw the promotion was 50% higher, and that group’s free-to-paid conversion rate jumped to 10%.
In comparison, the cohort that didn’t see the promotion stayed at a conversion rate of 5%. The null hypothesis that a time-sensitive promotion would have no effect was rejected.
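A two-proportion z-test is one common way to check whether a difference like 5% versus 10% is statistically significant. The article does not say which test Greta ran or how large the cohorts were, so the cohort size of 1,000 users each and the function name `two_proportion_z_test` are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both cohorts convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 50 of 1,000 control users and 100 of 1,000
# promotion users converted (5% versus 10%).
z, p_value = two_proportion_z_test(50, 1000, 100, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.5f}")
```

If the p-value falls below the chosen significance level, the null hypothesis of no effect is rejected.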
Greta and Ken agreed that this was a step in the right direction, but they still needed to improve the conversion rate further to catch up with their peers.
So, what is a data scientist’s next move?
Exploratory Data Analysis (EDA)
Next, Greta conducted an exploratory data analysis (EDA) to better understand customer behavior and what could unlock additional value.
It took her several weeks to review the database, understand variables, clean various tables, and analyze the relationship between variables through correlations and data visualization.
Finally, she was ready. Her work led to an interesting observation: free-plan users who listened for at least 60 minutes in their first three days were four times more likely to become paid customers.
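A comparison like this usually comes from segmenting users and computing a conversion rate per segment. The sketch below shows the idea on a handful of made-up records, constructed only to mirror the four-fold difference described above; the field names `minutes_first_3_days` and `converted` and the function `conversion_by_segment` are hypothetical, not Audimax's schema.

```python
from collections import defaultdict


def conversion_by_segment(users, threshold_minutes=60):
    """Conversion rate for users at or above the listening threshold vs below."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [converted, total]
    for user in users:
        engaged = user["minutes_first_3_days"] >= threshold_minutes
        segment = "engaged" if engaged else "not_engaged"
        counts[segment][0] += user["converted"]
        counts[segment][1] += 1
    return {seg: conv / total for seg, (conv, total) in counts.items()}


# Toy data, invented for illustration only.
toy_users = (
    [{"minutes_first_3_days": 75, "converted": 1}] * 2
    + [{"minutes_first_3_days": 65, "converted": 0}] * 3
    + [{"minutes_first_3_days": 20, "converted": 1}] * 1
    + [{"minutes_first_3_days": 10, "converted": 0}] * 9
)

rates = conversion_by_segment(toy_users)
print(rates)
print("lift:", rates["engaged"] / rates["not_engaged"])
```

A gap like this is a correlation, not proof that listening causes conversion, which is why the finding still had to be tested.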
Greta shared this finding with Ken and Patrick (Audimax’s product manager). They agreed to run a test that unlocked unlimited Premium content and exclusive features for free users for 24 hours after registering. They hypothesized that this would drive up user engagement, which in turn would lead to higher conversion.
The results from the A/B test were encouraging, so Audimax applied the changes to the entire population of free users. After a couple of months, Ken told Greta that the firm’s free-to-paid conversion ratio had grown to 15%. They were getting very close to the industry standard of 20%.
But they were not there yet.
Building ML Models
Simpler solutions had gotten Audimax’s team this far. As there were no other apparent “quick fixes,” perhaps they needed a more sophisticated solution to improve performance.
Finally, it was time for machine learning models—the essence of the data scientist job description.
One common topic that kept resurfacing in discussions with the product team was that some users found the product’s base price expensive, while others didn’t. There wasn’t an easy way to differentiate these users.
This led Greta to think that they could develop an ML algorithm that predicts the discount amount clients need to convert, then use that information to further increase conversion rates and revenue.
She began by gathering the necessary data and doing all the pre-processing needed to build an ML model: handling anomalies, missing values, and date ranges, normalizing the data, and determining how much data was required.
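Two of these pre-processing steps, filling in missing values and normalizing, can be sketched as follows. Median imputation and min-max scaling are common choices but are assumptions here, since the article does not name the techniques Greta used; the function names are hypothetical.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical listening minutes with one missing value.
minutes = [10, None, 30, 50]
cleaned = impute_median(minutes)
print(cleaned)
print(min_max_scale(cleaned))
```

In practice the median and the min/max are computed on the training data only and then reused on new data, so that no information leaks from the validation set.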
After that, Greta spent time on feature engineering. Then, she tested different ML algorithms to decide which would be optimal for her dataset.
For a month, she trained different types of ML algorithms and compared their levels of accuracy, precision, and recall (performance metrics that indicate how well models operate).
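For reference, here is how the three metrics are computed from a model's predictions. They apply to classification, so this sketch assumes a binary label (1 = the user converts, 0 = the user does not); the labels below are invented for illustration, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up ground truth and predictions for eight users.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(classification_metrics(y_true, y_pred))
```

Precision answers "of the users the model flagged, how many really converted?", while recall answers "of the users who converted, how many did the model flag?".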
After deciding which one to use, she spent time fine-tuning its hyperparameters and improving its performance. The validation of the model showed some promising results, and Greta put it into production.
Role Overlap
There’s a range of specialists working with data, and sometimes, the roles overlap. Since data scientists have a wide range of skills, it is common for them to fill in other similar positions.
In this case, because Audimax is a small firm and doesn’t have an ML ops engineer on the team, Greta had to work with developers to productionize the model. She created a REST API that serves predictions to the app quickly and in real time. This way, the user experience wouldn’t be affected.
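A prediction endpoint of this kind can be sketched with Python's standard library alone. Everything specific here is an assumption, since the article gives no details of Greta's API: the `/predict` path, the port, the `minutes_first_3_days` field, and above all `predict_discount`, which is a stand-in rule rather than a trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict_discount(features):
    """Stand-in for the trained model: NOT a real prediction."""
    minutes = features.get("minutes_first_3_days", 0)
    return 0.10 if minutes >= 60 else 0.30


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "Invalid JSON")
            return
        if not isinstance(features, dict):
            self.send_error(400, "Expected a JSON object")
            return
        body = json.dumps({"discount": predict_discount(features)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run_server(host="127.0.0.1", port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), PredictHandler).serve_forever()
```

Calling `run_server()` starts the service, and the app would then send a POST request with a JSON body to `/predict`. A production setup would typically use a web framework and load a serialized model once at startup instead of a hard-coded rule.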
Lastly, Greta set up an A/B test to verify that the model’s results held up in a live environment. After a couple of months, financial figures showed significant improvement, primarily driven by Greta’s efforts. And everyone lived happily ever after.
What Does a Data Scientist Do: Next Steps
We hope this example motivates you to learn more about this exciting profession and bring about changes to help your company thrive.
If you feel inspired to become a data scientist, 365 is the best place to launch your career. The beginner-friendly courses from top industry experts and structured career tracks take you from the fundamentals all the way up to advanced specialization. Sign up with a free account and try out our learning platform with a selection of video lessons and exercises.