Hi! My name is Tina Huang and I'm a data scientist at a FAANG company. I hold a Bachelor's degree from the University of Toronto, where I studied pharmacology. Subsequently, I worked in bioinformatics for a year and then did my Master's in computer science (MCIT) at the University of Pennsylvania. My experience also includes an internship at Goldman Sachs, where I did some machine learning work before I took up my current data science job in tech.
I am super excited to join the 365 Data Use Cases series, and in this post, I will share insights about my favorite data use case: product development.
What Is Product Development?
Product development is a huge and complex set of processes that rests on many moving parts, lots of hypothesis testing, and ultimately many decisions. The end goal is to create and grow a product that users love.
Think about some of your favorite apps today: Instagram, Uber, Facebook, and YouTube. Instagram has an infinite scroll, and Uber has different options such as UberXL and Uber Pool.
Nowadays, we take it for granted that these features exist, but the integrity of those products is, in fact, a labor of love. It takes countless decisions to get them right, many of which were far from straightforward at first. It’s the complexity of product development, with all of its moving parts, that makes data immensely powerful.
What Is the Product Development Process?
Data Needs in Product Development
In product development, we do lots of opportunity sizing, that is, estimating the potential impact of a feature before committing to it, to decide which features to build and what opportunities to invest in. This often involves analyzing similar products to compare how they performed. The data can be either structured (e.g., 1-to-5-star ratings) or unstructured (such as social media reviews).
We also do experiments to test out the features we build and see how users respond to them. The data here can also be structured or unstructured. Ideally, we also want lots of data because this will give us more confidence in our results.
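To make the point about data volume concrete, here is a minimal sketch of how the uncertainty around a measured conversion rate shrinks as the number of users in an experiment grows. The function name `conversion_ci` and all the numbers are hypothetical, and the interval uses the simple normal approximation, which is only one of several ways to compute it.

```python
import math


def conversion_ci(conversions: int, users: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for a conversion rate.

    Uses the normal approximation, which is reasonable when both
    conversions and non-conversions are not too few.
    """
    rate = conversions / users
    margin = z * math.sqrt(rate * (1 - rate) / users)
    return rate - margin, rate + margin


# Hypothetical experiment: the same 10% observed rate at three sample sizes.
for users in (100, 1_000, 10_000):
    low, high = conversion_ci(conversions=users // 10, users=users)
    print(f"n={users:>6}: plausible rate between {low:.3f} and {high:.3f}")
```

The observed rate is identical in all three cases, but the interval narrows as the sample grows, which is what "more confidence in our results" means in practice.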
Once we have the data, we begin to tease out the high-impact insights. Remember, it’s the problem that defines the tools we use! As a general rule of thumb, we start with the simplest approach and gradually increase the complexity.
Types of Analysis: Traditional, Machine Learning, NLP, and Deep Learning
First, we have traditional analysis. It involves lots of hypothesis testing and statistics, a topic on which 365 Data Science also has a wonderful course. Typically, it calls for a great deal of data heavy lifting and data visualization with SQL and Python or R.
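As an illustration of the kind of hypothesis test this involves, below is a minimal sketch of a two-proportion z-test comparing a control group with a variant that received a new feature. The function name and the conversion counts are made up for the example, and in a real analysis you would more likely reach for a statistics library than hand-roll the formula.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Two-sided z-test for a difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal CDF, written with erf.
    normal_cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Hypothetical A/B test: 200 of 2,000 control users convert vs. 260 of 2,000 variant users.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2_000, conv_b=260, n_b=2_000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the groups is unlikely to be due to chance alone, though the threshold you act on is a judgment call for the team.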
A lot can be done with traditional analysis, but machine learning has earned an extremely solid place in product development. In practice, we use machine learning in many different contexts. One of my favorites is using supervised learning techniques, such as random forest, SVM, and XGBoost, to discover which features contribute the most to the success of a product. These models are easy to implement and really helpful in deciding what to build and how much to invest in it.
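Here is a rough sketch of what that can look like with a random forest in scikit-learn. Everything about the data is an assumption made for illustration: the three usage signals, the synthetic "retained" label, and the effect sizes are invented, not taken from any real product.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_users = 2_000

# Hypothetical binary usage signals, one row per user.
feature_names = ["used_search", "enabled_notifications", "changed_theme"]
X = rng.integers(0, 2, size=(n_users, 3))

# Synthetic label: retention depends strongly on notifications,
# weakly on search, and not at all on the theme setting.
logit = -1.0 + 0.5 * X[:, 0] + 2.5 * X[:, 1]
retained = rng.random(n_users) < 1 / (1 + np.exp(-logit))

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, retained)

# Impurity-based importances: non-negative values that sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Keep in mind that importances like these describe what the model relies on, not what causes success, so they work best as a way to generate hypotheses that an experiment can then confirm.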
Natural Language Processing (NLP) is another technique that has a well-deserved position in product development.
Finally, unsupervised clustering and more complex techniques, such as deep learning, also have their place. All in the name of developing an amazing product!
Why Is Product Development Important?
Product development is, by far, my favorite data science use case for two main reasons.
First, I like it because of how powerful (almost magical) it is in driving decisions. Here, data really gets to shine, in that it’s both the source of truth and the driver of insights. You’ll be surprised how much value there is in a quick XGBoost model, for example.
Second, data-driven product development meshes very well with my own philosophy, commonly known as the 80/20 rule, where roughly 80% of the results come from about 20% of the inputs. In product development, this means that most of a product’s success is often determined by a small share of its features. Basically, you minimize effort and maximize outcomes. That’s exactly why choosing the right feature to build is so important!
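To show the arithmetic behind the 80/20 idea, here is a small sketch that finds the fewest features needed to cover a target share of total impact. The feature names and impact scores are entirely hypothetical (in practice they might come from experiments or model importances), and `pareto_cut` is just a helper written for this example.

```python
def pareto_cut(impact_by_feature: dict[str, float], target_share: float = 0.8) -> list[str]:
    """Return the highest-impact features that together reach the target share."""
    total = sum(impact_by_feature.values())
    selected: list[str] = []
    running = 0.0
    # Walk through features from highest to lowest impact.
    for name, impact in sorted(impact_by_feature.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(name)
        running += impact
        if running / total >= target_share:
            break
    return selected


# Hypothetical impact scores for ten features (they happen to sum to 100).
impact = {
    "feed": 52, "search": 30, "stories": 4, "themes": 3, "stickers": 3,
    "badges": 3, "widgets": 2, "avatars": 1, "sounds": 1, "fonts": 1,
}

top_features = pareto_cut(impact)
share_of_features = len(top_features) / len(impact)
print(top_features, f"({share_of_features:.0%} of all features)")
```

With these made-up numbers, two of the ten features account for more than 80% of the total impact, which is the pattern the rule describes.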
I hope you’ve enjoyed reading this article. If you'd like to learn more about data science, transitioning into computer science, and software engineering, you can also subscribe to my YouTube channel. And if you're new to data science, check out the 365 Intro to Data and Data Science course.