When I first decided on data science as a career path, I was unsure how I could land a job in the field. Most formal data science qualifications are at a Master’s level, and I felt that pursuing a postgraduate degree would be a waste of valuable time and resources. Moreover, at the time I had no idea how to build a portfolio of data science projects, nor did I appreciate how important one really was.
I did a ton of research and realized that if I wanted to become a data scientist, there were other options at my disposal such as online learning.
I enrolled in a couple of online data science courses and taught myself subjects like Python, SQL, machine learning, and data analysis.
Yet, my success in data science was far from certain.
There were many people just like me who wanted to break into data science through online learning. I remember seeing a popular Machine Learning course with over 3 million students at that time and wondering how I’d be able to set myself apart from them and get a job in the field.
I started questioning if the decision to self-learn data science was a mistake, and whether I should just pursue a Master’s in data science instead, to ensure a secure future in the industry.
I decided to give myself some time to learn everything I could in the field. Then, I would start applying to open data science listings and see if I’d be able to land a job. And if I couldn’t get a suitable position, I would consider pursuing a postgraduate degree.
Just two months later, I landed my first data science internship. This soon turned into a full-time position, and I’m currently working as a data consultant at the same company. Along the way, I’ve also received multiple job offers and taken on numerous freelance roles.
How did this all happen?
In those two months, I created a data science portfolio showcasing all my knowledge in the field.
Hiring managers need to have the confidence that you can do the job. Remember, you are competing with applicants who possess Master’s degrees.
You need to show recruiters that you have the necessary data skills, and you need to stand out from the hundreds of other applicants who have taken the same online courses as you.
The best way to do this is by building a data science portfolio. Create projects that put your skills into practice. Tell a story about each project. Push your code to GitHub. Share it with as many people as you can.
As you keep doing this, your portfolio will grow, which improves your chances of getting noticed by employers and landing your first job in the field.
In what follows, I will walk you through the steps you can take to create a data science portfolio that stands out. By the end of this article, you should have a solid grasp of the following:
- How to create data science projects that help your portfolio stand out
- The importance of creating a GitHub account to display your work
- How you can start writing articles in the data science domain
- How to showcase all your work with a beautiful portfolio website
Step 1: Create Original Data Science Portfolio Projects
Be Creative
As I mentioned earlier, creating projects that you can display on your resume is one of the best ways to demonstrate your data science skills.
However, when building a resume, make sure that you don’t only display basic data science projects. Popular Kaggle projects such as Titanic Survival Prediction or Iris Flower Classification are a great way to start teaching yourself data science, but won’t suffice to impress recruiters when applying for a data science position.
These projects are very popular among aspiring data science professionals, and many other applicants will have similar projects in their portfolios. As a result, it will look to hiring managers as though your knowledge of the field is superficial, making it difficult for you to cut through the noise. That’s why I’d advise you to create unique projects. Put out something that tells a story and captures the attention of potential employers. If you’re not sure where to begin, have a look at the many open-access datasets available on the Internet to get some ideas.
Examples of Creative Projects for Your Data Science Portfolio
One of the first projects I created was based on gender disparity in the media. I’d just watched a show that discussed the Bechdel Test. This is a test used to measure female representation in film: a movie passes only if it features at least two women who talk to each other about something other than a man.
I thought it would be interesting to perform an analysis on Hollywood movies and look at the differences between movies that passed the test and those that didn’t.
Were movies with higher female representation directed mostly by women? Did the genre of a movie affect whether it passed the Bechdel Test? Has the gender gap in cinema narrowed over time?
I used an IMDb review dataset to perform this analysis. I also collected external data to assess whether these movies passed the Bechdel Test.
This wasn’t a highly complicated project. All I had to do was collect external data, join two datasets, and create a couple of charts that answered the questions I had in mind.
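To make the "join two datasets" step concrete, here is a minimal sketch of what such a join could look like in pandas. The tables, column names, and values are made up for illustration and are not the data from the project described above; the only assumption is that both sources share a common movie identifier.

```python
import pandas as pd

# Hypothetical stand-in for a movie metadata table (not real data).
movies = pd.DataFrame({
    "imdb_id": ["tt001", "tt002", "tt003", "tt004"],
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "genre": ["Drama", "Action", "Drama", "Action"],
    "year": [1995, 2004, 2012, 2018],
})

# Hypothetical stand-in for externally collected Bechdel Test results.
bechdel = pd.DataFrame({
    "imdb_id": ["tt001", "tt002", "tt003", "tt005"],
    "passed": [True, False, True, True],
})

# An inner join keeps only the movies that appear in both sources.
merged = movies.merge(bechdel, on="imdb_id", how="inner")

# Share of movies that passed the test, per genre.
pass_rate_by_genre = merged.groupby("genre")["passed"].mean()
print(pass_rate_by_genre)
```

From a summary like this, a bar chart is one line away (for example `pass_rate_by_genre.plot.bar()`, which requires matplotlib), and the same grouping by `year` would speak to the question of change over time.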
Once I was done, I wrote a blog post on the project and linked it to my portfolio. Although this analysis was simple, it captured the attention of many readers. Interviewers have asked me about it because it tells a story.
When building any data science project, don’t create something that has already been done a hundred times before. If you want to stand out, come up with a unique idea and implement it.
For inspiration, you can read about 5 of the best data science portfolio projects I’ve worked on. Also, if you lack ideas on what to implement next, the 365 Data Science program contains a vast library of courses with practical examples and projects that you can add to your portfolio.
Showcase a Variety of Skills
Next, it is important to present different types of projects on your resume. Most data science positions require you to be skilled in multiple areas: data collection, analysis, machine learning, and visualization.
Here are a few areas that you should pay particular attention to when creating projects:
- Data collection, pre-processing, and analysis: As a data scientist, you will spend most of your time collecting and munging large amounts of data. Showcase your ability to extract external data with the help of tools such as BeautifulSoup and Python APIs. Use libraries like Pandas to clean and query dataframes. Here is a tutorial that will teach you how to scrape external data in Python, and if you want to go more in-depth with this topic, here’s a Data Pre-Processing with Pandas course I highly recommend.
- Solve a business problem: A data scientist’s strength lies in their ability to answer questions with the help of data. Very often, you will be tasked with helping companies solve business problems. A machine learning model is just a tool that helps you do this. Keep in mind that there is a clear distinction between ML, data analytics, and business analytics. Develop a project with a real-world business application. Here is a customer segmentation tool I built in Python that you can draw inspiration from. If you have data science experience but don’t know how to use it to add value to organizations, you can check out this Business Analytics course to get started.
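As a rough illustration of both areas, the sketch below cleans a small, messy table with pandas and then uses it to answer a simple business question: which customers are high-value? The data, the column names, and the `HIGH_VALUE_THRESHOLD` cut-off are all hypothetical, and a rule-based split like this is far simpler than a real customer segmentation tool.

```python
import pandas as pd

# Hypothetical raw order data with typical problems (not real data):
# stray whitespace, inconsistent casing, a duplicate order, a bad amount.
raw = pd.DataFrame({
    "customer": [" alice", "Bob ", "alice", "Bob ", "carol"],
    "amount": ["120.50", "80", "n/a", "80", "300"],
    "order_id": [1, 2, 3, 2, 4],
})

clean = (
    raw.drop_duplicates(subset="order_id")  # order 2 appears twice
       .assign(
           customer=lambda d: d["customer"].str.strip().str.title(),
           # Unparseable amounts such as "n/a" become NaN.
           amount=lambda d: pd.to_numeric(d["amount"], errors="coerce"),
       )
       .dropna(subset=["amount"])  # drop orders without a usable amount
)

# Business question: how much has each customer spent in total?
spend = clean.groupby("customer")["amount"].sum()

# Hypothetical cut-off, chosen only for illustration.
HIGH_VALUE_THRESHOLD = 100
segments = spend.apply(
    lambda total: "high" if total >= HIGH_VALUE_THRESHOLD else "standard"
)
print(segments)
```

In a portfolio project, the interesting part is usually the write-up around code like this: why the threshold was chosen, what the segments mean for the business, and what you would do with more data.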
Step 2: Build Your Data Science Portfolio
Once you complete 3–4 projects, assemble them in a portfolio to showcase your skills.
Share Your Code
Create a GitHub account and upload the code for each project there. Make sure your code is clean, and add comments to every section so it is easy to read.
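What "clean and commented" means is partly a matter of taste, but a small example may help. The function below is a hypothetical helper, not code from any of my projects; it shows one common convention: a descriptive name, a docstring that states what goes in and what comes out, and a comment only where the reason for a line is not obvious.

```python
def pass_rate(results):
    """Return the share of truthy values in `results` as a float.

    `results` is any iterable of booleans, for example whether each
    movie in a dataset passed some test. An empty input returns 0.0.
    """
    results = list(results)
    if not results:
        # Guard against division by zero on empty input.
        return 0.0
    return sum(bool(r) for r in results) / len(results)


print(pass_rate([True, False, True, True]))
```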
It is a good idea to have all your work in one place, so you can just add a single link to your resume for potential employers to review.
Write Blog Posts
I wrote a blog post about almost every project I created. I shared these articles on LinkedIn and added them to my resume.
Potential employers don’t always have time to read a bunch of code sitting in your GitHub repository. If you write about your projects and explain the steps taken to create them, hiring managers can easily skim through your articles to get the gist of the work you’ve done.
Also, as your posts gain traction and more people start sharing your work, your chances of getting noticed by recruiters and landing a job will increase.
The easiest place to publish data science articles is Medium.
All you need to do is create an account and start writing. Another option is to build a custom blog site from scratch. However, this can be expensive and difficult to set up when you are just starting out.
Create a Data Science Portfolio Site
Finally, display all of your work on a single webpage. This way, anyone who comes across your site can view all the projects you’ve worked on at a glance.
On my website, I added a simple introduction, an “About” section, and a “Projects” section where I explained the projects I’ve built. Finally, I shared links to my blog posts, code, and LinkedIn profile.
Data Science Portfolio Example
Here are some screenshots of different sections of my website:
- a) Summary of projects I’ve worked on at a glance
- b) “About Me” Section
- c) Project details, links to my code and articles
I built the website above from scratch using HTML and CSS. I used a tool called GitHub Pages to host my site.
However, as a data scientist, you aren’t expected to have a background in web design or front-end development. Instead of coding your portfolio site yourself, you can use tools like Wix or WordPress to showcase your work.
How to Build a Data Science Portfolio: Next Steps
It isn’t necessary to hold a formal qualification in the field to get a data science job. In fact, there are many popular data professionals who have managed to transition into high-paying jobs in the industry without any technical background whatsoever.
I work with managers and senior professionals who are self-taught and are experts at what they do. When they source candidates, they rarely put emphasis on an individual’s educational background. Rather, they look for driven applicants who are willing to put in the effort to learn continuously.
Before you start building a data science portfolio, you need to gain skills that are aligned with current market demands. Apart from machine learning models, it is vital for you to understand how business problems can be solved with the help of data.
You need to be able to collect, analyze, and present data to stakeholders in a clear and concise manner. Most beginner-level data science courses tend to place less emphasis on these areas and focus mostly on model-building. This is where 365 Data Science comes in.
The 365 Data Science Program offers self-paced courses led by renowned industry experts. Starting from the very basics all the way to advanced specialization, you will learn by doing a myriad of practical exercises and real-world business cases. If you want to see how the training works, start with our free lessons by signing up below.