How to Become a Data Scientist in 2020: Introduction
Data science has been one of the trendiest topics in the last couple of years. But what does it take to become a data scientist in 2020?
For the 3rd consecutive year, we have asked the data. Here is an interactive dashboard which summarizes the most exciting results from the past 3 years.
You can play around with our dashboard, slice and dice the data for the past 3 years.
If you wish to embed this dashboard to your website, please use the following code:
<iframe src="https://365datascience.com/research-1001-data-scientists-2018-2020/" width="1140" height="541.25" frameborder="0"></iframe>
If you prefer reading, though, please go ahead.
In a nutshell, here are the latest research results that we have found:
The typical data scientist is male, speaks at least one foreign language, and has 8.5 years of work experience behind them. They are likely to hold a Master’s degree or higher and most definitely use Python and/or R in their daily work.
But such generalizations are rarely helpful. Not only that, they could be misleading and sometimes discouraging. That is why we have sliced and diced the data to reveal a number of different insights:
- Previous experience
- Country and Degree
- Area of Studies
- Online courses and Degree
- Degree and Direct hires
- Years of experience
- Country and years of experience
- Coding languages
- Fortune 500 companies and coding language
- Country and coding language
Please use the list above to navigate through the article or simply read the whole piece. To give you the best perspective possible as we go through the different takeaways, we will also make comparisons to previous years’ surveys. If you first want to get acquainted with what it took to become a data scientist in 2018 and 2019, please follow these links:
How we collected and analyzed our data:
The data for this report is based on the publicly available information in the LinkedIn profiles of 1,001 professionals currently employed as data scientists. The sample includes junior, expert, and senior data scientists. To ensure comparability with previous years and limited bias, we collected our data according to several conditions.
40% of the data comprises data scientists currently employed in the United States; 30% are data scientists in the UK; 15% are currently in India; 15% come from a collection of various other countries (‘Other’).
50% of the sample are currently employed at a Fortune 500 company; the remaining 50% work in a non-ranked company.
These quotas were introduced in light of preliminary research into the most popular countries for data science, as well as the employment patterns in the industry.
Alright, without further ado…
How to Become a Data Scientist in 2020: Overview
For the third year in a row, the verdict is in.
There are twice as many male data scientists as there are female.
This trend, while unfortunate, is not really surprising as the field of data science follows the general trend in the tech industry.
In terms of languages spoken, a data scientist usually speaks two – English and one other (often their mother tongue).
When it comes to professional experience, we find that you can’t really become a data scientist overnight.
It takes 8.5 years of overall work experience. Interestingly, this is an increase of half a year compared to the data in 2019. Another interesting observation is that data scientists have held their prestigious title for an average of 3.5 years. Last year, that metric stood at 2.3 years. While our study is not based on panel data, we can make the claim that once you become a data scientist, you are likely to stay one.
Regarding programming languages, in 2018, 50% of data scientists were using Python or R.
This number increased to 73% in 2019 to completely break all records this year. In 2020, 90% of data scientists use Python or R. And no, you are not the only one who finds it amazing. Such a high adoption rate in such a short time period is an absolutely stunning feat for any tool in any industry ever.
Finally, your level of education will most definitely make a difference when trying to become a data scientist. About 80% of the cohort holds at least a Master’s degree. That amounts to a 6 percentage point increase from last year.
Previous experience
Each year, we look at the previous work experience of a data scientist. This part of the results proved to be the most useful for aspiring professionals, figuring out the common career paths to becoming a data scientist.
To reiterate, in 2020, data scientists had 3.5 years with the title and 8.5 years in the workforce on average.
But… what did the data scientist do before becoming a data scientist?
According to our sample, they… were already a data scientist! Or at least half of the cohort (52.4%). If we compare this value with previous years, there were 35.6% such cases in 2018 and 42% in 2019. So, year after year, the position becomes more and more exclusive – an observation we could infer from their average work experience.
This insight suggests that there aren’t too many career options after being a data scientist.
In other words – once a data scientist, always a data scientist. At least that’s the situation in 2020.
Regarding other relevant career paths, starting out as a data analyst is still the preferable path (11% overall), followed by academia (8.2%) and… Data science intern (7.0%). This breakdown is one of the most consistent segments of our yearly research since 2018. Hence, you can bet your data scientist career on it.
Education
Education is one of the 3 major sections of most resumes and that’s not likely to change. Educational background serves as a signal to your future employers, especially when you don’t have too much experience. So, what education gives the best signal if you want to become a data scientist?
According to our data, the typical data scientist in 2020 holds either a Master’s degree (56%), a Bachelor (13%), or a Ph.D. (27%) as their highest academic qualification.
These statistics might not seem counter-intuitive at first. However, there is actually a considerable drop in “Bachelor degree only” data scientists compared to 2019 (19%) and 2018 (15%). Data science requires an advanced level of expertise. And that’s typically acquired through graduate or postgraduate forms of traditional education, or through independent specialized study (see Online courses and Degree below).
But while specialization is important, too much specialization, such as a Ph.D., is not a prerequisite to breaking into data science. In fact, the percentage of PhD-holders has been unremarkably consistent over the years, constituting approximately 27% of our sample.
The Master’s degree, however, is solidifying its position as the gold standard of academic achievement necessary to become a data scientist in 2020.
We are observing an increase of 10 percentage points, or roughly 20% in relative terms, in the professionals who hold a Master’s degree compared to the 2019 cohort (46% in 2019 vs 56% in 2020).
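The Master’s figures above can be read in two ways: as a difference in percentage points or as a change relative to the 2019 share. Here is a minimal Python sketch of both readings. The function names are our own illustration and are not part of the study’s tooling; the only inputs are the two shares quoted above.

```python
def percentage_point_change(old_share: float, new_share: float) -> float:
    """Absolute difference between two shares, in percentage points."""
    return new_share - old_share


def relative_change(old_share: float, new_share: float) -> float:
    """Change measured relative to the old share, expressed in percent."""
    return (new_share - old_share) / old_share * 100


# Master's degree holders as quoted in the article: 46% in 2019, 56% in 2020.
masters_2019, masters_2020 = 46.0, 56.0

pp = percentage_point_change(masters_2019, masters_2020)
rel = relative_change(masters_2019, masters_2020)

print(f"Percentage-point change: {pp:.1f}")
print(f"Relative change: {rel:.1f}%")
```

The two numbers answer different questions, so it helps to say which one is meant whenever a growth figure is quoted.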
A Master’s degree is a great way for a Bachelor to specialize in a given field.
Generally, there are two types of Master’s degree choices:
- increasing your depth (dig deeper into a topic)
- or increasing your breadth (change your focus to diversify your skillset).
One assumption is that people with an Economics, Computer Science, or other quantitative Bachelor’s degree have pursued a trendy data science Master’s. This is further corroborated in our section on fields of study.
Arguably, there is another factor at play here as well, and this is the increased popularity of the field.
Industry reports like Glassdoor’s 50 Best Jobs consistently named Data Science the winner in 2016, 2017, 2018, and 2019.
Google searches for data science have at least quadrupled over the last five years as well. This certainly plays to the increased interest in data science as a career, and as a result, to a more selective hiring process in certain regions (see Country and years of experience below).
Finally, although data science is becoming a more competitive field, more than 10% of data scientists successfully penetrate the field with only a Bachelor’s degree (13%). It’s true the number is lower than what we’ve observed in the last two years (19% in 2019 and 15% in 2018). Nevertheless, data science remains accessible to Bachelor holders. In fact, if we look at country-specific data, a more nuanced picture emerges.
Country and Degree
As we stated in the Methodology section in the beginning of this article, we gathered our data according to location quotas; data scientists in the USA comprise 40% of our data, data scientists in the UK contribute to 30% of our observations; India and the rest of the world each comprise 15% of the 2020 cohort.
That said, the increase in data scientists holding a Master’s degree is widely observed in both the UK and the States (54% and 58%, respectively, compared to 44% in 2019).
In India, the number of data scientists holding a Master’s has also grown by 16% in 2020, compared to previous years (57% in 2020 vs 49% in 2019 and 2018).
Interestingly, this doesn’t correspond to a comparable decrease in data scientists who have an undergraduate degree in India (32% in 2020, compared to 34% in 2019), which is still the highest percentage of Bachelor-holders across our cohort. Both Ph.D. graduates and professionals holding degrees from our “Other” cluster are also seen less frequently in the current research than they were in previous years. As we mentioned above, it is plausible that a specialization with a “trendy” data science Master’s is becoming the preferred career path of many people in the field.
It’s also worth noting that you don’t need a Ph.D. to become a data scientist in India.
In fact, postgraduates with a Ph.D. make up only 3% of our data scientist sample in India; this is 30 percentage points lower than in the US data, and Ph.D. holders are the least represented group in India.
So, these data corroborate two tentative conclusions. Academically, a Master’s degree is establishing itself as the most popular degree for becoming a data scientist across the globe. And, if you are holding only a Bachelor’s degree, India provides the best career opportunities for starting a career in data science.
Area of studies
What is the best degree to become a data scientist? If you have followed the industry (or at least our research) over the past years, you would be inclined to respond with ‘Computer Science’ or ‘Statistics and Mathematics’. After all, data science is the lovechild of all these disciplines. But you would be mistaken.
In 2020, the best degree to become a data scientist is… Data Science and Analysis!
At long last – ‘Data Science and Analysis’ graduates have made their way to the top of our research!
Before we continue with this analysis, a note on methodology. Because there is a massive number of uniquely nuanced – and correspondingly named – degrees in the academic world, we grouped our data into seven clusters of areas of academic study:
- Computer science, which does not include machine learning;
- Data science and analysis, which includes machine learning;
- Statistics and mathematics, which includes statistics and mathematics-centered degrees;
- Engineering;
- Natural sciences, which includes physics, chemistry, and biology;
- Economics and social sciences, which includes studies pertaining to economics, finance, business, politics, psychology, philosophy, history, and marketing and management;
- Other, which includes all other degrees the data scientists in our sample pursued.
So, Data science and analysis is finally the degree that’s most likely to get you into data science. Awesome!
Compared to both 2019 (12%) and 2018 (13%), we’re seeing a significant increase in the professionals who’ve graduated with a data science specialized degree in 2020 (21%). Given our previous observations (see Education above), it doesn’t come as a surprise that the majority of these degrees are at Master’s level (85% of the Data science and analysis cluster). Therefore, it seems like data science is a preferred specialization for any quant Bachelor.
This finding suggests traditional universities are beginning to respond to the demand for data scientists. And, in line with that, offer curriculums that develop the data scientist skillset. Another marked trend is that the Data Science and Analysis degree is becoming the affirmed gateway degree into data science, especially if you’ve previously graduated from a different field.
Consider, for example, the top 3 degrees obtained by data scientists in 2019 and 2020.
2019:
- Computer Science (22%)
- Economics and social sciences (21%)
- Statistics and Mathematics (16%)
2020:
- Data Science and analysis (21%)
- Computer Science (18%)
- Statistics and Mathematics (16%)
Data Science and Analysis has obviously taken the lead from Computer Science.
What’s more, its appearance has completely removed Economics and social sciences from the top 3 ranking, even though this specialization was a close second in 2019.
Graduates from the Engineering, Natural Sciences, and Other fields constitute approximately 11% of our data each. And, we can say this hasn’t changed much compared to previous years.
Interestingly, the women in our sample most often earned a Statistics and Mathematics related degree (24% of the female cohort).
In comparison, men most likely earned a degree in Data Science and Analysis (22%), with Computer Science (19%) being a close second.
In general, data science is considerably well-balanced in terms of best degrees to enter the field.
You can become a data scientist if you have a quant or programming background… Or if you further specialize in Data Science and Analysis. And the way to do that is either through a traditional Master’s degree or by completing a bootcamp training or specialized online training programs.
How to Become a Data Scientist in 2020: Online courses and Degree
With data scientists coming from so many different backgrounds, we may wonder if their college degrees have proved sufficient for their work.
Even with no research, the answer is – no way. No single degree can prepare a person for a real job in data science.
Actually, data scientists are closer to ‘nerds’ than to ‘rock stars’ – it’s less about talent and more about hard work. Therefore, you can bet that they take their time to self-prepare. In our research, we have used the closest LinkedIn proxy available – certificates from online courses. Our data suggest that 41% of the data scientists have included an online course, which is practically the same as the past two years (40% in 2018 and 43% in 2019).
Note that not all people post all their certificates, so these results are actually understatements.
Degree and Direct hires
Can you become a data scientist right after graduation? While not unheard of, the data suggest that it is unlikely. Less than 1% of our cohort succeeded in becoming a data scientist without previous experience. And they either had a Ph.D. or a Master’s (80% of these men, and 100% of the women). A quarter of these direct hires also reported having received an online certification.
Something we found interesting is that the direct hires in our cohort almost completely mirror the profile of the typical data scientist in 2020 (see above).
That said, let’s discuss what kind of experience you need to become a data scientist, if you’re not in that lucky 1%.
Years of experience
The typical data scientist in 2020 has been working as a data scientist for at least a year already (70% of our cohort), with the highest number of data scientists being in their 3-5 years bracket (28%), followed by data scientists in the 2-3 years bracket (24%), and in their second year on the job (19%).
Data Scientists in their first year on the job constituted 13% of our 2020 data.
These are all interesting statistics, especially when considered in relation to 2019 and 2018 data. More specifically, we’re observing a nearly 50% decrease in the number of data scientists who are just starting out their careers in 2020 (13%), compared to data scientists starting out in 2019 and 2018 (25%). Given the increase in average experience as a data scientist, we can conclude that these professionals stay within the field, making it harder for junior people to enter.
The second interesting trend here is the increase in number of data scientists who are in their 3-5 and 2-3 years on the job, compared to the past two years.
In 2018, 25% of data scientists had more than 3 years of experience, whereas in 2020, this number is reaching 44%, constituting a 76% increase in this cohort. This indicates that data science experts and senior data scientists are staying in the field, rather than moving to some other industry.
Nonetheless, we mentioned that there are some important cross-country differences that invite further exploration. So, let’s consider these in more detail in the next section!
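The experience figures in this section can be cross-checked with a few lines of arithmetic. The sketch below is a rough consistency check, not the method used in the study. It rests on one assumption of ours: whatever share is not covered by the quoted brackets belongs to data scientists with more than 5 years on the job.

```python
# Shares of the 2020 cohort by time spent as a data scientist,
# as quoted in this section (in percent).
brackets_2020 = {
    "first year": 13,
    "second year": 19,
    "2-3 years": 24,
    "3-5 years": 28,
}

# Assumption (ours, not stated in the study): the remainder of the cohort
# has been on the job for more than 5 years.
over_five_years = 100 - sum(brackets_2020.values())

# Share with more than 3 years of experience under that assumption.
over_three_years_2020 = brackets_2020["3-5 years"] + over_five_years

# Relative growth versus the 2018 share of 25% quoted in the text.
over_three_years_2018 = 25
growth = (over_three_years_2020 - over_three_years_2018) / over_three_years_2018 * 100

# Relative drop in first-year data scientists (25% in earlier years, 13% in 2020).
first_year_drop = (25 - brackets_2020["first year"]) / 25 * 100

print(f"More than 5 years (implied): {over_five_years}%")
print(f"More than 3 years in 2020: {over_three_years_2020}%")
print(f"Growth of the 3+ years group since 2018: {growth:.0f}%")
print(f"Drop in first-year data scientists: {first_year_drop:.0f}%")
```

Under this assumption, the brackets add up to the 44% share with more than 3 years of experience, and the growth and drop figures line up with the 76% increase and the nearly 50% decrease mentioned above.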
Country and years of experience
A cross-country analysis of the on-the-job experience of the data scientist reveals a curious trend.
In terms of seniority, the data scientists in the US cohort were certainly the most experienced in our data.
More than 50% of the cohort were at least on their third year working as data scientists, with 20% on the job for more than 5 years. The US is the least friendly environment for career starters in data science. Only 8% of our US cohort was in their first year as data scientists, and 15% – in their second.
According to our data, the data science field in the UK is easier to penetrate.
11% of the UK sample were starting out their career as data scientists, whereas 20% were already in their second year on the job. Nonetheless, the largest represented group in the cohort were professionals in their third or fourth years on the job (29%).
If you’re looking for the country that offers the most opportunities to career starters, the data suggests that this is India.
More than 50% of our sample consisted of data scientists within their first or second year on the job. This is great news for someone who is just getting started with data science and wants to nurture their expertise into a career.
Of course, this data doesn’t come as a surprise, with some of the world’s largest companies opening offices in Bangalore and Hyderabad, including Amazon, Walmart, Oracle, IBM, and P&G.
The rest of the world, or our “Other” country cluster, shows a more balanced distribution of data science professionals regarding years of experience. A little less than 20% of the cohort are in their first or second year as data scientists, a little over 20% are in their third or fourth, and a quarter are in the 3-5 years bracket. That said, it’s worth mentioning that the largest players in our “Other” country cluster were Switzerland, the Netherlands, and Germany. Therefore, we can tentatively say that data science is becoming a more prominent field in Western Europe, and since the field is not yet flooded with data science talent, both junior and mid- to senior professionals are in demand.
Programming skills of a data scientist
When looking for programming language proficiency, we had to turn to the LinkedIn skills ‘currency’ – endorsements. While an imperfect source of information, they are a good proxy of what a person is good at. I would not be endorsed by my colleagues for Power BI, if I were mainly training ML algorithms, would I?
With this clarification out of the way, let’s dig into the data. Python dethroned R a year or so ago, so we won’t comment too much on this rivalry. Moreover, knowing that 90% of the data scientists use either Python, or R, we could completely close the topic here and move on.
But that would be a bit ignorant, especially towards SQL!
74% of the cohort “speaks” Python, 56% know R, and 51% use SQL. What’s especially noteworthy here is that SQL has grown in popularity by about 40% since 2019 (from 36% to 51%), making it a close third after R. Now, there are various factors that could contribute to this number.
One possible explanation is that companies don’t always understand the data scientist position well. This leads them to hire data scientists and overload them with data engineering tasks. For instance, the implementation of GDPR and the massive reorganization of data sources in data warehouses placed some data scientists in the unfavorable position to lead or consult on such projects. Inevitably, SQL had to be added to their toolbelt for the sake of ‘getting the job done’.
This phenomenon is getting more and more attention not only in the context of SQL, but also Big data structures related to database management. As a result, data scientists have acquired new skills at the expense of writing fewer machine learning algorithms.
Another important point in favor of SQL is that BI tools such as Tableau and Power BI are heavily dependent on it, thus increasing its adoption.
And that’s why SQL is going further up, even catching up with R. The programming languages picture is completed by MATLAB (20.9%), Java (16.5%), C/C++ (15.0%), and SAS (10.8%). Once again, LaTeX (8.3%) is also in the top 10.
Well, academia does not harm your chances to become a data scientist as we see from the background of our cohort.
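The Python and R shares in this section overlap, since many data scientists are endorsed for both. As a rough illustration, the inclusion-exclusion rule lets us estimate that overlap from the three figures quoted above (74% Python, 56% R, 90% Python or R). The function name is our own, and because the inputs are rounded survey shares, the result is an approximation rather than a figure reported by the study.

```python
def share_using_both(share_a: float, share_b: float, share_either: float) -> float:
    """Inclusion-exclusion: share(A and B) = share(A) + share(B) - share(A or B)."""
    return share_a + share_b - share_either


# Shares quoted in the article, in percent of the 2020 cohort.
python_share = 74.0
r_share = 56.0
python_or_r = 90.0

both = share_using_both(python_share, r_share, python_or_r)
python_only = python_share - both
r_only = r_share - both

print(f"Python and R: {both:.0f}%")
print(f"Python only: {python_only:.0f}%")
print(f"R only: {r_only:.0f}%")
```

Read this way, the two languages look less like rivals and more like complements: a large part of the cohort appears to be endorsed for both.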
F500 and coding language
We can’t stress enough how important Python and R are for the data science field in 2020. However, their strengths are also their flaws when it comes to big companies. Python and R are both open-source languages whose packages can be buggy or poorly documented, unlike well-established languages such as MATLAB or C.
And the data does indeed confirm this claim. Take Python for one – 70% of F500 data scientists employ Python against 77% of non-F500 data scientists. This sounds like unpleasant news, but in fact, it isn’t. Both Python and R have been closing the gap over the years. It seems like F500 companies are rethinking their organizations and are much more inclusive of the new technologies as compared to the data in 2018.
Apart from the different rate of employment of Python, the rest of the breakdown by coding languages remains uninterestingly consistent.
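Because the sample is split evenly between Fortune 500 and non-ranked companies, the two Python shares in this section should average out to roughly the overall Python share reported for the whole cohort (74%). The sketch below checks this with a weighted average. It assumes the 50/50 quota holds exactly, which is our simplification.

```python
# Quota weights from the methodology section: half F500, half non-ranked.
weights = {"F500": 0.5, "non-F500": 0.5}

# Python shares by company type, as quoted in this section (in percent).
python_share = {"F500": 70.0, "non-F500": 77.0}

# Weighted average across the two groups.
overall_python = sum(weights[group] * python_share[group] for group in weights)

# Overall Python share reported for the whole cohort.
reported_overall = 74.0

print(f"Implied overall Python share: {overall_python:.1f}%")
print(f"Gap to the reported share: {abs(overall_python - reported_overall):.1f} points")
```

The small gap is what we would expect from rounding in the published percentages.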
Country and coding language
In the past, your country of employment would dictate many of your life decisions – what language to learn, what rules to abide by, and what customs to respect or adopt. But does this apply to coding languages?
Since 2018, we have looked at the USA, the UK, India, and the ‘Rest of the world’. Our findings used to show that R was ‘winning the people’ over Python in the USA and India. On the other hand, the UK and the ‘Rest of the world’ were already slowly phasing out R in favor of Python.
Well, USA and India are no longer ‘lagging behind’ when it comes to Python adoption. In other words, Python is now king in all countries. Hence, your best bet at becoming a data scientist is to bend the knee and join the Pythonistas in their search for data-driven truth.
For the record, the breakdown by coding language is consistent across countries with R and Java taking the biggest hit from the Python supremacy in 2020. SQL remains unaffected and even gains a bit of traction as compared to previous years.
How to Become a Data Scientist in 2020: Conclusion
For a third consecutive year, 365 Data Science has researched the LinkedIn profiles of 1,001 current data scientists. And what a year it is!
This research reveals that the field is ever-evolving and adapting both to the needs of businesses and to its growing popularity in academia and beyond. Universities are catching up with the demand, while the Master’s is establishing itself as the gold standard degree.
Python continues to eat away at R, but SQL is on the rise, too!
India has earned the spot of best country for starting a career as a data scientist by demonstrating higher demand for junior data scientists than the US and the UK. It is also the place to be if you only have a Bachelor’s degree.
Of course, we are tremendously interested in how these trends will develop in the following 2-5 years. But in the meantime, let us know if you think we’ve missed anything of interest! We are on a mission to create an informative and ultimately helpful account of the data scientist job and how it changes with time. After all, making the best career decision for yourself means being informed!
So, stay curious, grow your programming skill set, and good luck in your data science career!