Becoming a successful data scientist is easier said than done, especially if you’re a newcomer taking your first adventurous steps into the field.
Sure, there are tons of super-helpful resources about the technical side of data science (which we utterly adore from the bottom of our geeky hearts). However, there’s another hefty challenge every aspiring data scientist needs to overcome: the lack of a clear understanding of what to expect.
This is why we reached out to three outstanding data scientists with very different backgrounds, industries, and expertise, and asked them for some practical advice. And, in the spirit of sharing expert opinion and wisdom with the data science community, they were happy to lift the veil for you on the following:
- what’s missing in data science college education;
- the top challenges you'll have to face in order to become a successful data scientist (and how to overcome them);
- the most valuable advice that will help you become a successful data scientist if you are just getting started or transferring from a completely different field.
We’re sure their remarkable insights will inform and inspire you, no matter where you are in your data science career preparation and development.
So, here’s what our data scientists have to say.
On what’s missing in data science college education…
“Generally speaking, good software engineering practice is lacking in schools. Sometimes, you work as a team in undergrad but the team comes together for the purpose of building one project, showing it to your professor, and then walking away with a grade. Whereas for real software projects that contribute to the world, you need to write code in a way that other people can understand.
That’s the biggest hiring principle that’s missing at universities and is necessary in the workforce. And in grad school, we missed out on this, too. I went to grad school where you’re writing code for yourself, you’re not writing code for other people to read. Even when you publish your code, you don’t really care that other people are going to read it. You just want to get it published.”
“There are three things I think that colleges can do to keep their candidates attractive and viable in a job market that will tighten at some point, and is already under assault from alternative routes. First, require more technical acumen in class assignments and projects to ensure students are applying what they’re learning, rather than engaging in a ‘check the box’-style engagement. Second, actively start prepping students for the internships and interviews that are so necessary today for good full-time positions upon graduation. And third, make a networking and communications class compulsory for the students.”
“Math and algorithms can be studied and learned. Attitude is harder to acknowledge and to change. I’ve worked with many junior data scientists so far. They’ve just completed a university degree or a specialization course in data science, and they think they are done and have nothing more to learn.”
On the challenges you need to overcome as a beginner to become a successful data scientist…
“Well, there are different types of beginner data scientists. But I’m going to imagine someone fresh out of school.
So, for someone fresh out of school, one of the challenges they face is writing code that will be read by a lot of people.
A second important one is being able to talk about your results with a business-side person. This kind of collaboration between business and tech doesn’t really happen in university. This is how the real world works, and yet, we’re not trained for this.
So, the ability to interface constructively with the other side and understand the problem of the business that is either losing you money or not making you enough money is crucial.
And the third thing is, often, in very large companies, there are a bunch of cultural norms that an outside person is not familiar with, like, what’s the first thing you do in a meeting at that company? Basically, all of these are dealing-with-people types of challenges. So, once you’ve got the technical training, you should reframe your mind, so that you can deal with other human beings constructively to build a productive enterprise, together. And that’s hard. It’s very hard to get large groups of people all rowing together in the same direction. And it’s even harder when you’re just joining and you have no idea of the cultural norms and the consequences of mistakes that are easy to make early on.”
“Technical skills are only one of the skill sets required to succeed as a data scientist.
Without communication skills, it will be difficult for the data scientist to rise up in the ranks and hold their own in leadership positions.”
“The first big challenge is in acknowledging that this job requires (and will keep requiring) continuous learning. There is no data scientist who knows it all. Or at least I have never met any. We all specialize in some specific techniques, data domains, or business cases. And even in what we know best, new optimization techniques, new loss functions, and new approaches often appear, and we need to learn them again. The university courses and the specialization courses all give us the capability of learning new techniques and applications quickly, but a big part of our job consists of continuous learning.
Another necessary change in attitude is about task organization.
A junior data scientist often thinks that their job is just to train machine learning models, which “automagically” generate fantastic insights and leave the customers in awe. Well, it is not that simple. Actually, training and applying one or more machine learning models is the easiest part. There are libraries in any tool exactly to do that. One node or one line of code will probably do the trick. The main challenge in a project comes before and after you train the model. Before that, you need to prepare the data so that they are clean and describe the problem accurately and informatively, making the model training much easier. After you train the model, you need to optimize it, as well as interpret and communicate the results.
All of these tasks are an integral part of a successful data science project and part of the data scientist’s job.
In addition, the data scientist can help to clean the data, for example. Not only manually but by creating and proposing more efficient automatic AI-based solutions. I insist on that because bad data produce bad results, no matter how smart the machine learning algorithm. So, cleaning data or presenting results is also an important part of a successful data scientist’s job. Finally, no AI-generated insight is admirable if it can’t be communicated properly.
I know most of us come from a science or computer programming background and might not be well-versed with words. However, communication of the final results is as important as the solution itself. Many data science solutions fail during the deployment phase, and one of the most common causes is the inability of the data scientists to effectively communicate the power of the achieved results (see blog post: “The Deployment Pain”). One of the skills junior data scientists often lack is communication, both in speaking and writing, and they will need to learn it and master it if they want their data science solutions to be successful.”
On the most important things to remember if you’re just starting out (or transferring into data science from a completely different field)…
"It depends a lot on the career, but the broadest useful advice is: leverage your domain knowledge.
Essentially, you want to tell a story and also create a narrative about yourself. And the narrative you create about yourself when you transition is not “Oh, I’m changing everything about myself.” It’s more like, “No. I’m moving away to even further increase the value of the experience that I already have.” And too many people who want to make that transition are in a way embarrassed. But that’s the wrong way to look at it. You have to look at it as “I used to be this and I was awesome at it. This makes me even more awesome. Because now I have this amazing skill set and knowledge and also all of these tools available to me that make me even more of that.”
And one last thing. The only certain way to fail is to give up. And I’ve seen people that I’m sure are definitely getting a job. But for various reasons they don’t believe in themselves. And, sometimes life happens when you have to just make ends meet – that’s the way life is… But usually, success is just one notch above the level where you’re discouraged enough to give up. So, yes, there’s a wall but you can break through the wall if you push further."
“No matter where you go or what you do (with very little exception), you’re going to be dealing with people. People interact with technology in different ways and need technology for different things. You can learn the most valuable lessons only by talking to different people about their needs. As technologists, we tend to stake our claims on solutions rather than problems. And we’re still very driven by features and automation. But the most valuable lesson I learned is to start from the problem and ask myself some cold, hard questions. What is the simplest possible solution to this problem and why isn’t it enough? Is it possible that I’m biased towards solution X or Y because I want X or Y to succeed as opposed to just solving the problem in the most efficient way possible?”
“Learn the math behind the algorithms, not just how to apply them in a script.
Throughout your career, you will need to learn new tools. However, learning a new tool will be easier if you have a solid grasp of the math behind your data processing techniques.
Acknowledge that in this field you will never stop learning. This is both the good and the bad side of working as a data scientist. On one hand, your brain will never stop acquiring new concepts; on the other, you’ll need to keep investing time to acquire them.
Take every new project as a great chance to learn something new, whether from a book, a colleague, experience, or something else. You never know where the next piece of new knowledge will come from!”
In Conclusion: The Path to Becoming a Successful Data Scientist
Data science is challenging. But it’s also a super rewarding field to start a career in! And sometimes, to become a successful data scientist, all you need is a little bit of extra motivation to keep you going. So, whenever the data science learning curve gets a bit steeper, we hope you’ll refer to this article for a daily dose of data science inspiration.
Special thanks to:
Edouard Harris is a successful data scientist and the co-founder of SharpestMinds, a machine learning mentorship program based on income share. He was a physicist for about 10 years before he transitioned into data science.
Mayank Kejriwal is a research assistant professor at the University of Southern California’s Department of Industrial and Systems Engineering. He is also a research lead at the USC Information Sciences Institute (ISI) and works on projects for the initiative AI for Social Good.
Rosaria Silipo is currently a Principal Data Scientist at KNIME. She earned her doctorate in biomedical engineering in 1996. Rosaria has been working in the field of data analytics for 25 years. She’s also the author of the “Practicing Data Science” ebook and the KNIME e-learning course on data science.