30 Data Scientist Interview Questions and Answers

The 365 Team 11 Apr 2024 11 min read

Landing an awesome data scientist job isn’t just the luck of the draw. Above all, it’s a matter of preparation. But even if you’re an aspiring data scientist who’s super dedicated to the task, you might find yourself struggling in the process. Why? The reasons are two-fold:

First, the data scientist interview format can vary greatly depending on the company you apply to.

Second, data scientist interview questions cover a wide scope of multidisciplinary topics. That means you can never be quite sure what challenges the interviewer(s) might send your way.

So, we pulled our data science brains together, got in touch with recent hires and interviewers, and compiled a punchy interview guide. First, we’ll discuss the best possible preparation in terms of data science skills and qualifications. Then, we’ll list the data scientist interview questions you’re most likely to get (with answers). Finally, we’ll let you in on the specifics of the data scientist interview process in 3 major companies.

How to Prepare for the Data Scientist Interview?

Regardless of the company and business field, you can’t possibly answer data scientist interview questions without the right knowledge and technical skills.

Speaking of preparation, if you’re ready to start your career in data science, but you need to improve your skillset, you can register for the complete Data Science Program today. Start with the fundamentals with our Statistics, Maths, and Excel courses, and build up step-by-step experience with SQL, Python, R, Power BI, Tableau, and more.

What Data Scientist Interview Questions Can You Anticipate?

Being familiar with the type of data scientist interview questions you can encounter is an important aspect of your preparation process. Below you’ll find examples of real-life data scientist interview questions and answers. Reviewing those should help you assess the areas you’re confident in and where you should invest additional efforts to improve.

General Data Scientist Interview Questions

Here are a few examples of warm-up data scientist interview questions that will get you ready for the more in-depth inquiries ahead:

1. Tell me about yourself.

This will probably be the very first question of the interview. It is a very generic question, and it is tougher than it sounds. You need to avoid telling the story of your life, but you don’t want to pause after three sentences either. Given that it is the opening question of the interview, your answer becomes even more important, as it sets the tone for the rest of the conversation. The Hiring Manager wants to see if you can give a well-structured answer to a very broad question.

The secret for responding well to this question is scripting and practicing before every interview. What should you include in your response?

  • Tell the interviewer only the facts that you want him/her to know.
  • Give a hint about your personal life in one or two sentences. For example: “I was born and raised in the UK.” Or: “I moved to New York because it is a vibrant city and I like the dynamic environment.”
  • Show that you are perfect for the job under consideration, that you have the right education, and that your previous work experience will be a valuable asset to the firm.
  • Conclude by explaining why you are excited about this possibility and how your strengths match the profile that the company is looking for.

Prepare a script that addresses each of the points above and practice answering the “Tell me about yourself” question, as you know it’s coming your way once an interview starts.

2. What relevant work experience do you have?

A straightforward question, which leaves little space for maneuvering. Make sure that you prepare well before the interview. Carefully study the job description and identify how your work experience is going to be useful in handling the responsibilities at this new position. Try to be specific and point out the activities that you learned to do in your previous jobs. Explain how they would allow you to perform well at this new position.

3. Where do you see yourself in 5 years?

A potentially dangerous question. The interviewer wants to know if the company can count on you in the long run - whether you are looking for a job to tide you over or for a career. Besides hiring someone that is qualified and skilled, most firms want to choose a person that believes in a future with the company. They don’t want to invest a great deal of time and money in order to recruit and train someone who will leave in two years.

The hiring manager wants to understand exactly how you think.

Perhaps you intend to gain one or two years of practical work experience and join your family’s business. Maybe you want to start your own company, or maybe you believe that one or two years at this job would allow you to pursue much more interesting opportunities with other companies.

This is why you need to be prepared and have a good answer in mind.

Instead of replying where you will be in 5 years, which is kind of dangerous for the above-mentioned reasons, you can talk about exactly what you would like to learn in the next five years. You can say that you want to become very good at what you do; gain hands-on practical experience in managing people; and that you always wanted to become a technical expert in the field for which you are interviewing. As a closing statement, you can add that you are excited about this opportunity because you believe that it is a step in the right direction and would allow you to achieve your goals.

By spinning the question in this direction, you are able to achieve three things. First, you protect yourself from answering a potentially dangerous question. Second, you will be able to emphasize that the main driver in your career is professional growth and self-improvement. And third – you are able to affirm that you are excited about the job opportunity. Sounds good, right?

Similar versions of this question are “What do you want to achieve in your career?”, “Describe your ideal job”, “What are your long-term career goals?” The same logic applies to all of these data scientist interview questions too.

4. How are missing values and impossible values represented in R?

One of the main issues when working with real data is handling missing values. These are represented by NA in R. Impossible values (0 divided by 0, for example) are represented by NaN (not a number).

5. What is an example of a dataset with a non-Gaussian distribution?

First, it may make sense to review what a Gaussian distribution is. It is also known as the ‘Normal distribution’ or ‘The Bell Curve’.

Once you are sure that you know what a Gaussian distribution is, we can proceed to the question at hand.

For a distribution to be non-Gaussian, it shouldn’t follow the normal distribution. One of the main characteristics of the normal distribution is that it is symmetric around the mean, the median and the mode, which all fall on one point. Therefore, all we have to do is select a distribution that is not symmetrical, and we will have our counterexample.

One of the popular non-Gaussian examples is the distribution of household income in the USA:

[Figure: distribution of household income in the USA]

You can see where the 50th percentile line (the median) is, but that is not where the mean is. While the graph is from 2014, this pattern of inequality still persists and even deepens in the United States. As such, household income in the US is one of the most commonly quoted non-Gaussian distributions in the world.
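The gap between the mean and the median in a skewed distribution can be sketched in a few lines of Python. The numbers below are made up (drawn from a log-normal distribution as a stand-in for right-skewed incomes); they are not real household income data.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed "incomes": most values are modest, a few are very large.
incomes = [rng.lognormvariate(11, 0.8) for _ in range(10_000)]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)  # the 50th percentile

# In a symmetric (e.g. Gaussian) distribution these two would roughly coincide.
print(f"mean:   {mean_income:,.0f}")
print(f"median: {median_income:,.0f}")
print("mean above median:", mean_income > median_income)
```

A long right tail pulls the mean above the median, which is the pattern described for household income.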

Technical Data Scientist Interview Questions

Statistics, programming, machine learning – those are all bound to come up at a certain point in the data scientist interview process. Here’s a list of technical data scientist interview questions you can use for practice.

6. How do you explain Random Forest to a non-technical person?

Random Forest is a machine learning algorithm, most commonly used for classification. In that setting, its main purpose is to match a specific observation with its most likely outcome (class).

An important defining characteristic of a random forest is that it is simply a collection of decision trees. There are many terms involved, but in fact, the concept is rather simple and could be easily illustrated with an example. Let’s say you want to create a meeting. A decision tree for that meeting may be:

  • Monday
    • No
    • Yes
      • 1PM to 2PM
        • No
        • Yes
          • Room 160
            • No
            • Yes
          • 3PM to 4PM
            • No
            • Yes
              • Room 155
                • Yes
                • No

And so on...

Based on this tree, we would normally estimate probabilities to have the meeting in one place or another.

The main issue is that this is a very bad classifier. However, combining many such trees we reach a random forest. The underlying assumption is that many bad classifiers equal a good classifier. Each tree makes a prediction (which observation to put in what class) and then the class with the most “votes” across all trees will be our random forest prediction.
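The voting idea can be illustrated with a small simulation. This is only a sketch under a strong assumption: each hypothetical "tree" votes independently and is right 60% of the time. Real trees in a forest are correlated, so the gain is smaller in practice. The function names are illustrative, not part of any library.

```python
import random
from collections import Counter

def majority_vote(predictions):
    """Return the class with the most votes among the individual predictions."""
    return Counter(predictions).most_common(1)[0][0]

def simulate(n_trees, p_correct, n_trials=2000, seed=0):
    """Estimate how often a majority vote of n_trees weak classifiers is right."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # The true class is "yes"; each tree independently votes correctly
        # with probability p_correct (an assumption made for this sketch).
        votes = ["yes" if rng.random() < p_correct else "no" for _ in range(n_trees)]
        if majority_vote(votes) == "yes":
            hits += 1
    return hits / n_trials

single_tree_accuracy = simulate(n_trees=1, p_correct=0.6)
forest_accuracy = simulate(n_trees=101, p_correct=0.6)
print("one tree: ", single_tree_accuracy)
print("101 trees:", forest_accuracy)
```

An odd number of trees is used so that a two-class vote can never end in a tie.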

7. What's wrong with training and testing a machine learning model on the same data?

This is one of the more common data scientist interview questions. When we are training a model, we are exposing it to the ‘training data’. This means it is learning the patterns from it. By the end of the training, it becomes very good at predicting this particular dataset. However, sometimes we may overfit. This is a situation where we keep improving the accuracy, but not because the model is good, but just because it has learned every little detail about the data it is given.

If we test on that data, we will be checking the accuracy of the training. This is not a test per se. That’s simply a ‘train accuracy check’. Our model will seem to be very accurate and working properly. But that is because we trained it on that same data. We are essentially asking the model to predict what was already predicted, which is not a hard task.

To truly test a model, we must expose it to data it has never seen before. This will reveal if it learned the patterns of the population, or simply the noise in the training data.
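A minimal sketch of the problem, assuming made-up noisy data and a deliberately naive "model" that memorizes the training set (a 1-nearest-neighbor lookup). All names here are hypothetical.

```python
import random

rng = random.Random(0)

def make_data(n):
    """Hypothetical noisy data: label is 1 when x > 0.5, but 20% of labels are flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.2:
            label = 1 - label  # label noise
        data.append((x, label))
    return data

def predict(train, x):
    """Return the label of the closest training point (pure memorization)."""
    _, nearest_label = min(train, key=lambda point: abs(point[0] - x))
    return nearest_label

def accuracy(train, data):
    return sum(predict(train, x) == y for x, y in data) / len(data)

train, test = make_data(300), make_data(300)
train_accuracy = accuracy(train, train)  # evaluated on the data it memorized
test_accuracy = accuracy(train, test)    # evaluated on data it has never seen
print("train accuracy:", train_accuracy)
print("test accuracy: ", test_accuracy)
```

The score on the training data looks perfect because the model has memorized the noise; only the score on unseen data says anything about how it would behave in practice.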

8. How to make sure you are not overfitting while training a model?

First, we need to clarify what overfitting is exactly. Usually, overfitting happens when your model fits the training data so well that it misses the point. In other words – it doesn’t look for the general patterns, but for the noise in the data provided. If that happens, when provided with new data, the model behaves disastrously in a real-life setting.

Regularization – in the context of machine learning, regularization refers to modifying a learning algorithm to make the model simpler, often to prevent overfitting or to solve a badly posed problem.

  • Early stopping – early stopping is one of the most common types of regularization. It is designed precisely to prevent overfitting. It consists of techniques that interrupt the training process once the model starts overfitting.
*Here you may be expected to say ‘validation’ or ‘cross-validation’. In fact, early stopping methods typically use the validation results to determine whether to stop the training process.
  • Feature selection – for some models, having useless input features leads to much worse performance. Therefore, you have to choose only the most relevant features for your problem; otherwise, the extra features may contribute to (among other things) overfitting.
  • Ensembles – methods that combine several base models in order to produce one optimal predictive model. A good example of an ensemble method is Random Forest (a collection of decision trees).

Overfitting is an extremely important issue. Almost any sufficiently flexible model will overfit if no preventative techniques are applied. Therefore, you should always aim to apply one or more of these techniques in your model building efforts.
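As an illustration of early stopping, here is a minimal sketch that works on a list of validation losses, one per epoch. The loss values and the `patience` parameter are assumptions made for the example; real training frameworks expose their own early-stopping options.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best epoch, stopping once the validation
    loss has failed to improve for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # validation loss keeps getting worse: stop training
    return best_epoch

# Hypothetical validation losses: improving at first, then rising (overfitting).
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
best = early_stopping_epoch(val_losses)
print("keep the model from epoch", best)
```

Here training would stop after epoch 5, and the model from epoch 3 (the lowest validation loss) would be kept.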

9. What is cross-validation? How to do it right?

Cross-validation refers to many model validation techniques that use the same dataset for both training and validation. Usually, it is on a rotational basis so that observations are not overexposed to the training process and thus can serve as better validation. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice.

Why do we even need to validate?

Well, when you use sample data (so most of the time), you need to make sure that your model is not overfitting the parameters.

So how do we validate? We set aside, say, 10% of the data for later use and train on the remaining 90%. Once we are done, we validate on the 10% we set aside at the beginning. This is a pretty common practice but has one major drawback: some of the data (precisely this 10%) is not utilized in the training process.

That is where cross-validation comes in.

Cross-validation does the same thing as simple validation, but it first divides the dataset into equal parts (5, 10, or 20, depending on the size of the data). To cross-validate, it sets aside the first part and trains on the remaining parts. Then it sets aside the second part and trains on the remaining ones (this time, including the first part). We continue in that way, utilizing a different subset for each validation. In that way, the model gets exposed to all the data, in contrast to conventional validation.
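The rotation described above can be sketched as a small generator of index splits. The function name `k_fold_splits` is hypothetical, and the sketch assumes the data has already been shuffled, which is usually done first in practice.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_splits(n_samples=10, k=5))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every observation lands in the validation part exactly once and in the training part in all the other rounds.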

10. How do you create a table in R without using external files?

This is practical knowledge that can be tested with a coding task, but it’s possible you are asked this question as a stand-alone. In that case, you can ask about the use case for the numbers you’re generating.

First, if you need a data table for the sake of having data to test on, you can just use one of R’s preloaded datasets. You can access the list by calling data().

If you’d like to still create a table from scratch, you can use any of the random generator functions in R to generate random numbers according to a distribution, and store them in a matrix or a data frame. The functions are:

  • runif()
  • rnorm()
  • rbinom()
  • rexp()

You can also use sampling with or without replacement to generate your data and populate a table.

If you need an empty table to be filled out later, you can initiate empty vectors and create your data frame.

Finally, you could use an interactive method for a quick solution to a small-scale problem. Creating an empty data frame and editing it with edit(df) opens an interactive spreadsheet view you can populate manually.

Note: this counts more towards #Rtrivia.

11. Explain the significance of Transpose in R.

Transpose is one of the simplest ways you can reshape a data structure in R. If you transpose a data frame or a matrix, you will essentially be rotating the data, so rows become columns, and vice versa.

In terms of use cases, transposing is sometimes needed to tidy data for analysis. If the raw format has observations recorded as columns, transposing the data structure ensures the data follows the convention whereby observations are organized in rows and variables are represented by columns. Transposing is also common in matrix operations such as matrix multiplication, which are used extensively in machine learning, deep learning, etc.

The t() function is the default way to transpose in R. If this simple function fits your needs, you don’t need anything else. If the data you’re trying to reshape is messier, {dplyr} and {tidyr} can provide a good set of functions to deal with it – grouping, mutating, pivoting…
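The idea of swapping rows and columns is not specific to R. As a language-neutral illustration (this is Python, not R's t() itself), the sketch below transposes a small made-up table in which each row is a variable and each column an observation.

```python
# Made-up raw data: rows are variables, columns are observations.
raw = [
    ["height", 170, 182, 165],
    ["weight", 65, 80, 58],
]

# zip(*raw) pairs up the i-th element of every row, i.e. it builds the columns.
transposed = [list(column) for column in zip(*raw)]

for row in transposed:
    print(row)
```

After transposing, the first row holds the variable names and each following row is one observation, which is the tidy layout described above.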

12. Why would you use a Null as a data value?

To answer this question, it’s important not to confuse a NULL value with the value of 0 or with a “NONE” response. Instead, think of a null value as a missing value. 0 or “NONE” could be values assigned by the user, while “NULL” is a value assigned by the computer if the user has provided no value for a given record.

Consider the Customers table below:

[Figure: the “Customers” table, including a “number of complaints” column]

If you know John McKinley has filed 0 complaints, then in the “number of complaints” column in the “Customers” table, you could insert 0. This doesn’t mean the value is null – not at all! You have a value of zero, and the information in this field for this first record is not null. It means John has filed no complaints.

If we have no information regarding the number of complaints John has filed, then the value would have been null.

By the same logic, imagine there was an additional column, called “Feedback”, and that it is optional. If the first three customers have provided some feedback, while Catherine has said she didn’t want to leave any, does that mean this value is null? No, because Catherine didn’t want to provide any feedback, so we could mark her response as “NONE”.

If she hasn’t replied yet, only then our value would have been null.
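The distinction between 0, “NONE” and NULL can be sketched with an in-memory SQLite database driven from Python. The table layout, the second customer, and the values other than John's 0 complaints are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (name TEXT, number_of_complaints INTEGER, feedback TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("John McKinley", 0, "Great service"),  # known value: zero complaints
        ("Customer B", None, None),             # nothing known yet -> NULL
        ("Catherine", 2, "NONE"),               # declined feedback -> explicit 'NONE'
    ],
)

zero_complaints = conn.execute(
    "SELECT name FROM customers WHERE number_of_complaints = 0"
).fetchall()
unknown_complaints = conn.execute(
    "SELECT name FROM customers WHERE number_of_complaints IS NULL"
).fetchall()
# COUNT(column) skips NULLs, COUNT(*) counts every row.
known_count, total_rows = conn.execute(
    "SELECT COUNT(number_of_complaints), COUNT(*) FROM customers"
).fetchone()

print(zero_complaints, unknown_complaints, known_count, total_rows)
```

Note that NULL has to be tested with IS NULL; a comparison such as `= 0` never matches it.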

13. What is a primary key and a foreign key?

A primary key is a column (or a set of columns) whose value exists and is unique for every record in a table. It’s important to know that a table can have at most one primary key.

Therefore, you can think of a primary key as the field (or group of fields) that identifies the content of a table in a unique way. For this reason, the primary keys are also called the unique identifiers of a table.

Another crucial feature of primary keys is they cannot contain null values.

This means, in an example with a single-column primary key, there must always be a value inserted in the rows under this column. You cannot leave it blank.

One last remark about primary keys - not all tables you work with will have a primary key, although almost all tables in any database will have a single-column or a multi-column primary key.

A foreign key, instead, is a column (or a set of columns) that references a column (most often the primary key) of another table. Foreign keys can be called identifiers, too, but they identify the relationships between tables, not the tables themselves.

In a relational schema, relations between tables are expressed in the following way. The column name that designates the logical match is a foreign key in one table and connects to a corresponding column in another table. Often, the relationship goes from a foreign key to a primary key, but in more advanced circumstances, this will not be the case. To understand the relations on which a database is built, we should always look for the foreign keys, as they show us where the relations are.

Author’s note: for a more in-depth explanation, check out our tutorials on SQL Primary Key and SQL Foreign Key.

14. Describe a parent-child relationship in the context of a relational database.

Remember the function of a foreign key (see above)? It points to a column of another table and, thus, links the two tables. It is a field or collection of fields from one table - the child table, and it refers to a column in another table, called the parent table. Usually, the column or the set of columns in the parent table is the primary key of that table. (The child table can also be called the referencing table, and the parent table can be called the referenced table.)
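A parent-child pair of tables can be sketched with SQLite from Python. The table and column names (`customers`, `sales`, `customer_id`, `purchase_number`) are hypothetical, and foreign key enforcement has to be switched on explicitly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Parent (referenced) table: customer_id is the primary key.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# Child (referencing) table: customer_id is a foreign key pointing at the parent.
conn.execute(
    """CREATE TABLE sales (
           purchase_number INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers (customer_id)
       )"""
)
conn.execute("INSERT INTO customers VALUES (1, 'John McKinley')")
conn.execute("INSERT INTO sales VALUES (100, 1)")  # valid: customer 1 exists

def try_insert(sql):
    """Run an INSERT and report either 'ok' or the constraint that blocked it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

duplicate_key = try_insert("INSERT INTO customers VALUES (1, 'Someone Else')")
orphan_row = try_insert("INSERT INTO sales VALUES (101, 999)")
print("duplicate primary key:", duplicate_key)
print("child without parent: ", orphan_row)
```

Both inserts are rejected: the first would break the uniqueness of the primary key, and the second would create a child row whose parent does not exist.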

By the way, if you’re finding this answer useful, consider sharing this article, so others can benefit from it, too. Helping fellow aspiring data scientists reach their goals is one of the things that make the data science community special.

15. Given a table with duplicate data, how would you extract only specific rows based on business requirements provided?

In most cases, the tools from the Data Manipulation Language (DML) will allow you to do that. Usually, you could either use a SELECT DISTINCT statement to select distinct rows only or apply a GROUP BY clause to filter the data in the desired way.
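Both approaches can be sketched against a tiny in-memory SQLite table. The `orders` table and its rows are made up for the example; the actual query would depend on the business requirements given in the interview.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ann", "laptop"), ("Ann", "laptop"), ("Bob", "phone"), ("Ann", "phone")],
)

# Option 1: keep one copy of each distinct row.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer, product FROM orders ORDER BY customer, product"
).fetchall()

# Option 2: group identical rows and keep only those that occur more than once.
duplicated_rows = conn.execute(
    "SELECT customer, product, COUNT(*) FROM orders "
    "GROUP BY customer, product HAVING COUNT(*) > 1"
).fetchall()

print(distinct_rows)
print(duplicated_rows)
```

SELECT DISTINCT is enough when duplicates simply need to be dropped; GROUP BY is handy when the requirement depends on how often a row occurs.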

16. What are the essential Python libraries used for machine and deep learning?

The key Python libraries for machine and deep learning include TensorFlow and Keras for building and training advanced neural networks, scikit-learn for various machine learning algorithms like classification, regression, clustering, and pre-processing, and PyTorch for its dynamic computational graph that allows flexibility in building complex architectures.

17. What is the difference between WHERE and HAVING clause in SQL?

The WHERE clause is used to filter rows before any grouping takes place. The HAVING clause is used to filter the groups produced by a GROUP BY clause. WHERE applies to individual rows, while HAVING applies to groups, typically through conditions on aggregate functions.
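A short sketch of both clauses in one query, using SQLite from Python. The `sales` table and its amounts are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 60), ("A", 50), ("A", 5), ("B", 80), ("B", 8), ("C", 200)],
)

result = conn.execute(
    """SELECT customer, SUM(amount) AS total
       FROM sales
       WHERE amount > 10          -- row filter: drop small purchases first
       GROUP BY customer
       HAVING SUM(amount) > 100   -- group filter: keep only big spenders
       ORDER BY customer"""
).fetchall()

print(result)
```

WHERE removes the purchases of 5 and 8 before grouping; HAVING then removes customer B, whose remaining total of 80 does not exceed 100.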

18. What are interpolation and extrapolation?

Interpolation involves estimating unknown values within a range of known data points. Conversely, extrapolation concerns projecting the existing data points to estimate values beyond the known range.
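As a minimal sketch, assuming a straight line through two known points, the same formula performs interpolation or extrapolation depending on where x falls. The function name and the data points are hypothetical.

```python
def linear_estimate(x, point_a, point_b):
    """Estimate y at x from the straight line through two known points."""
    (x0, y0), (x1, y1) = point_a, point_b
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)

known_a, known_b = (1, 10), (3, 30)

interpolated = linear_estimate(2, known_a, known_b)  # x lies inside [1, 3]
extrapolated = linear_estimate(5, known_a, known_b)  # x lies outside [1, 3]

print("interpolated:", interpolated)
print("extrapolated:", extrapolated)
```

The extrapolated value relies on the trend continuing beyond the observed range, so it is generally the less reliable of the two.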

Behavioral Questions

This type of data scientist interview question has become increasingly important in the hiring process. The reason is that these questions help employers assess whether your personality and motivations make you the right fit for the job. Most of them are centered around your behavior in similar past work situations.

19. What motivates you about this position?

By asking this question, the recruiter wants to understand whether you are excited about the new opportunity that lies ahead of you. Your enthusiasm, of course, is highly correlated with the amount of effort you will put in once the job is offered.

A motivated person would try to be proactive and create a positive working environment, which is precisely what every company needs. The real question isn’t whether you should say that you are motivated. Of course, you should. You need to think of a way that would best show that you are genuinely interested in the position under consideration. There are a lot of different things that can motivate you:

  • The learning opportunities that you will have on the job
  • Future growth prospects
  • You like the team that you will be joining (if you have met them)
  • You share the company’s values/mission
  • The company operates in a dynamic, ever-changing industry
  • The company’s prestige

Of course, remuneration is one of the main motivators for almost all people. However, talking about money is not a good idea at this point in the selection process. Instead, focus on some of the aspects that we listed above and customize them to the specific position that you are applying for.

What you say while answering this question is not the only important thing. Your interviewer will be eager to see that all signs point in the same direction. Try to show that you are excited through your voice, posture and body language. This can be the critical difference that will determine whether or not you will be selected.

20. Give me an example of a time when you had to go the extra mile.

The only way to do great work is to love what you do

- Steve Jobs

Going the extra mile is rarely a one-time act. More often, it is an ingrained habit. You need to properly explain to your recruiter that you love the idea of working that job. Also, explain how you want to be excellent at it. Your internal drive towards excellence is what motivates you to go the extra mile – to do the things that you are not expected to do:

  • Study during the weekends
  • Stay late in the office
  • Strive for excellence constantly

If the job you are interviewing for is what you chose for your life, then you want to be excellent at it. Striving to achieve excellent performance is important. It means that you want to put quality in your work and create value for the company. Internal drive is probably the best reason to go the extra mile; you are willing to do what is necessary in order to be good at what you do.

An example of such a situation:

During your previous internship, you put in a lot of extra effort in order to show that your tutor, who also recruited you, did not make a mistake. You stayed late and studied during the weekend because you wanted to improve your skills and to do it faster. The positive impression that you left with your work led to an excellent evaluation and very positive feedback about your willingness to learn.

21. Can you tell me a time when you were able to build motivation in your co-workers?

This question aims to assess whether you are a good leader and a positive influence at your workplace. Hiring managers look for people who are motivated themselves and are able to transmit their drive to their co-workers. Strong motivation makes for excellent results.

In order to be able to motivate someone, you have to fully understand the person that you are approaching. What is it that they currently need in order to be excited about a project? Perhaps they need:

  • One-on-one coaching
  • Interesting tasks
  • More complicated tasks
  • Responsibility
  • Autonomy
  • Recognition
  • A positive perspective

Here's an example of such a situation:

During your previous internship within the Corporate Finance department of a large firm, you were asked to prepare a Valuation model. There was another intern who was assigned to work with you. Given that she had less experience with Financial Modeling, she could only help you with minor data entry and consistency checks.

You noticed that this was not particularly stimulating for her, as this is something she already knew how to do and she really wanted to learn how to create the model itself. You realized that she would be more motivated to do her part if she was given the opportunity to learn as well. That is why you asked her whether she would like to sit next to you while you work on the model, so that the two of you can comment on what you are doing together. This greatly motivated her and she came up with some valuable suggestions when you had to prepare a presentation that summarizes the model that you prepared.

22. How do you handle a challenge?

First of all, you want to give the impression that you are someone who welcomes a challenge. You are a person who is willing to leave his/her comfort zone and embrace challenging situations. You learn the most when you are put in a difficult situation. And this is certainly something that the Hiring Manager is looking to hear from you. The second part of the question is how you actually handle a challenge. Do you have a structured approach? Are you a person who builds a plan of action and then sticks to it? It would be best if you could provide an example of your past experience. A story showing that you:

  • understood the issue
  • created a plan of action
  • executed the plan of action successfully

An example of such a situation:

Let’s say that you were admitted to a Master's in Economics. A really challenging situation arose because you knew that most of the people in the class had already studied Finance and Econometrics, while you concentrated on Leadership courses. There was a significant gap between your skills and those of others. You realized that. You also realized that the only way to address the issue was to start with the very basics and fill the knowledge gap step by step; a very long process that required significant efforts on your end. An encouraging sign was that the results at the end of the first semester showed that you reduced the gap significantly and were heading in the right direction. By the end of the second semester, your GPA was slightly higher than the average for the class.

23. What is your greatest weakness?

The problem with this question is that you are being asked about your shortcomings while you are in an interview and want to make a good impression. Make sure that you don’t choose something that can impede you from being great at the job you are interviewing for. For example, if you are interviewing for a controller or financial analyst position, it is OK to say that you do not like to speak in public. However, if you are applying for a consulting or investment banking job, you should not say that, because public speaking can be essential for those professions.

Choose a weakness that you can turn into a positive. “I am usually not good at…but I am making an effort to improve that”. Avoid cliché answers like “I work too hard” and “I am a perfectionist”. No one is perfect – that is why you need to indicate a weakness when you are asked about one. This shows that you are self-aware and have listened to feedback.

Here's an example:

The tutor at my previous internship gave me some interesting feedback: “Don’t try to do too much.” I remembered that and had a chance to reflect on it, once the internship was over. He was right; I tried to do too much. I was eager to prove myself and implement everything that I learned in university so I could perform great. Trying to implement complex models and “doing too much” is something that I need to control in the future.

This experience allowed me to understand that greatness is a lot of small things done well.

Therefore, I decided that the next time I face a similar situation, I will focus on my own duties and make sure that I do everything that is expected of me well, instead of trying to invent the next theory of relativity.

24. How do you handle working with numbers/clients/multiple tasks/stress?

Each of these aspects can be really important for a given position, and the Hiring Manager will want to make sure that you are the right person for it. Try to figure out the most important characteristics of the job that you are applying for. Are you expected to multitask? What part of your overall responsibilities would be related to financial figures? Are you going to interact with many people?

Based on your findings, you will know what to expect. Prepare good examples from your past that can serve as proof of your statements.

25. What would you do if the priorities of an important project you were working on suddenly changed?

It’s a very broad question, isn’t it? Try answering by asking some questions that can guide you to the right answer:

  • Who changed the project’s priorities? Your boss? Clients? Suppliers? External Factors?
  • Why did they change priorities?

Try to understand the reason behind the decision and assess whether it is a valid one. Is there something that you can do about it?

If you believe that you can propose a solution, don’t be shy about contacting the responsible manager and sharing your idea.

Or, if you believe that the reason for shifting priorities is not valid, raise your concerns with Management.

Maybe there is nothing you can do about the decision. Perhaps it is driven by external factors that can’t be changed, or your boss says that, despite your concerns, the decision to change priorities remains. In this case, create a course of action and make sure that everybody on your team is aligned with the new priorities. Schedule a reasonable deadline and think of the best way that you can achieve the new goals.

26. What would you do if someone at work resisted your ideas?

Again, open communication is the best way to approach this problem. First of all, you need to make sure that you are fully explaining your ideas. Perhaps you can try an alternative approach? You can provide practical examples or make a list of the pros and cons of your suggestion. Then you should try to understand your colleague’s point of view. What are the reasons behind his resistance? If his point is valid as well, think of an alternative approach together regarding the problem. Maybe you can create a hybrid solution that will include your ideas and will address his concerns.

27. Is there anything else that we should know about you?

Yes. The answer to this question is always “Yes”. There are many things that they should know about you. This question typically comes at the end of the interview and it is an opportunity to close in a strong fashion. There is no need to pass up on this extra opportunity that the interviewer has given you. Try to address some of the following points that did not come up during the interview:

  • Skills that are relevant for the job under consideration
  • Past experience that will help you to be successful at this job
  • Motivation to work for the company in the particular role that you are interviewing for
  • What is going to be your added value to the team that you will be placed in

One of the basic rules in sales is that you need to convince your clients that they need your product. This is a similar situation. Make a closing statement that will convince your interviewer that you are the person they are looking for.

Brainteasers

Brainteasers give the interviewer an overall idea of your logic and math abilities, critical thinking, and creativity. But, above all, their goal is to examine how you apply all of these under pressure. So, here are a few questions and their answers that will help you hone your problem-solving skills, or at least give you an understanding of how your answer should be structured.

25. Imagine you’re in a room with 3 light switches. In the next room, there are 3 light bulbs, each controlled by one of the switches. You have to find out which switch controls each bulb by checking the room just once. Keep in mind that all lights are initially off, and you can’t see into one room from the other. So, how can you figure out which switch is connected to which light bulb?

Let’s say we have switches 1, 2, and 3. What you can do is leave switch 1 off, turn switch 2 on for 5 minutes, and then turn it off. Then turn switch 3 on and leave it like that. Then you enter the room. Obviously, switch 3 controls the light bulb that is on. The bulb that is off but still warm is controlled by switch 2. And switch 1 controls the bulb that is off and cold, because you never turned it on.

26. You want a work of art that used to cost $400 but is now on sale at 25% off. What is the promotional price?

It’s time for a quick calculation: 25% off means you pay 75% of the original price. What’s 75% of \$400? The answer is \$300. Of course, if you’re into numbers and like using shortcuts, don’t hesitate to think out loud.

27. Identify the next number in the following sequence: 2, 6, 12, 20, ….

The first number in the sequence is 2.

The second number is 6, which is obtained by adding 4 to the previous number (2).

The third number in the sequence is 12, obtained by adding to the previous number (6) the addend from the previous step increased by 2. That is:

\[6+\left(4+2\right)=6+6=12\]

The fourth number is 20, calculated analogously by taking the sum of the previous number in the sequence and the addend from the last step increased by 2, namely:

\[12+\left(6+2\right)=12+8=20\]

If we continue this pattern, adding a number that increases by 2 with each step (4, 6, 8, ...), the next addend is 8 + 2 = 10. Therefore, to find the fifth number in the series, add 10 to the fourth number in the sequence:

\[20+\left(8+2\right)=20+10=30\]

So, the next number in the series is 30.
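The reasoning above can be checked with a short Python sketch. The helper name `next_term` is ours, introduced only for illustration; it assumes the differences between consecutive terms grow by a constant step, which is exactly the pattern described in the answer. The closed form n(n + 1) on the last line is an equivalent way to write the same sequence.

```python
def next_term(seq):
    """Return the next term of a sequence whose successive
    differences grow by a constant step (here: 4, 6, 8, ...)."""
    # Differences between consecutive terms, e.g. [4, 6, 8]
    diffs = [b - a for a, b in zip(seq, seq[1:])]
    # How much each difference grows from one step to the next
    step = diffs[1] - diffs[0]
    # Make sure the pattern really holds before extrapolating
    if any(d2 - d1 != step for d1, d2 in zip(diffs, diffs[1:])):
        raise ValueError("differences do not grow by a constant step")
    # Next addend is the last difference plus the step
    return seq[-1] + diffs[-1] + step


print(next_term([2, 6, 12, 20]))           # 30
print([n * (n + 1) for n in range(1, 6)])  # [2, 6, 12, 20, 30]
```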

28. We can easily express the number 24 with three eights as follows: 8 + 8 + 8. Can you express 24 using three other identical digits?

Example solutions:

\[22+2=24\]

\[3^3-3=24\]
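If you want to double-check such answers, a small brute-force search is one option. The sketch below is only an illustration: the function name `three_digit_expressions` and the handful of expression shapes it tries are our own choices, not an exhaustive list of what an interviewer might accept.

```python
def three_digit_expressions(target=24):
    """Try a few expression shapes built from three copies of one digit d
    and return the (digit, shape) pairs that evaluate to the target."""
    # Each shape uses the digit d exactly three times
    forms = {
        "d + d + d": lambda d: d + d + d,
        "dd + d": lambda d: 11 * d + d,   # "dd" is the two-digit number, e.g. 22
        "d^d - d": lambda d: d ** d - d,
        "d * d + d": lambda d: d * d + d,
    }
    hits = []
    for d in range(1, 10):
        for label, f in forms.items():
            if f(d) == target:
                hits.append((d, label))
    return hits


print(three_digit_expressions())
# [(2, 'dd + d'), (3, 'd^d - d'), (8, 'd + d + d')]
```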

Guesstimate

Guesstimate cases are a sort of prelude to a full-blown business situation case. They show the data science interviewer how you approach problems and test both your judgment and numerical thinking.

29. How many square feet of pizza are eaten in the United States each month?

Let’s say there are roughly 300 million people in America, out of which 200 million eat pizza. Now, suppose the average pizza-eater has pizza twice a month and eats two slices at a time. That makes four slices per month. The usual slice of pizza is a triangle about six inches at the base and 10 inches long, so its area is ½ × 6 × 10 = 30 square inches. Consequently, four slices of pizza amount to 120 square inches. Since one square foot equals 144 square inches, we can say that each pizza-eater consumes roughly one square foot per month. And, as there are 200 million pizza-eaters in America, we can conclude that roughly 200 million square feet of pizza are consumed in the US each month.
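To keep the arithmetic honest, the same estimate can be written as a few lines of Python. Every input below is one of the assumptions stated in the answer (200 million pizza-eaters, two sittings a month, two slices per sitting, a 6-by-10-inch triangular slice); the function name `pizza_estimate` is ours, for illustration only.

```python
SQ_IN_PER_SQ_FT = 144  # 12 in x 12 in


def pizza_estimate(pizza_eaters=200_000_000, sittings_per_month=2,
                   slices_per_sitting=2, base_in=6, length_in=10):
    """Return (square feet per person, total square feet) per month."""
    # A slice is treated as a triangle: area = 1/2 * base * length
    slice_area_sq_in = 0.5 * base_in * length_in
    slices_per_month = sittings_per_month * slices_per_sitting
    per_person_sq_ft = slices_per_month * slice_area_sq_in / SQ_IN_PER_SQ_FT
    return per_person_sq_ft, pizza_eaters * per_person_sq_ft


per_person, total = pizza_estimate()
print(f"{per_person:.2f} sq ft per person")  # 0.83 sq ft per person
print(f"{total / 1e6:.0f} million sq ft")    # 167 million sq ft
```

Without rounding, the result is about 167 million square feet, the same order of magnitude as the 200 million obtained by rounding each person's share up to one square foot. In a guesstimate, that level of agreement is what matters.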

30. Estimate the total number of hours spent on social media by all users worldwide in a single day.

For this estimate, let's take the world’s population to be 8 billion people. Out of those, assume that people between the ages of 12 and 65 use social media, which we can approximate to account for 70% of the population, or 5.6 billion people. Let’s remove 10% more to account for people who either don’t have access to social media or have decided not to use it.

That leaves roughly 5 billion people; to stay conservative, we round down to around 4.5 billion people regularly using social media. Next, we need to estimate the average time an individual spends on social media daily. This can vary widely by region, age group, and other factors. Averaging all those factors out, we can assume the average person spends about 2.5 hours per day on social media. Now, we multiply the total number of users by the average time spent:

\[4.5\times{10}^9\times2.5\approx11\times{10}^9\]

Therefore, the estimated total number of hours spent on social media by all users worldwide in a single day, based on these assumptions, is about 11 billion.
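A guesstimate like this is easy to turn into a tiny script, which also makes it simple to see how sensitive the result is to each assumption. The sketch below uses the two figures from the answer (4.5 billion users, 2.5 hours per day); the function name `social_media_hours` and the alternative inputs in the loop are hypothetical, chosen only to illustrate the idea.

```python
def social_media_hours(users=4.5e9, hours_per_user=2.5):
    """Total hours spent on social media per day under the given assumptions."""
    return users * hours_per_user


baseline = social_media_hours()
print(f"{baseline / 1e9:.2f} billion hours")  # 11.25 billion hours

# Vary the assumptions to see how much the answer moves
for users in (4.0e9, 4.5e9, 5.0e9):
    for hours in (2.0, 2.5, 3.0):
        total = social_media_hours(users, hours)
        print(f"{users / 1e9:.1f}B users x {hours} h = {total / 1e9:.1f}B hours")
```

In an interview, stating the range you get from such variations (here, between 8 and 15 billion hours) shows that you understand which assumptions drive the estimate.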

What can you expect from a data scientist interview process?

A phone interview followed by an in-person interview or the other way around? One or multiple interviewers? One thing is sure – different companies have different approaches. With that in mind, here are 3 real processes for data science positions. We believe they will give you a good initial idea of what happens behind the curtains.

Microsoft

Usually, phone interviews that cover coding questions take place first, followed by 4-5 onsite interviews, often with 2 different teams. About half of them are data science-related questions (including theory, research, and models). The rest aim to test the candidate’s coding skills. Microsoft data scientist interview questions are often open-ended. So, the solutions really depend on your own interpretation.

And in contrast to other companies, data presentation and visualization questions take up a large part of the interview here.

As in other companies, you only reach the hiring manager if you have passed the interviews with the teams. From this point on, the decision is in the hands of the Hiring Manager. That said, patience is a virtue (or so they say). However, if you don’t hear from HR within a week, there is no harm in sending a polite follow-up email just to know where you stand.

Google

According to Gayle Laakmann McDowell, author of “Cracking the Coding Interview”, no matter what they tell you about how scary the Google interview process is, it’s mostly just stories. That said, Google’s technical interview process is pretty much standard. You go through one (or a few) phone screen interviews, followed by onsite interviews. The first phone screen centers around technical data scientist interview questions. The good news is that Google has its own guide for the technical part of the interviewing process. Regarding the onsite interviews, they’re conducted by 4 to 6 people. Each of them keeps their feedback confidential from the rest of the interviewers to minimize bias.

What’s specific about the Google interview is that interviewers send their written feedback to a hiring committee.

The committee, for its part, makes a recommendation to the Google executives (which is, more often than not, approved). Keep in mind that the Google hiring process can take longer than you’ve hoped for. However, you should be proactive in your communication with HR and, once again, politely ask for a status update once a week has passed.

Amazon

If you’re thinking of applying for a data scientist position at Amazon, here’s what you should know about their hiring process. First, there are 1 or 2 phone screens. They’re technical in nature, so you can expect some coding questions (writing code and reading it over the phone, for example). The interviewer might also question you extensively about technology to see what you’re familiar with. Next, there are 4-5 onsite interviews with 1 or 2 teams.

As expected, different teams focus on data scientist interview questions in different areas. Keep in mind that some of them are related to data models and data sets. So, you really need experience with those to get the solution right. What’s interesting is that each interviewer must first submit their own feedback in order to see the others’ evaluations. Once done, there’s a meeting where the interviewers agree on the final hiring decision.

Hm, is there anything specific to be aware of? Yes – the Bar Raiser.

Bar Raisers have immense interviewing experience and hold veto power in the hiring process. No one can overrule the Bar Raiser’s final decision, not even the hiring manager (talk about supreme power!). According to Amazon VP of Worldwide People Operations Ardine Williams, "One of our hiring principles is that anyone we bring in should raise the bar on our internal performance, which means that we're looking for someone who's better than half of the people currently working here at that level."

Amazon’s recruiters usually follow up promptly. However, guess what – if a week has passed and you’re still waiting for an answer, a friendly status-update email won’t hurt.

Some final words on data scientist interview questions…

Interviewing for a data scientist position can be a bit scary at first. So, just in case you’re still not as confident as you should be, as a final takeaway, remember the following:

  • Listen carefully to everything that the interviewer mentions in the questions, and emphasize clear explanations and your thought process;
  • Even if your explanations aren’t perfect and you need some assistance from the interviewer, that’s not a bad thing. In fact, it signals to the interviewer that you’re open to receiving help, can handle feedback, and would probably be a solid team player;
  • Communication (both verbal and non-verbal) is key: exude a positive attitude, demonstrate professionalism, and be confident in your abilities. Keep in mind your tone of voice and pacing, as well as your gestures. Your body language speaks volumes! You can find more about the types of non-verbal communication and how to improve your body language in this Indeed article;
  • Learn from the process if you haven’t been successful. Discuss the challenging data scientist interview questions you couldn’t answer during the interview with a friend or colleague and try to find a solution. That will take off the edge and will make you feel more at ease the next time you encounter a similar problem.

Eager to explore more data scientist interview questions?

Read our comprehensive article Data Science Interview Questions And Answers and take our courses Starting a Career in Data Science: Project Portfolio, Resume, and Interview Process and SQL for Data Science Interviews.

In case you feel that you lack some of the fundamental skills required for the job, check out the all-around 365 Data Science Training. If you aren’t sure whether you want to turn your interest in data science into a full-fledged career, we also offer a free preview version of the Data Science Program. You’ll receive 12 hours of beginner to advanced content for free. It’s a great way to see if the program fits your goals and needs.

The 365 Team

The 365 Data Science team creates expert publications and learning resources on a wide range of topics, helping aspiring professionals improve their domain knowledge, acquire new skills, and make the first successful steps in their data science and analytics careers.
