Covering descriptive and inferential statistics, as well as hypothesis testing techniques and exercises, this course puts the “scientist” in data scientist.

Course description

Statistics is the driving force in any quantitative career. It is the fundamental skill data scientists need to understand and design the statistical tests and analyses performed by modern software packages and programming languages. In this course, we start from the very basics of statistics and gradually build up your statistical thinking, enabling you to understand the more complex analyses carried out later in the program.


In this part of the course, we will discuss why you need to learn statistics and which key skills you will acquire by taking the course.


Confidence Intervals

Here, you will learn how to calculate confidence intervals when the population variance is known. We will then introduce Student's t-distribution, and you will learn how to work with smaller samples, as well as with differences between two means (for dependent and independent samples). These tools become fundamental later on, when we start applying each of these concepts to large datasets using programming languages like Python and R. To reinforce what you have learned, we will wrap up this section with an easy-to-understand practical example.
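
As a rough illustration of the known-variance case, here is a minimal Python sketch that computes a confidence interval for a mean using only the standard library. The function name z_confidence_interval, the sample values, and the population standard deviation of 5.0 are all assumptions made up for this example; they are not taken from the course material.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_confidence_interval(sample, population_std, confidence=0.95):
    """Confidence interval for the mean when the population std is known.

    Hypothetical helper for illustration only.
    """
    n = len(sample)
    x_bar = mean(sample)
    # Critical value z_(alpha/2) from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error: critical value times the standard error of the mean.
    margin = z * population_std / sqrt(n)
    return x_bar - margin, x_bar + margin


# Made-up sample; the population standard deviation is assumed to be known.
sample = [102, 98, 105, 110, 95, 101, 99, 104, 97, 109]
low, high = z_confidence_interval(sample, population_std=5.0)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

When the population variance is unknown and the sample is small, the same structure applies, but the critical value would come from Student's t-distribution instead of the normal distribution.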


Hypothesis testing

In this section, you will learn how to perform hypothesis testing, as well as the difference between a null and an alternative hypothesis. We will discuss rejection regions and significance levels, and type I and type II errors. The lessons will teach you how to test for the mean when the population variance is known and when it is unknown, as well as how to test for the mean when you are dealing with dependent and independent samples. You will also become familiar with the p-value. To consolidate the new knowledge, we will conclude with a practical example.
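
To make these ideas concrete, the following minimal Python sketch runs a two-sided test for the mean when the population variance is known, and reports the test statistic, the p-value, and the decision at a chosen significance level. The function name z_test, the sample values, the hypothesized mean of 100, and the population standard deviation of 5.0 are assumptions invented for this example, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu_0, population_std, alpha=0.05):
    """Two-sided z-test of H0: mean == mu_0 with known population std.

    Hypothetical helper for illustration only.
    """
    n = len(sample)
    # Test statistic: distance of the sample mean from mu_0 in standard errors.
    z = (mean(sample) - mu_0) / (population_std / sqrt(n))
    # Two-sided p-value: probability of a statistic at least this extreme under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject H0 when the p-value falls below the significance level alpha.
    return z, p_value, p_value < alpha


# Made-up sample; the population standard deviation is assumed to be known.
sample = [102, 98, 105, 110, 95, 101, 99, 104, 97, 109]
z, p_value, reject = z_test(sample, mu_0=100, population_std=5.0)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

Rejecting a null hypothesis that is actually true is a type I error, and its probability is capped by alpha. Failing to reject a false null hypothesis is a type II error.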