Data Analyst Track
As a data analyst, your role will be to uncover hidden value in a company’s data, then use visualization techniques to portray this value to company executives. The appropriate tools and techniques for a data analyst are a mixture of advanced data manipulation and statistical modelling. You will have a deep statistical knowledge and superior programming skills, demonstrating your ability in handling data.
What’s included and why
Intro to data and data science
Working with data is an essential part of maintaining a healthy business. This course will introduce you to the field of data science and help you understand the various processes and distinguish between terms such as: ‘traditional data’, ‘big data’, ‘business intelligence’, ‘business analytics’, ‘data analytics’, ‘data science’, and ‘machine learning’.
Section 1: Introduction
We will start with an introductory lecture about the 365 Data Science program. We will discuss the best way to approach our training and how to take our courses in a way that will position you well for a career as a data scientist.
Section 2: The different data science fields
For a novice, the data science field can be rather confusing. It takes a while to make sense of all the buzzwords and the different areas of data science. Do not worry, as we will make this process easier and much faster for you. In some of our first lessons, you will learn how to distinguish between business analytics, data analytics, business intelligence, machine learning, and artificial intelligence. With this knowledge, we will show where data science stands today. The specially designed infographic we will discuss in the lessons makes everything clearer.
Section 3: The relationship between different data science fields
In this chapter, you will learn how data science fields relate to each other and which ones leverage:
- traditional and big data
- business intelligence
- traditional data science methods and machine learning
Section 4: What is the purpose of each data science field
It is one thing to learn what the various data science disciplines are, but quite another to be able to tell what each discipline is used for in practice. This is valuable because it gives you an idea of the practical application of the different methods you will learn later on in our program.
Section 5: Common data science techniques
There are different ways to approach Traditional data, Big data, Business Intelligence, Traditional data science methods, and Machine learning. In this part of the course, we will introduce you to some of the most common techniques to do that, and we will provide several practical examples that will make things easier and more relatable.
Section 6: Common data science tools
Before we dive into the different types of tools used in data science, we will provide a quick overview, so you have a good idea of why we are studying different tools and how they relate to each other. This will greatly facilitate your learning process, as you will already know what to expect and exactly which tasks each tool is needed for.
Section 7: Data science career paths
As with most professions, there are different career paths you can embark upon. In this chapter, we will discuss several job positions related to the fields of data and data science.
Section 8: Dispelling common misconceptions
Finally, we will conclude our Intro to Data and Data Science training with a few lessons dispelling the most common misconceptions about the data science field.
Statistics is the driving force in any quantitative career. It is the fundamental skill BI analysts need to be able to understand and design statistical tests and analyses performed by modern software packages and programming languages. We will start from the very basics and will gradually build up your skills allowing you to understand more complex analyses carried out later.
Section 1: Introduction
In this introductory part of the course, we will discuss why you need to learn Statistics, and which key skills you will acquire by taking the course.
Section 2: Fundamentals of descriptive statistics
In this section, you will come to understand the basic features of data. There are different types of data and levels of measurement. After you complete this section, you will be able to distinguish between them and will know the difference between categorical and numerical values. All of this will help you when calculating the measures of central tendency (mean, median, and mode), dispersion indicators such as variance and standard deviation, and measures of relationship between variables like covariance and correlation. To reinforce what you have learned, we will wrap up this section with an easy-to-understand practical example.
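To give a feel for these measures, here is a minimal Python sketch using only the standard library. The advertising and sales figures are made-up illustration data, and the covariance helper is our own small function rather than anything taken from the course materials.

```python
import statistics

# Hypothetical data: monthly advertising spend and sales (illustration only).
ad_spend = [10, 12, 15, 19, 24]
sales = [40, 44, 52, 60, 74]

# Measures of central tendency
mean_sales = statistics.mean(sales)
median_sales = statistics.median(sales)
mode_rating = statistics.mode([3, 4, 4, 5])  # most frequent value in a list

# Measures of dispersion (sample versions, dividing by n - 1)
var_sales = statistics.variance(sales)
std_sales = statistics.stdev(sales)

def covariance(x, y):
    """Sample covariance between two equally long lists of numbers."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)

# Measures of relationship between two variables
cov = covariance(ad_spend, sales)
corr = cov / (statistics.stdev(ad_spend) * std_sales)

print(mean_sales, median_sales, mode_rating, var_sales, std_sales, cov, corr)
```

Correlation is simply covariance rescaled by the two standard deviations, which is why it always lies between -1 and 1.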
Section 3: Fundamentals of inferential statistics
One of the core topics you will find in every Statistics textbook is distributions. In this part of the course, you will learn what a distribution is and what characterizes the normal distribution. We will introduce you to the central limit theorem and to the concept of standard error. Pretty soon you will be able to calculate confidence intervals with known population variance. And once we introduce the Student’s t-distribution, you will learn how to work with smaller samples, as well as differences between two means (with dependent and independent samples). All of these tools will be fundamental later on when we start applying each of these concepts to large datasets and use coding languages like Python and R. To reinforce what you have learned, we will once again wrap up this section with an easy-to-understand practical example.
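As a rough illustration of a confidence interval with known population variance, consider the following Python sketch. The sample values and the "known" standard deviation of 5 are assumptions chosen for the example, not figures from the course.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical sample; sigma is ASSUMED to be the known population
# standard deviation, which is what allows us to use the z-statistic.
sample = [52, 48, 55, 50, 47, 53, 51, 49, 54]
sigma = 5.0
confidence = 0.95

# Critical value of the standard normal distribution (about 1.96 for 95%)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of the mean, then the margin of error
standard_error = sigma / sqrt(len(sample))
margin = z * standard_error

sample_mean = mean(sample)
ci = (sample_mean - margin, sample_mean + margin)
print(ci)
```

With a small sample and unknown variance, the same structure applies, but the critical value would come from the t-distribution instead of the normal distribution.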
Section 4: Hypothesis testing
Confirming and rejecting hypotheses with a reasonable degree of certainty is a practical and easy-to-apply method when dealing with uncertainty. In this section, you will learn how to perform hypothesis testing and what the difference is between a null and an alternative hypothesis. You will also learn about the rejection region, the significance level, and type I and type II errors. The lessons will teach you how to test for the mean when the population variance is known and unknown, as well as how to test for the mean when you are dealing with dependent and independent samples. This is also the part of the course in which you will become familiar with the p-value, a key measure when dealing with advanced models. Similar to previous sections, we will conclude with a practical example, to make use of our new knowledge.
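The following Python sketch shows one possible way to test for the mean when the population variance is known (a two-sided z-test). The function name, the sample, and the hypothesized mean are all invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_two_sided(sample, mu0, sigma):
    """Return the z-statistic and two-sided p-value for H0: mean == mu0.

    sigma is the population standard deviation, ASSUMED to be known.
    """
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: is the population mean different from 48?
sample = [52, 48, 55, 50, 47, 53, 51, 49, 54]
z, p_value = z_test_two_sided(sample, mu0=48, sigma=5.0)

alpha = 0.05                   # significance level
reject_null = p_value < alpha  # reject H0 only if the p-value is below alpha
print(z, p_value, reject_null)
```

Rejecting a true null hypothesis is a type I error, and its probability is the significance level alpha; failing to reject a false one is a type II error.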
Section 5: Regression analysis
Regression analysis is arguably the most common method of prediction. It is used to quantify the relationship between a dependent variable and one or more independent variables. We will start by introducing the linear regression model and will explain the difference between correlation and regression. Then we will be ready to introduce concepts like regression tables, R-squared, adjusted R-squared, ordinary least squares, and more. When we use multiple independent variables, we use a multiple linear regression. Once you have learned how to carry out simple and multiple linear regressions, we will introduce the five regression assumptions (linearity, no endogeneity, normality and homoscedasticity, no autocorrelation, and no multicollinearity). Naturally, we will put our newly acquired regression knowledge to use in a practical example which concludes the course.
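To make ordinary least squares and R-squared more concrete, here is a small Python sketch of a simple (one-variable) linear regression written from the textbook formulas. The function name and the data are hypothetical; in practice you would normally rely on a statistical package rather than code this by hand.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x

    # R-squared: share of the total variability explained by the model
    predictions = [intercept + slope * a for a in x]
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predictions))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared

# Hypothetical data: years of experience vs. salary in thousands
experience = [1, 2, 3, 4, 5]
salary = [31, 35, 38, 44, 47]
slope, intercept, r_squared = fit_simple_ols(experience, salary)
print(slope, intercept, r_squared)
```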
Microsoft Excel is the #1 productivity software in the world. A huge amount of data comes in a spreadsheet format, so an analyst needs Excel in their arsenal. This course will teach you all the Excel skills you need to perform multi-layered calculations, create charts, manipulate data, use lookup functions, and more!
Section 1: Course Introduction
In this introductory part of the course, we will discuss why you need to learn Excel, and which key skills you will acquire by taking the course.
Section 2: A quick introduction to the basics of Excel
This section is fundamental for those of you who have never used Excel. We will start from the very basics: introducing the Excel ribbon, learning how to insert (and delete) rows and columns, how to perform data entry tasks, and how to format worksheets professionally. In addition, you will create your first formulas and functions, and cut, copy, and paste values for the first time.
Section 3: Excel useful tips & tools
Once you are familiar with the basic operations in Excel, it will be time to learn Excel best practices and learn how to navigate spreadsheets professionally. In no time you will know how to apply fast scrolling, use keyboard shortcuts, format sheets professionally, fix cell references, use named ranges, apply custom cell formats, and much more.
Section 4: Excel functions
Excel is one of the most popular productivity tools the business world has ever seen. The main reason for this is Excel functions. It is time for you to learn how to use Excel functions like a true professional. We will start with some easier examples (SUM, COUNT, AVERAGE, IF, MAX, MIN, VLOOKUP, HLOOKUP), and gradually introduce more advanced (and more powerful) functions such as SUMIF, SUMIFS, COUNTIF, COUNTIFS, INDEX, MATCH, INDEX & MATCH, etc.
Section 5: Excel charts
One of the strongest features of Microsoft Excel, besides multi-layered calculations, is that it allows you to visualize data. Here you will learn how to insert and format different types of charts that will help you make sense of numbers and figure out their trend.
Section 6: Practical exercise – Build a P&L from scratch
It is one thing to learn how to work with Excel’s most important tools, but it is even better to apply these techniques in a practical exercise. This is what we will do here. The “Build a P&L from scratch” exercise allows you to see how everything you have learned so far can be put into practice.
SQL is one of the fundamental programming languages you need to learn to work with databases. When you are a business intelligence analyst in a company and you need data to perform your analysis, you usually have two options: extract it on your own or contact the IT team. Of course, the first one is an extremely valuable skill to have. In this course, we will teach you everything you need to know in terms of database management and creating SQL queries.
Section 1: Introduction to databases, SQL, and MySQL
Whether you are working in business intelligence (BI), data science, database administration, or back-end development, you will have to retrieve information from a server storing large amounts of data. To achieve this, you need SQL. The relational database management system we chose for this course is MySQL. We did that because MySQL is open-source, reliable, and mature. In one of the videos of this section, we will provide you with step-by-step guidance when you install MySQL Server and MySQL Workbench. The introductory part of this course pays significant attention to database theory. You will learn the meaning of terms like database, data table, data entity, record, field, relation, and more.
Section 2: First steps in SQL
It is time to create your first database and take your first steps in SQL. In this section, we will introduce you to string, fixed- and floating-point, and other useful data types. You will learn how to create a database table and how to use it. Not only that, but we will also introduce the different types of constraints that can be assigned to tables (primary key, foreign key, unique key, default, not null, and other types of constraints).
Section 3: SQL best practices
There are many ways you can write your SQL code, but there are only a few that are considered professional. In this part of the course, we will teach you how to write professional code and how to adhere to professional best practices. To reinforce what you have learned, we will wrap up this section with an easy to understand practical example.
Section 4: Loading the ‘employees’ database
One of the best features of our SQL training is that it uses a real-life database – the “Employees” database. We will use it to manipulate data in MySQL in all lessons. In this chapter, you will download the SQL file and will run it in Workbench.
Section 5: Data manipulation in SQL: SELECT, INSERT, UPDATE, DELETE
Are you ready to learn some of the most frequently used tools in SQL? These are the SELECT, INSERT, UPDATE, and DELETE statements. We use these statements to extract, insert, update, and delete data from a database.
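As a quick preview, the sketch below runs all four statements from Python. Note the assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of MySQL, purely so that it runs without a server, and the table and its rows are invented rather than taken from the "Employees" database.

```python
import sqlite3

# In-memory SQLite database (a stand-in for MySQL in this sketch)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    " emp_no INTEGER PRIMARY KEY,"
    " first_name TEXT NOT NULL,"
    " salary INTEGER)"
)

# INSERT: add new records
conn.execute("INSERT INTO employees (emp_no, first_name, salary) VALUES (1, 'Ana', 50000)")
conn.execute("INSERT INTO employees (emp_no, first_name, salary) VALUES (2, 'Ben', 42000)")

# UPDATE: change existing records that match the WHERE condition
conn.execute("UPDATE employees SET salary = 45000 WHERE emp_no = 2")

# DELETE: remove records that match the WHERE condition
conn.execute("DELETE FROM employees WHERE emp_no = 1")

# SELECT: extract data
rows = conn.execute("SELECT emp_no, first_name, salary FROM employees").fetchall()
conn.close()
print(rows)
```

Leaving out the WHERE clause in an UPDATE or DELETE statement would affect every row in the table, so it is worth double-checking before you run one.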
Section 6: MySQL Aggregate functions
Aggregate functions come in handy when we want to perform some arithmetic operations with the data in our database. The most commonly used aggregate functions in SQL are COUNT(), SUM(), MIN(), MAX(), and AVG().
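Here is a minimal sketch of the five aggregate functions in action. It assumes Python's built-in sqlite3 module and an in-memory SQLite database in place of MySQL (these functions behave the same way for this simple case), and the salaries table is made-up illustration data.

```python
import sqlite3

# In-memory SQLite database (a stand-in for MySQL in this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (emp_no INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries (emp_no, salary) VALUES (?, ?)",
    [(1, 40000), (2, 50000), (3, 60000)],
)

# All five aggregate functions in a single query; each collapses
# the salary column into one value.
row_count, total, lowest, highest, average = conn.execute(
    "SELECT COUNT(*), SUM(salary), MIN(salary), MAX(salary), AVG(salary) "
    "FROM salaries"
).fetchone()
conn.close()
print(row_count, total, lowest, highest, average)
```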
Section 7: SQL Joins, subqueries, self joins, and views
Joins are one of the most powerful and frequently used tools in SQL. This is a tool you will need when combining the information from two or more tables. After completing this section, you will be able to use inner, left, right, and self joins. You will also learn how to write subqueries and views. The section includes a number of useful tips and tricks and aims to take your SQL skills to the next level.
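To illustrate the difference between an inner and a left join, the sketch below uses two tiny invented tables. As an assumption for runnability, it relies on Python's built-in sqlite3 module and an in-memory SQLite database rather than MySQL.

```python
import sqlite3

# In-memory SQLite database (a stand-in for MySQL in this sketch)
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (dept_no TEXT PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, first_name TEXT, dept_no TEXT);
    INSERT INTO departments VALUES ('d001', 'Marketing'), ('d002', 'Finance');
    INSERT INTO employees VALUES (1, 'Ana', 'd001'), (2, 'Ben', 'd002'), (3, 'Cleo', NULL);
    """
)

# INNER JOIN: only employees with a matching department
inner_rows = conn.execute(
    "SELECT e.first_name, d.dept_name "
    "FROM employees e INNER JOIN departments d ON e.dept_no = d.dept_no "
    "ORDER BY e.emp_no"
).fetchall()

# LEFT JOIN: every employee, with NULL (None in Python) where no department matches
left_rows = conn.execute(
    "SELECT e.first_name, d.dept_name "
    "FROM employees e LEFT JOIN departments d ON e.dept_no = d.dept_no "
    "ORDER BY e.emp_no"
).fetchall()
conn.close()
print(inner_rows)
print(left_rows)
```

Cleo has no department, so she disappears from the inner join but is kept by the left join.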
Section 8: Stored routines
A stored routine is a set of SQL statements that has been pre-written and stored on a server, allowing users to re-run it at a later stage. You will learn how to create your own stored procedures and functions.
Section 9: Advanced SQL Topics
In the last part of the training, you will learn advanced SQL topics like local variables, session variables, global variables, MySQL triggers, and MySQL indexes.
Python is one of the most used programming languages among data analysts. This course will show you the technical advantages it has over other programming languages and its modules for scientific computing which make it a preferred choice in the fields of finance, econometrics, economics, data science, and machine learning.
Section 1: Course Introduction
In this introductory part of the course, we will discuss what the course covers, why you need to learn Python, and what the best way to approach this training is.
Section 2: Introduction to programming with Python
An introductory section in which we will introduce you to the concept of programming and will talk about some of Python’s key features (it is an open-source, general-purpose, high-level language). We will show you how to install the Jupyter Notebook (the environment we will use to code in Python) and will introduce you to its interface and dashboard.
Section 3: Python variables and data types
This is where you will start coding and learn one of the most fundamental concepts in programming – working with variables.
Section 4: Basic Python syntax
If you want to master Python programming, there is no way around learning basic Python syntax and operators first. In this section, we will cover the double equality sign, reassigning of values, adding comments, line continuation, indexing elements, arithmetic operators, comparison operators, logical operators, and identity operators.
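A few lines of Python are enough to preview most of these syntax elements. The variable names and values below are arbitrary and serve only as an illustration.

```python
# A single equals sign assigns a value; assigning again reassigns it
a = 5
a = a + 2

# The double equality sign compares two values and returns True or False
is_seven = (a == 7)

# A backslash at the end of a line continues the statement on the next line
total = 1 + 2 + \
    3

# Indexing starts at 0, so this picks the first character
first_letter = "Python"[0]

# Arithmetic operators
remainder = 17 % 5   # modulo: what is left after division
power = 2 ** 10      # exponentiation

# Comparison and logical operators
in_range = (a > 3) and (a < 10)
out_of_range = not in_range

# Identity operator: checks whether two names refer to the same object
nothing = None
is_missing = nothing is None

print(a, is_seven, total, first_letter, remainder, power, in_range, is_missing)
```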
Section 5: Conditional statements
Conditional statements are the bread and butter of programming. Here you will start creating your own IF, ELSE, and ELIF statements.
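For instance, a conditional statement with all three branches might look like the sketch below. The function name is made up for this example.

```python
def describe_number(x):
    """Return a short label depending on the sign of x."""
    if x > 0:
        return "positive"
    elif x < 0:       # checked only when the first condition is False
        return "negative"
    else:             # runs when none of the conditions above is True
        return "zero"

print(describe_number(12), describe_number(-3), describe_number(0))
```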
Section 6: Python functions
Python functions are another invaluable tool for programmers. They allow you to carry out pre-defined or specifically designed operations that manipulate the data you are working with and bring it one step closer to representing a meaningful output.
Section 7: Python sequences
Sequences are one of the main building blocks of computer programming. A sequence helps you store and organize different values you are working with. We will teach you how to work with lists, list slicing, tuples, and dictionaries.
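The short sketch below previews the data structures named above. All names and values are arbitrary illustration data.

```python
# List: an ordered collection that can be changed after creation
scores = [70, 85, 90, 65]
scores.append(100)        # add an element at the end

# List slicing: take part of a list (the end index is excluded)
first_two = scores[:2]
last_score = scores[-1]   # negative indexes count from the end

# Tuple: an ordered collection that cannot be changed after creation
point = (3, 4)
x, y = point              # tuple unpacking

# Dictionary: values stored under keys instead of positions
ages = {"Ana": 31, "Ben": 27}
ages["Cleo"] = 45         # add a new key-value pair

print(scores, first_two, last_score, x, y, ages)
```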
Section 8: Using iterations in Python
Iteration is a programming technique that allows you to execute certain code repeatedly. It is one of the instruments that let you automate repeated tasks and benefit from one of programming’s main strengths.
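As a small illustration, here are a for loop, a while loop, and a list comprehension. The sales figures and the 10% growth rate are invented for the example.

```python
# for loop: repeat the same step for every element of a list
monthly_sales = [120, 95, 130, 110]
total = 0
for amount in monthly_sales:
    total += amount

# while loop: repeat as long as a condition holds.
# Here we count the years needed to double a balance at 10% growth per year.
balance = 100.0
years = 0
while balance < 200.0:
    balance *= 1.10
    years += 1

# List comprehension: a compact loop that builds a new list
average = total / len(monthly_sales)
above_average = [amount for amount in monthly_sales if amount > average]

print(total, years, above_average)
```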
Section 9: Advanced Python tools
In this part of our training, you will learn about object-oriented programming, different modules and packages, the standard library, how to import modules in Python, and how to work with arrays and organize data in Python. All of these lessons will significantly enhance the Python knowledge you have acquired up to this point. Once you complete this section you’ll be ready to move ahead with our program and see how Python is used in combination with SQL and Tableau.
R is a programming language that has been specifically designed for statistics and graphics. Programming in R is a fast and effective way to perform advanced data analyses. This course will teach you how to use R and apply the statistical functions you will need as a data analyst.
Section 1: Introduction to R and how to get started
In this introductory part of the course, we will go for a walk in the R environment. First, we are going to install R and RStudio together. Then, we’ll dive straight into RStudio and learn about its interface, and how to make use of the main windows and tabs there. We will also talk about setting your working directory and getting additional help.
Section 2: The building blocks of R
In this section we will learn about:
- Objects and coercion rules in R
- Functions in R
- How to use R’s console
Not only that, but by the end of the section you will have built your very first function; it will be able to draw cards from a deck, so you can play your favourite card game even if you don’t have the physical cards in front of you.
Section 3: Vectors and vector operations
Now that we have covered the basics, in this section we are about to drill deeper into R’s most widely used object type – the vector. You will learn how to create vectors and how to perform vector arithmetic operations. You will also see how to index and access elements from a vector, and how vectors recycle. Then, you will see how to change the dimensions of a vector and create a two-dimensional object from it. That will be our nice little segue into matrices.
Section 4: Matrices
It is time to talk about matrices. You will learn how to create and rename matrices, and how to index and slice matrices. All of this will lay a solid foundation for the big star of data analysis: the data frame. Not only that, but we will also talk about factors, which are related to the statistics part of the course. Finally, we will cover lists: R’s way of storing hierarchical data.
Section 5: Fundamentals of programming with R
In this section of the course, we will go through some of the fundamental tools you need to learn when programming with R (and many other programming languages). We will cover relational operators, logical operators, vectors, IF, ELSE, and different types of loops (for, while, and repeat) in R. Some of these topics will have already been introduced to you in our Python training, but here you will have the chance to reinforce what you have learned and see things with R in mind.
Section 6: Data frames
In this section, we will focus our attention on how to create and import data frames into R. You will see how to quickly get a sense of your data frame by using the str() function, summary(), column and row names, and so on. We’ll learn about accessing individual elements of your data frame for further use, and about extending a data frame with either new observations or new variables (that is, rows or columns). Furthermore, we will talk about dealing with missing data because in real life that happens more often than we’d like. And we’ll discuss exporting data frames once we’re happy with their general state and ready to share them with the world.
Section 7: Manipulating data
At this point in our training it is time to learn about some heavy-duty data manipulation techniques that will, without a doubt, become indispensable companions in your daily work with data. We will be talking about data transformation with the popular dplyr package. More specifically, you will learn how to filter(), arrange(), mutate(), and transmute() your data, as well as how to sample() fractions and fixed numbers of elements from it. You will also learn what tidy data is, why it is extremely important for the efficiency of your work to tidy your data sets in the most meaningful way, and how to achieve this by using the tidyr package. You will be tidying several messy real-life data sets by using the gather(), spread(), separate(), and unite() functions. Finally, the big surprise for this section… you will learn how to combine multiple operations in an intuitive way by using the pipe operator.
Section 8: Visualizing data
Plotting and graphing data is the most elegant way to understand your data and present your findings to others. In this section we are going to learn about the grammar of graphics and the seven layers that comprise a visualization. Then, we will jump straight into creating graphs and plots, with the ggplot2 package. Starting with the histogram, we will continue on to the bar chart, then onto the box and whiskers plot, and finally, the scatterplot. You will notice that with each new type of plot you will also be learning about a new layer or two, getting familiarized with ggplot2 and its inner workings in an incremental way.
Section 9: Exploratory data analysis
In this part of the course, we start applying R for statistical analysis. We are ready to discuss several exploratory data analysis topics:
- Population vs. sample
- Mean, median, and mode
- Variance, standard deviation, and the coefficient of variation
- Covariance and correlation
Section 10: Hypothesis testing in R
At this point, you are already familiar with hypothesis testing. We covered it in one of our earlier modules – Statistics. What we will do here is a natural continuation – you will learn how to carry out hypothesis testing in R.
Section 11: Regression analysis in R
Regression analysis is another topic we covered earlier in our program. As with hypothesis testing, this is a great opportunity to apply the theory you have learned previously in R.
Advanced Statistical Methods in Python
Advanced Statistical Methods builds upon the statistical knowledge you will already have gained by focusing on predictive modelling and entering multidimensional spaces which require an understanding of mathematical methods, transformations, and distributions. The course introduces these concepts as well as complex means of analysis such as clustering, factoring, Bayesian inference, and decision theory while also allowing you to exercise your Python programming skills.
Section 1: Introduction
In this introductory part of the course, we will discuss what the course covers, why you need to learn advanced statistics, how it differs from machine learning, and how to get the most out of this training.
Section 2: Regression analysis
Regression analysis is a topic you are already familiar with. However, here we will extend what you learned in our Statistics training with some additional concepts and will apply all the theory in Python. This section will serve two purposes: 1) a useful refresher of regression and 2) a great way to reinforce what you have learned by applying it in practice while coding.
Section 3: Logistic regression
Data scientists use logistic regressions when the dependent variable is binary (0 and 1, true and false, etc.). This type of data is encountered on a daily basis when working as a data scientist, and here we will get you prepared. You will learn how to build a logistic regression, how to read its summary tables, how to interpret its coefficients, and how to calculate the accuracy of the model. We will also introduce underfitting and overfitting and will teach you how to test your models.
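To show how a fitted logistic regression turns inputs into predictions and how accuracy is computed, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the intercept and coefficient are invented rather than estimated from data, and the study-hours data set is hypothetical.

```python
from math import exp

def sigmoid(z):
    """Logistic function: maps any real number to a probability between 0 and 1."""
    return 1 / (1 + exp(-z))

def predict_probability(x, intercept, coefficient):
    """Predicted probability that the outcome is 1 for a single input x."""
    return sigmoid(intercept + coefficient * x)

# HYPOTHETICAL coefficients (not estimated from the data below)
intercept, coefficient = -4.0, 1.0

# Hypothetical data: hours studied and whether the exam was passed (1) or not (0)
hours = [1, 2, 3, 5, 6, 7]
passed = [0, 0, 1, 1, 1, 1]

# Classify as 1 when the predicted probability is at least 0.5
predicted = [
    1 if predict_probability(h, intercept, coefficient) >= 0.5 else 0
    for h in hours
]

# Accuracy: the share of observations that were classified correctly
accuracy = sum(p == actual for p, actual in zip(predicted, passed)) / len(passed)

# Interpreting the coefficient: each extra hour multiplies the odds of passing by exp(coefficient)
odds_ratio = exp(coefficient)
print(predicted, accuracy, odds_ratio)
```

In a real analysis, accuracy should be measured on data the model has not seen during fitting; otherwise an overfitted model can look better than it is.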
Section 4: Cluster analysis
In this chapter, we will introduce another essential technique you will definitely need in your data science arsenal: cluster analysis. It consists of dividing your data into separate groups based on an algorithm. Clustering is a technique often employed in data science because it frequently makes much more sense to study patterns observed in a particular group than to look for patterns in the entire dataset. We will provide several practical examples that will help you understand how to carry out cluster analysis and the difference between classification and clustering.
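As one possible illustration of dividing data into groups with an algorithm, here is a bare-bones one-dimensional k-means sketch in Python. K-means is our choice for this example (the text above does not name a specific algorithm), and the function name, data, and starting centroids are all made up.

```python
def k_means_1d(values, centroids, iterations=10):
    """Very small k-means for one-dimensional data.

    Repeats two steps: assign each value to its nearest centroid,
    then move each centroid to the mean of its cluster.
    """
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data with two visibly separate groups
values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids, clusters)
```

Note that no labels are supplied: the groups emerge from the data alone, which is what separates clustering from classification.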
Section 5: Factor analysis
There is a difference between variables and factors that have an impact on a dependent variable. In this part of our training, we will teach you how to isolate a few factors from a set of variables and use them to explain the dependent variable. We will learn how to reduce the dimensionality of problems in order to apply the methods we learned before. We will go through different techniques used for factor analysis and show where it fits in machine learning.