Time Series Analysis with Python

This course introduces you to the world of time series and shows how to use Python to analyze and model such data. We will also discuss volatility and how to forecast future values.








Course description

Data science mainly relies on two types of data: cross-sectional and time series. This course will help you master the latter by introducing you to ARMA, Seasonal, Integrated, MAX and Volatility models, and by showing you how to use them to forecast into the future.


In this short section, we’ll tell you a bit more about what the course covers, how it’s structured and what our goal is.


Setting Up the Environment

In this part of the course, we will explain how to set up Python 3 and launch Jupyter. We’ll also show you what the Anaconda Prompt is and how we use it to install the new modules we later import.


Introduction to Time Series in Python

In this section of the course, we are going to learn what makes a dataset a time series, and discuss what separates it from cross-sectional data. We’ll introduce the appropriate mathematical notation for such data before loading up a dataset and quickly examining it.


Creating a Time Series Object in Python

In this section of the course, we will go through the pre-processing aspects of working with time series. We’ll see how to interpret string text as dates and set these dates as indices of the data set. We’ll then set a fixed frequency and account for any missing values before splitting up the set for training and testing. In the appendix, we’ll show you how to import data directly from Yahoo Finance, so you can conduct your own analysis after completing the course.
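The pre-processing steps described above can be sketched roughly as follows. This is a minimal illustration, assuming pandas is installed; the column names ("date", "spx"), the day-first date format, the made-up values, the forward-fill strategy and the 80/20 split are all assumptions chosen for the example, not details taken from the course.

```python
import pandas as pd

# Hypothetical raw data with made-up values: dates are day-first strings,
# and Wednesday 12 January 1994 is missing.
raw = pd.DataFrame({
    "date": ["07/01/1994", "10/01/1994", "11/01/1994", "13/01/1994", "14/01/1994"],
    "spx": [100.0, 102.0, 101.0, 103.0, 104.0],
})

# Interpret the string text as dates and set the dates as the index.
raw["date"] = pd.to_datetime(raw["date"], format="%d/%m/%Y")
df = raw.set_index("date")

# Set a fixed business-day frequency; the missing day shows up as NaN.
df = df.asfreq("B")

# Account for missing values by carrying the previous observation forward
# (one common choice among several).
df["spx"] = df["spx"].ffill()

# Split chronologically, never at random: first 80% for training, rest for testing.
split = int(len(df) * 0.8)
train, test = df.iloc[:split], df.iloc[split:]

print(df)
print(len(train), "training rows,", len(test), "testing rows")
```

The split is done by position because shuffling would leak future information into the training set.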


Working with Time Series in Python

In this section of the course, we’ll examine and visualize some important types of time series, like white noise and a random walk. We’ll then discuss important concepts like stationarity, seasonality and autocorrelation, before exploring the ACF and PACF of S&P 500 prices.
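To make the difference between white noise and a random walk concrete, here is a small sketch that assumes only NumPy. The helper `sample_acf` is a hypothetical name written by hand for illustration; it is not a library function, and the seed and series length are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# White noise: independent draws with constant mean and variance.
white_noise = rng.normal(loc=0.0, scale=1.0, size=n)

# Random walk: each value is the previous value plus a white-noise shock.
random_walk = np.cumsum(white_noise)

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

acf_wn = sample_acf(white_noise, 5)
acf_rw = sample_acf(random_walk, 5)

print("white noise ACF:", np.round(acf_wn, 3))
print("random walk ACF:", np.round(acf_rw, 3))
```

The white-noise autocorrelations should hover near zero beyond lag 0, while those of the random walk should stay close to 1, which is one sign of non-stationarity.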


Picking the Correct Model

In this short section, we’ll discuss the general rules of manual model selection. We will talk about which models we prefer, what we want to avoid and how to decide between models. We’ll talk about the log-likelihood and information criteria as measures of preference among similar models.
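As a rough illustration of how information criteria trade fit against complexity, the sketch below computes the standard AIC and BIC from a log-likelihood. The two candidate fits and their numbers are hypothetical, invented only to show the comparison.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: penalizes parameters more as n_obs grows."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits on the same 500 observations (made-up numbers).
n_obs = 500
candidates = {
    "simpler model": {"llf": -1520.0, "k": 3},
    "more complex model": {"llf": -1519.5, "k": 5},
}

scores = {
    name: (aic(fit["llf"], fit["k"]), bic(fit["llf"], fit["k"], n_obs))
    for name, fit in candidates.items()
}

for name, (a, b) in scores.items():
    print(f"{name}: AIC={a:.2f}, BIC={b:.2f}")
```

In this made-up case the more complex model has a slightly higher log-likelihood, but both criteria are lower for the simpler one, so the extra parameters are not worth it.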


The ARMA Model

In this section, we’ll combine the two models we just examined – the AR and MA – into one: the ARMA. We’ll examine how they synergize and limit the drawbacks each model has on its own. We’ll then talk about the issues that come along with finding the best-fitting ARMA model and see how checking the model residuals can be beneficial in model selection.
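The following sketch shows one way to see why residual checks help, assuming only NumPy. It simulates an ARMA(1,1) series by hand and then recovers the residuals under a given set of coefficients. The function names, coefficients and seed are hypothetical choices for illustration, and no model is estimated here; a fitting library would do that in practice.

```python
import numpy as np

def simulate_arma11(n, phi, theta, c=0.0, seed=0):
    """Simulate x_t = c + phi*x_{t-1} + theta*eps_{t-1} + eps_t (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)
    x = np.zeros(n)
    x[0] = c + eps[0]
    for t in range(1, n):
        x[t] = c + phi * x[t - 1] + theta * eps[t - 1] + eps[t]
    return x, eps

def arma11_residuals(x, phi, theta, c=0.0):
    """Residuals implied by the given ARMA(1,1) coefficients."""
    resid = np.zeros(len(x))
    resid[0] = x[0] - c
    for t in range(1, len(x)):
        resid[t] = x[t] - c - phi * x[t - 1] - theta * resid[t - 1]
    return resid

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

x, eps = simulate_arma11(n=2000, phi=0.6, theta=0.3)
resid = arma11_residuals(x, phi=0.6, theta=0.3)

print("lag-1 autocorrelation of the series:   ", round(lag1_autocorr(x), 3))
print("lag-1 autocorrelation of the residuals:", round(lag1_autocorr(resid), 3))
```

When the coefficients match the process, the residuals reduce to the original shocks and look like white noise; leftover autocorrelation in the residuals would suggest the model is missing structure.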


The ARCH Model

In this section, we’ll talk about the idea of measuring volatility when we’re looking for stability in our investments. We’ll explain the multiple layers of ARCH models and how they differ from the ARMA family of models we just examined. We’ll spend some time discussing the extensive functionality of the “arch_model” function and why it’s important to know the default values for many of its arguments.
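To show what an ARCH model describes, here is a hand-written simulation of the simplest case, ARCH(1), where today's conditional variance depends on yesterday's squared shock. It assumes only NumPy and deliberately does not call the arch package; `simulate_arch1` and the parameter values are hypothetical.

```python
import numpy as np

def simulate_arch1(n, omega, alpha, seed=0):
    """Simulate ARCH(1): sigma2_t = omega + alpha * eps_{t-1}^2 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)             # i.i.d. standard normal innovations
    eps = np.zeros(n)                  # shocks
    sigma2 = np.zeros(n)               # conditional variances
    sigma2[0] = omega / (1 - alpha)    # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps, sigma2

eps, sigma2 = simulate_arch1(n=5000, omega=0.2, alpha=0.4)

print("sample variance of shocks:", round(float(eps.var()), 3))
print("unconditional variance omega / (1 - alpha):", round(0.2 / (1 - 0.4), 3))
```

In the arch package, a comparable specification would be requested explicitly with `vol="ARCH"` and `p=1`. To our knowledge, leaving those arguments out gives a constant mean with GARCH(1,1) volatility, so it is worth checking the defaults in the documentation for your installed version.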


The GARCH Model

In this section of the course, we’ll discuss the generalized version of the ARCH model, also known as the GARCH. We’ll explore why this model is more widely used, how it outperforms high-order ARCH models and why it looks so similar to the ARMA. We’ll then empirically test the widely held view that the GARCH(1,1) is the best model for measuring the volatility of price returns.
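The GARCH(1,1) variance recursion can be sketched in a few lines of plain Python. The function name, the parameter values and the toy returns below are hypothetical, and the parameters are supplied by hand rather than estimated from data.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances from
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}
    (hypothetical helper; assumes alpha + beta < 1)."""
    # Start the recursion at the unconditional variance.
    sigma2 = [omega / (1 - alpha - beta)]
    for t in range(1, len(returns)):
        sigma2.append(omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2

# Toy returns (made up): a calm day, a large shock, then another calm day.
toy_returns = [0.0, 2.0, 0.0]
variances = garch11_variance(toy_returns, omega=0.1, alpha=0.1, beta=0.8)

for t, v in enumerate(variances):
    print(f"t={t}: conditional variance = {v:.4f}")
```

The lagged variance term plays a role similar to the autoregressive part of an ARMA and the lagged squared shock resembles the moving-average part, which is one way to read the similarity between the two models. A single lagged variance also carries the memory of all past shocks, which a pure ARCH model could only approximate with many lags.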


Business Case

In this final part of the course, we will examine how a real-life event like the Dieselgate scandal can alter the trends in time series data.