Credit Risk Modeling in Python
Credit risk modeling is where data science and fintech meet. It is one of the most important activities in a bank, and the one that has received the most attention since the recession. This course is the only comprehensive credit risk modeling course in Python available right now. It covers the complete credit risk modeling picture: preprocessing, then probability of default (PD), loss given default (LGD) and exposure at default (EAD) modeling, and finally the calculation of expected loss (EL).
We start by explaining why credit risk is important for financial institutions. We also define the fundamental terms: expected loss, probability of default, loss given default and exposure at default.
Our example focuses on consumer loans. Since there are more than 100 potential features, we've devoted a complete section to explaining why some features are chosen over others.
Every raw dataset has its drawbacks. While most preprocessing is model-specific, some steps (such as missing value imputation) can be generalized across models.
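As a small illustration of a general preprocessing step, here is a minimal sketch of mean imputation in plain Python. The column name and the values are hypothetical, and the mean is only one possible fill value; the right choice depends on the variable.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

# Hypothetical annual income column with one missing value
annual_income = [50000.0, None, 70000.0]
imputed_income = impute_mean(annual_income)
print(imputed_income)
```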
PD model: data preparation
Once we have completed all general preprocessing, we dive into model-specific preprocessing. We employ fine classing, coarse classing, weight of evidence and information value criterion to achieve the probability of default preprocessing. Conventionally, we should turn all variables into dummy indicators prior to modeling.
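To make weight of evidence (WoE) and information value (IV) concrete, here is a minimal sketch in plain Python. It assumes the usual definitions, WoE = ln(% of good borrowers / % of bad borrowers) per category and IV = sum of (% good - % bad) x WoE, with a target coded as 1 for good and 0 for bad. The function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def woe_iv(categories, targets):
    """Return ({category: WoE}, IV) for one discrete variable.

    targets: 1 = good borrower (no default), 0 = bad borrower (default).
    Assumes every category contains at least one good and one bad borrower,
    otherwise the logarithm is undefined.
    """
    good = defaultdict(int)
    bad = defaultdict(int)
    for cat, y in zip(categories, targets):
        if y == 1:
            good[cat] += 1
        else:
            bad[cat] += 1
    total_good = sum(good.values())
    total_bad = sum(bad.values())

    woe = {}
    iv = 0.0
    for cat in set(good) | set(bad):
        pct_good = good[cat] / total_good  # share of all goods in this category
        pct_bad = bad[cat] / total_bad     # share of all bads in this category
        woe[cat] = math.log(pct_good / pct_bad)
        iv += (pct_good - pct_bad) * woe[cat]
    return woe, iv

# Hypothetical data: grade A has 8 good / 2 bad, grade B has 4 good / 6 bad
grades = ["A"] * 10 + ["B"] * 10
is_good = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6
woe, iv = woe_iv(grades, is_good)
print(woe, iv)
```

Categories with similar WoE are natural candidates for merging during coarse classing, and a variable with a very low IV adds little predictive power.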
PD model estimation
Having set up all variables to be dummies, we estimate the probability of default. The most intuitive and widely accepted approach is to employ a logistic regression.
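The idea can be sketched without any libraries. The example below fits a logistic regression with one dummy variable by plain gradient descent; in practice a library implementation such as scikit-learn's `LogisticRegression` would normally be used instead. The data, the dummy name `grade_B`, and the learning-rate settings are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit intercept and coefficients by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))
            err = p - target
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, row):
    """Predicted probability of being a good borrower (target = 1)."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))

# Hypothetical data: one dummy, grade_B (grade A is the reference category).
# Target: 1 = good borrower, 0 = default.
X = [[0]] * 10 + [[1]] * 10
y = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6

w = fit_logistic(X, y)
p_good_A = predict_proba(w, [0])
p_good_B = predict_proba(w, [1])
print(w, p_good_A, p_good_B)
```

Because the model has one parameter per group here, the fitted probabilities should approach the observed good rates of each grade, and the probability of default is one minus the probability of being good.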
PD model validation (test)
Since any model can overfit its training data, it is crucial to test the results on out-of-sample observations. To that end, we compute the model's accuracy, its area under the ROC curve (AUC), the Gini coefficient and the Kolmogorov-Smirnov statistic.
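The four validation measures can be sketched in a few lines of plain Python. This is a simplified illustration: it assumes a target coded as 1 for good and 0 for bad, scores that are predicted probabilities of being good, and a 0.5 cut-off for accuracy. The test-set values are hypothetical.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Share of observations classified correctly at the given cut-off."""
    hits = sum(1 for y, s in zip(y_true, scores) if (s >= threshold) == (y == 1))
    return hits / len(y_true)

def auc_score(y_true, scores):
    """Probability that a random good scores above a random bad (ties count half)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions of bads and goods."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = 0.0
    for t in sorted(set(scores)):
        cum_pos = sum(1 for s, y in zip(scores, y_true) if y == 1 and s <= t) / n_pos
        cum_neg = sum(1 for s, y in zip(scores, y_true) if y == 0 and s <= t) / n_neg
        best = max(best, abs(cum_neg - cum_pos))
    return best

# Hypothetical out-of-sample observations
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

acc = accuracy(y_true, scores)
auc = auc_score(y_true, scores)
gini = 2 * auc - 1  # Gini coefficient derived from AUC
ks = ks_statistic(y_true, scores)
print(acc, auc, gini, ks)
```

The pairwise AUC loop is quadratic in the number of observations, so it is only suitable for small illustrations like this one.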
Applying the PD model for decision making
In practice, banks don't really want a complicated Python-implemented model. Instead, they prefer a simple scorecard made up of yes-or-no questions that any bank employee can apply. In this section, we learn how to create one.
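One way to turn regression coefficients into a scorecard is a linear rescaling, sketched below. Each dummy variable corresponds to one yes-or-no question, and its coefficient becomes a number of points. The 300 to 850 score range, the coefficients, and all names are assumptions chosen for illustration, not values from a fitted model.

```python
def build_scorecard(intercept, coefs, min_score=300, max_score=850):
    """Rescale coefficients so total scores span [min_score, max_score].

    coefs: {variable: {category: coefficient}}; each variable's reference
    category is included with a coefficient of 0.
    Returns (base_points, {variable: {category: points}}).
    """
    min_sum = intercept + sum(min(c.values()) for c in coefs.values())
    max_sum = intercept + sum(max(c.values()) for c in coefs.values())
    factor = (max_score - min_score) / (max_sum - min_sum)
    card = {
        var: {cat: round(b * factor) for cat, b in cats.items()}
        for var, cats in coefs.items()
    }
    base = round((intercept - min_sum) * factor + min_score)
    return base, card

# Hypothetical PD model coefficients (higher = more creditworthy)
intercept = -1.0
coefs = {
    "grade": {"A": 1.4, "B": 0.8, "C": 0.0},
    "home_ownership": {"own": 0.6, "rent": 0.0},
}
base, card = build_scorecard(intercept, coefs)

def score(applicant):
    """Add the base points to the points of each category the applicant falls in."""
    return base + sum(card[var][cat] for var, cat in applicant.items())

best = score({"grade": "A", "home_ownership": "own"})
middle = score({"grade": "B", "home_ownership": "rent"})
worst = score({"grade": "C", "home_ownership": "rent"})
print(base, card, best, middle, worst)
```

Rounding each entry to whole points can shift the extreme totals by a point or two, which is usually acceptable for a scorecard that is applied by hand.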
PD model monitoring
Model estimation is extremely important, but an often-neglected step is model maintenance. A common approach is to monitor the population stability over time using the population stability index (PSI) and revisit our model if needed.
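The population stability index compares how observations are distributed across the same bins in two samples, for example the original training data and newly collected data. A minimal sketch, assuming the usual formula PSI = sum of (actual % - expected %) x ln(actual % / expected %); the bin proportions below are hypothetical.

```python
import math

def psi(expected_props, actual_props):
    """Population stability index over matching bins.

    Both inputs are lists of proportions that sum to 1.
    Assumes no bin has a proportion of exactly 0.
    """
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_props, actual_props)
    )

# Hypothetical share of borrowers per score band
train_props = [0.2, 0.3, 0.5]  # when the model was built
new_props = [0.1, 0.3, 0.6]    # in the most recent data

psi_same = psi(train_props, train_props)
psi_shifted = psi(train_props, new_props)
print(psi_same, psi_shifted)
```

Commonly quoted rules of thumb treat a PSI below 0.1 as a stable population and a PSI above 0.25 as a shift large enough to revisit the model, but these cut-offs are conventions rather than strict tests.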
LGD and EAD models
To calculate the final expected loss, we need three ingredients: probability of default (PD), loss given default (LGD) and exposure at default (EAD). In this section, we preprocess our data to be able to estimate the LGD and EAD models.
LGD models are often estimated using a beta regression. To keep the modeling part simpler, we employ a two-step regression model, which aims to simulate a beta regression. We combine the predictions from a logistic regression with those from a linear regression to estimate the loss given default.
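The combination step can be sketched as follows. This is one possible way to combine the two stages, assuming that the logistic regression predicts whether anything is recovered (0 or 1), the linear regression predicts the recovery rate when something is recovered, and LGD equals one minus the recovery rate. The function name and inputs are hypothetical.

```python
def combine_lgd(recovery_flag, recovery_rate_if_positive):
    """Combine the two stages into a loss given default.

    recovery_flag: stage 1 prediction, 1 if any recovery is expected, else 0.
    recovery_rate_if_positive: stage 2 prediction of the recovery rate.
    """
    rate = recovery_flag * recovery_rate_if_positive
    # A linear regression can predict outside [0, 1], so clip the rate
    rate = min(max(rate, 0.0), 1.0)
    return 1.0 - rate

print(combine_lgd(0, 0.35))  # no recovery expected
print(combine_lgd(1, 0.35))  # partial recovery expected
print(combine_lgd(1, 1.20))  # out-of-range prediction gets clipped
```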
The exposure at default (EAD) modeling is very similar to the LGD one. In this section, we take advantage of a linear regression to calculate EAD.
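As a rough sketch of the idea, the example below fits a one-variable linear regression by ordinary least squares and uses it to estimate EAD. It assumes that the regression predicts a credit conversion factor (the share of the funded amount still outstanding at default) and that EAD is the funded amount multiplied by that factor. The feature, the data, and this setup are assumptions made for illustration.

```python
def fit_simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        / sum((xi - mean_x) ** 2 for xi in x)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical defaulted loans: months since origination vs. conversion factor
months_on_book = [6, 12, 24, 36]
conversion_factor = [0.9, 0.8, 0.6, 0.4]
intercept, slope = fit_simple_ols(months_on_book, conversion_factor)

def predict_ead(funded_amount, months):
    """EAD = funded amount x predicted conversion factor, clipped to [0, 1]."""
    ccf = min(max(intercept + slope * months, 0.0), 1.0)
    return funded_amount * ccf

ead = predict_ead(10000.0, 18)
print(intercept, slope, ead)
```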
Calculating expected loss
After having calculated PD, LGD, and EAD, we reach the final step: computing expected loss (EL). This is also the number which is most interesting to C-level executives and is the finale of the credit risk modeling process.
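The final step multiplies the three ingredients for each loan, EL = PD x LGD x EAD, and sums the results over the portfolio. A minimal sketch with hypothetical loan-level predictions:

```python
def expected_loss(prob_default, lgd, ead):
    """Expected loss of a single loan: PD x LGD x EAD."""
    return prob_default * lgd * ead

# Hypothetical predictions for two loans
loans = [
    {"pd": 0.05, "lgd": 0.60, "ead": 10000.0},
    {"pd": 0.20, "lgd": 0.80, "ead": 5000.0},
]

loan_el = [expected_loss(l["pd"], l["lgd"], l["ead"]) for l in loans]
portfolio_el = sum(loan_el)
total_exposure = sum(l["ead"] for l in loans)
el_ratio = portfolio_el / total_exposure  # expected loss per unit of exposure
print(loan_el, portfolio_el, el_ratio)
```

Reporting the portfolio figure both as an amount and as a share of total exposure makes it easier to compare portfolios of different sizes.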
This course is part of Module 4 of the 365 Data Science Program. The complete training consists of four modules, each building upon your knowledge from the previous one. Module 4 is focused on developing a specialized, industry-relevant skill set, and students are encouraged to complete Modules 1, 2, and 3 before they start this part of the training. Here, you will learn how to perform Credit Risk Modeling for banks, Customer Analytics for retail or other commercial companies, and Time Series Analysis for finance and stock data.
Why Choose the 365 Data Science Program?
Real-life projects and data. Work through them on your own computer as you would in the office.
Our expert instructors are happy to help. Post a question and get a personal answer from one of our instructors.
Earn a verifiable certificate after each completed course. Celebrate your successes and share your progress with your professional network!
Trust the other 500,000 students
The course is in-depth and is delivered at a steady pace with eye-catching visuals. The instructors go through all the basics really well. They try not to over-simplify the material, you get a good sense of how deep Data Science is in the course. Great job!!!
This course is amazing! After watching the video carefully and doing all the exercises, I am even capable of having discussions with Machine learning major Master’s students! High standard course with reasonable pricing.
Very clear and in-depth explanation of data science and how all the inter-related concepts apply in real life business environment. Absolutely great for beginners! Best data science course I have come across so far!
I would highly recommend the course to any beginner who wants to venture into the world of Data Science. The concepts are very well explained and there is an emphasis on practical application which really helps create a better understanding of the concepts.