Project-based Intermediate-Level Python Class: Case Study on Credit Risk Modeling

In this course, we will work with a realistic consumer loan portfolio dataset to model credit risk, emphasizing compliance with bank regulation, namely the Basel Accords. We will begin with a brief review of Python’s fundamental concepts. Using finance-related exercises, we will review variables, logic, looping, and function definitions. Then we will import the data and spend time manipulating and cleaning the DataFrames. We will preprocess both discrete and continuous variables so that they are ready for modeling.

Next, we will review the three components required to calculate expected loss on a single loan and on a bank’s loan portfolio: probability of default (PD), loss given default (LGD), and exposure at default (EAD). First, we will develop a model for probability of default. To determine which variables have the most predictive power, we will visualize each variable’s weight of evidence and calculate its information value. Using the Scikit-learn library, we will split the data into training and test datasets, create dummy variables, and fit a logistic regression that predicts probability of default.
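As a rough illustration of the weight of evidence and information value calculations mentioned above, the sketch below computes both for one categorical variable. The column names (grade, good), the toy data, and the helper woe_table are hypothetical, and the sketch assumes the target is coded 1 for a good (non-defaulted) loan and 0 for a bad one, with weight of evidence defined as ln(share of goods / share of bads). Some texts use the opposite sign convention.

```python
import numpy as np
import pandas as pd


def woe_table(df, feature, target):
    """Return a per-category WoE table and the feature's information value.

    Assumption: `target` is 1 for a good loan and 0 for a bad (defaulted) loan.
    """
    grouped = df.groupby(feature)[target].agg(n_obs="count", n_good="sum")
    grouped["n_bad"] = grouped["n_obs"] - grouped["n_good"]
    # Share of all good loans and of all bad loans that fall in each category.
    grouped["pct_good"] = grouped["n_good"] / grouped["n_good"].sum()
    grouped["pct_bad"] = grouped["n_bad"] / grouped["n_bad"].sum()
    # Weight of evidence: log ratio of the two shares.
    grouped["woe"] = np.log(grouped["pct_good"] / grouped["pct_bad"])
    # Information value: sum over categories of (pct_good - pct_bad) * WoE.
    iv = ((grouped["pct_good"] - grouped["pct_bad"]) * grouped["woe"]).sum()
    return grouped, iv


# Hypothetical toy data: 15 loans across three grades.
loans = pd.DataFrame({
    "grade": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "good": [1, 1, 1, 1, 0,
             1, 1, 1, 0, 0,
             1, 0, 0, 0, 0],
})

table, iv = woe_table(loans, "grade", "good")
print(table[["n_good", "n_bad", "woe"]])
print("Information value:", round(iv, 4))
```

A category with no good loans or no bad loans would produce an infinite weight of evidence here, which is one reason sparse categories are usually merged before this step.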

Later, we will evaluate the predictive performance of our model using an ROC curve, among other metrics. From the model results, we will generate a scorecard and use it to determine a cutoff credit score. Taking a similar approach with both linear and logistic regression, we will develop a loss given default model and an exposure at default model.
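To make the ROC evaluation concrete, here is a minimal sketch of the area under the ROC curve (AUC), computed directly from its pairwise definition: the probability that a randomly chosen defaulted loan receives a higher predicted PD than a randomly chosen non-defaulted loan, with ties counted as one half. The function roc_auc, the label coding (1 = default), and the toy numbers are assumptions made for illustration. In practice a library routine such as Scikit-learn's roc_auc_score would normally be used instead.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from pairwise comparisons.

    Assumption: labels are 1 for default and 0 for non-default, and a higher
    score means a higher predicted probability of default.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one observation of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # defaulted loan correctly ranked as riskier
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical test-set outcomes and predicted probabilities of default.
defaulted = [0, 0, 1, 0, 1, 1, 0, 1]
predicted_pd = [0.05, 0.10, 0.35, 0.40, 0.60, 0.70, 0.20, 0.15]

auc = roc_auc(defaulted, predicted_pd)
gini = 2 * auc - 1  # Gini coefficient, a common rescaling of AUC
print("AUC:", auc, "Gini:", gini)
```

The pairwise loop is quadratic in the number of loans, so it is suited to explaining the metric rather than to scoring a full portfolio.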

Ultimately, we will combine the results of these three models to determine the portfolio’s expected loss. We can then compare this expected loss to the capital held for regulatory compliance to fine-tune the bank’s credit score cutoff for lending.
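The combination step can be sketched as follows, using the standard relationship expected loss = PD × LGD × EAD for each loan and summing across the portfolio. The loan records and field names are hypothetical, and the PD, LGD, and EAD values are made-up inputs standing in for the outputs of the three models.

```python
def expected_loss(pd_, lgd, ead):
    """Expected loss on one loan: PD (probability) x LGD (fraction) x EAD (currency)."""
    return pd_ * lgd * ead


# Hypothetical model outputs for three loans.
portfolio = [
    {"loan_id": "L001", "pd": 0.02, "lgd": 0.45, "ead": 10_000.0},
    {"loan_id": "L002", "pd": 0.10, "lgd": 0.60, "ead": 5_000.0},
    {"loan_id": "L003", "pd": 0.25, "lgd": 0.80, "ead": 2_000.0},
]

loan_losses = {
    loan["loan_id"]: expected_loss(loan["pd"], loan["lgd"], loan["ead"])
    for loan in portfolio
}
portfolio_el = sum(loan_losses.values())
total_exposure = sum(loan["ead"] for loan in portfolio)

print(loan_losses)
print("Portfolio expected loss:", round(portfolio_el, 2))
print("Expected loss as share of exposure:", round(portfolio_el / total_exposure, 4))
```

Expressing the total as a share of exposure gives a single ratio that can be set against the capital figure discussed above.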

Course Objectives

By the end of the course, participants will be able to:

Work with variables, logic, looping, and defining functions in Python
Use common commands and tools from widely used libraries: NumPy, Pandas, Matplotlib, Seaborn, Pickle, and Scikit-learn
Import and manipulate large datasets efficiently
Preprocess the dataset so it can be used for modeling
Perform calculations quickly using vectorized operations
Define functions and classes to expedite analysis
Run multiple linear regression and interpret results
Use machine learning to run logistic regression and interpret results
Develop three models: probability of default, loss given default, and exposure at default
Evaluate the predictive power of various independent variables, visualize their weight of evidence, and fine-tune the models
Import and export models using Pickle
Evaluate the portfolio’s expected loss
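As a small illustration of the Pickle objective above, the sketch below saves a fitted model to disk and loads it back. The model here is a hypothetical stand-in (a plain dictionary of made-up coefficients) and the file name pd_model.pkl is arbitrary. The same dump and load calls apply to a fitted Scikit-learn estimator.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a fitted model: an intercept and dummy-variable coefficients.
model = {
    "intercept": -1.2,
    "coefficients": {"grade:A": 0.9, "grade:B": 0.4},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pd_model.pkl"

    # Export: serialize the object to a binary file.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # Import: read the bytes back into an equivalent Python object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == model)
```

Pickle files can execute arbitrary code when loaded, so they should only be opened when they come from a trusted source.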

Suggested Prerequisites: Introduction to Python

Program Level: Intermediate

Advance Preparation: Participants must have Python and Spyder, both of which come with the Anaconda distribution: https://www.anaconda.com/download/. If you installed Python via the Anaconda distribution, you will already have many of the packages needed for this course.

Computers and Financial Calculators: Computers and Microsoft Excel

Recommended CPE Credits: 12