Investigating Super Learner for Credit Risk Modeling In Mortgage Scenario


Credit risk analysis is critical to a lending organization’s business and to its reputation in the market. Credit risk modeling is the method lenders use to assess the degree of credit risk involved in making a loan to a borrower.

The objective of this paper is to investigate super learners for credit risk modeling in mortgage scenarios with the help of AutoML. Each super learner is an ensemble model built from a set of base models, and several such ensembles are evaluated against the credit risk dataset. To explain the super learner’s predictions, three interpretation techniques are used: SHapley Additive exPlanations (SHAP), Partial Dependence Plots (PDP), and Individual Conditional Expectation (ICE).
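The idea of a super learner as an ensemble of base models can be sketched as stacking: each base model produces out-of-fold predictions, and a metalearner is trained on those predictions. The sketch below is not the paper's H2O pipeline. The single-feature logistic base learners, the logistic metalearner, the synthetic data, and every function name are assumptions chosen to keep the example free of dependencies.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent; return a predict function."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return lambda x: sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def single_feature_learner(j):
    """Hypothetical base learner: logistic regression on feature j only."""
    def fit(X, y):
        model = fit_logistic([[x[j]] for x in X], y)
        return lambda x: model([x[j]])
    return fit


def fit_super_learner(X, y, base_fits, k=5):
    """Stack base learners: k-fold out-of-fold predictions feed a metalearner."""
    n = len(X)
    Z = [[0.0] * len(base_fits) for _ in range(n)]  # out-of-fold base predictions
    for f in range(k):
        held_out = range(f, n, k)
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for m, fit in enumerate(base_fits):
            model = fit(X_tr, y_tr)
            for i in held_out:
                Z[i][m] = model(X[i])
    meta = fit_logistic(Z, y)                 # metalearner sees only out-of-fold predictions
    bases = [fit(X, y) for fit in base_fits]  # base models refit on all rows
    return lambda x: meta([bm(x) for bm in bases])


# Synthetic two-feature data, invented for illustration only.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if x[0] + x[1] + rng.gauss(0, 0.5) > 0 else 0 for x in X]

stacked = fit_super_learner(X, y, [single_feature_learner(0), single_feature_learner(1)])
stacked_acc = sum((stacked(x) >= 0.5) == bool(t) for x, t in zip(X, y)) / len(y)
print(f"training accuracy of the stacked model: {stacked_acc:.3f}")
```

Training the metalearner on out-of-fold predictions, rather than on in-sample ones, keeps it from rewarding base models that merely memorize their training rows.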

Using H2O AutoML on the credit risk data, we train 25 machine learning models, with AUC defined as the stopping metric. These comprise the ensemble models “StackEnsemble_BestOfFamily” and “StackEnsemble_AllOfFamily” as well as the base models Deep Learning, Distributed Random Forest (DRF), Extremely Randomized Trees (XRT), Gradient Boosting Machine (GBM), and Generalized Linear Model (GLM). Of these, “StackEnsemble_BestOfFamily” achieves an AUC of 71.08%; among the base models, DRF achieves the highest accuracy, 88.02%, with an AUC of 70.5%.
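Accuracy and AUC, both reported above, measure different things: accuracy depends on a classification threshold, while AUC measures how often a positive case is scored above a negative one. A minimal sketch of both metrics follows. The labels and scores are invented for illustration and are not taken from the paper's dataset, and the function names are hypothetical.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Share of rows whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 8 negatives, 2 positives, every score below the 0.5 threshold.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.25, 0.35, 0.45, 0.2]

toy_accuracy = accuracy(y_true, scores)
toy_auc = auc(y_true, scores)
print(toy_accuracy, toy_auc)
```

In this toy case the model never predicts the positive class, yet its accuracy is 0.8 because negatives dominate, while the AUC of about 0.72 reflects only the ranking of scores. Whether a similar effect applies to the paper's data is not stated in the abstract.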

For interpretation, SHAP, PDP, and ICE explain both individual predictions and the model’s behavior across the whole dataset. With AutoML, multiple machine learning models are created in a short time, without manual effort spent on data preparation, data exploration, feature engineering, model selection, model training, and hyperparameter tuning. With SHAP, PDP, and ICE, any individual result can be explained to the customer or the end user.
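To make the difference between ICE and PDP concrete: an ICE curve shows how the prediction for one row changes as a single feature is varied, and the PDP is the average of those curves across rows. The sketch below assumes a hypothetical scoring function `toy_score` and hypothetical features `ltv` and `income`; none of these come from the paper.

```python
def ice_curves(predict, rows, feature, grid):
    """One curve per row: predictions as `feature` sweeps over `grid`, other features fixed."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(curves):
    """PDP: pointwise average of the ICE curves."""
    n = len(curves)
    return [sum(c[k] for c in curves) / n for k in range(len(curves[0]))]


def toy_score(row):
    """Hypothetical default score: the effect of ltv is stronger for lower incomes."""
    return 0.2 + 0.5 * row["ltv"] * (1.0 if row["income"] < 50 else 0.5)


rows = [{"ltv": 0.6, "income": 40}, {"ltv": 0.9, "income": 80}]
grid = [0.5, 0.7, 0.9]

curves = ice_curves(toy_score, rows, "ltv", grid)
pdp = partial_dependence(curves)
print(curves)
print(pdp)
```

The two ICE curves have different slopes because of the income interaction in the toy model; the PDP averages that difference away, which is why the two views are usually read together.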

Keywords: AutoML, metalearner, StackEnsemble_BestOfFamily, StackEnsemble_AllOfFamily, Deep Learning, DRF, XRT, GBM, GLM, AUC, SHAP, PDP, and ICE.


Lalit Aggarwal

Lead, Fidelity National Financial

Sanjeev Jha

Co-Founder & MD at IntelliCredence Pvt. Ltd.

Dr. J B Simha

Professor and Chief Mentor - AI and CTO, ABIBA Systems

Dr. J B Simha’s core competencies are R&D, business intelligence, and analytics consulting. His expertise includes implementing large-scale business intelligence and analytics systems for the telecom, BFSI, and manufacturing industries. Analytics India Magazine named him one of India’s ‘Top 10 Most Prominent Data Science Academicians in 2019’.
