
12 lessons in predictive modelling for enhanced credit risk assessment


Modern credit risk management now leans significantly on predictive modelling, moving far beyond traditional approaches. As lending practices grow increasingly intricate, companies that adopt advanced AI and machine learning gain a sharper edge in understanding and managing risk.

Below, my colleague Nick Sime, Director of Fraud & Credit Risk Modelling, has shared essential tips from his experience. These insights are designed to help risk managers harness predictive modelling for smarter and more secure lending decisions.

1. Machine Learning models consistently outperform

Machine learning (ML) models reliably outperform traditional linear models when tested on independent samples. While the level of improvement may vary, ML models typically deliver a 10-15% uplift in Gini compared to newly developed logistic regression models. In credit risk terms, this can mean a potential 20% reduction in the bad rate at a given cut-off point.
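To make the metric concrete: Gini is a linear rescaling of AUC (Gini = 2 × AUC − 1), so an uplift like the one described can be sanity-checked directly from model scores. A minimal rank-based sketch on tiny synthetic data (illustrative only, not the authors' evaluation code):

```python
import numpy as np

def auc(y, scores):
    """Rank-based AUC: probability a random bad scores higher than a random good."""
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_bad = int(y.sum())
    n_good = len(y) - n_bad
    return (ranks[y == 1].sum() - n_bad * (n_bad + 1) / 2) / (n_bad * n_good)

def gini(y, scores):
    return 2 * auc(y, scores) - 1

# toy portfolio: 1 = bad, higher score = higher predicted risk
y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 1])
old = np.array([0.2, 0.1, 0.4, 0.35, 0.3, 0.6, 0.5, 0.45, 0.7, 0.8])
new = np.array([0.1, 0.2, 0.3, 0.5, 0.25, 0.7, 0.6, 0.4, 0.9, 0.8])

uplift = gini(y, new) / gini(y, old) - 1  # relative Gini uplift of new model
```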

2. Sample size matters

The larger the sample, the more ML models can identify complex, non-linear patterns, resulting in a performance boost. However, material improvements are still achievable even with smaller, low-default portfolios.

3. The optimal number of features: 40-60

Bureau data is becoming more complex as Credit Reference Agencies use additional data sources and derive trended variables. This presents a data reduction challenge to modellers. On top of this, creating models with an excessive number of variables creates an overhead for deployment and monitoring. Our experience shows that near-optimal performance within credit score developments can be obtained with 40-60 variables.  
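One common way to get from a wide bureau extract down to a shortlist of this size is to rank candidate variables by importance from a preliminary model and keep the top slice. A sketch with scikit-learn on synthetic data (the cut at 50 features and the use of a random forest for ranking are illustrative choices, not a prescription):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# synthetic stand-in for a wide bureau extract: 200 candidate variables
X, y = make_classification(n_samples=2000, n_features=200,
                           n_informative=30, random_state=0)

# preliminary model used purely to rank candidate features
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# keep the 50 strongest variables for the final development
top = np.argsort(rf.feature_importances_)[::-1][:50]
X_reduced = X[:, top]
```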

4. Some overfitting is necessary

Overfitting is often viewed negatively, but ML models benefit from capturing subtle patterns, and applying strict overfitting controls may actually reduce a model’s predictive accuracy. However, our research indicates that heavily overfit models deteriorate more quickly, making a balanced approach essential for long-term stability. In short, a carefully calibrated approach is needed to optimise performance in a live environment.

5. Explainability constraints are not a barrier 

To support model explainability, monotonicity and ranking constraints are applied ‘up front’ in the design of our models. This ensures that the marginal impact of input variables is consistent with business expectations. While some fear this may reduce performance, we find that it has negligible, if any, adverse impact. In fact, it can even benefit model stability over time.

6. Stability over time

Despite their complexity, ML models can demonstrate impressive stability. Our long-term analysis shows that Deep Learning models tend to degrade at a slower rate over time compared to traditional logistic regression models.

7. One & done (Goodbye to segmented models)

In traditional modelling, segmented models are often used to capture non-linear relationships. However, ML models inherently detect these patterns, making segmented models largely unnecessary in most situations.

8.  Reject inference needs special care

Scorecard developers typically build a known good/bad (KGB) model and an accept/reject (AR) model, then apply conservative assumptions to rejected applicants to create a dataset for a final model that removes selection bias. However, ML models can effectively reverse-engineer the inference applied to the declined cases in the sample, so the final model’s predictions for known cases end up very similar to the KGB model’s, negating much of the benefit of the inference process.
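For readers unfamiliar with the workflow, a stylised sketch of the traditional KGB-plus-inference pipeline on synthetic data (the "label the riskier half of rejects as bad" rule is a deliberately crude stand-in; real inference schemes such as parcelling are more nuanced):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# accepted applicants: outcomes observed
X_acc = rng.normal(size=(3000, 5))
y_acc = (X_acc[:, 0] + rng.normal(size=3000) > 1).astype(int)
# declined applicants: outcomes unobserved, riskier on average
X_rej = rng.normal(loc=0.4, size=(1000, 5))

kgb = LogisticRegression(max_iter=1000).fit(X_acc, y_acc)  # known good/bad model
p_rej = kgb.predict_proba(X_rej)[:, 1]

# crude pessimistic inference: label the riskier half of rejects as bad
y_rej = (p_rej > np.median(p_rej)).astype(int)

# final model trained on accepts plus inferred rejects
final = LogisticRegression(max_iter=1000).fit(
    np.vstack([X_acc, X_rej]), np.concatenate([y_acc, y_rej]))
```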

9.  Cross-learning (More is more)

Traditional scorecard development places strong emphasis on aligning development samples with future expectations. However, we’ve found that this isn’t always the optimal approach for advanced models. ML models can effectively leverage adjacent data sources, resulting in more robust and predictive models.

10. Hyperparameter tuning (Avoid complication)

Hyperparameter tuning shapes both the structure of an ML model and its learning process. While a grid search is commonly used—requiring a model estimation for each hyperparameter combination—this approach can be resource-intensive, often yielding similar results across iterations. We recommend a Bayesian approach, which streamlines the process and more efficiently identifies optimal settings.
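The cost difference is easy to see: a grid search fits one model per combination, and the count multiplies quickly. A toy illustration with hypothetical grid values (a Bayesian optimiser, e.g. Optuna or scikit-optimize, would typically cover the same space in a few dozen trials):

```python
from itertools import product

# hypothetical tuning grid for a gradient boosting model
grid = {
    "max_depth":     [3, 4, 5, 6],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators":  [100, 300, 500],
    "subsample":     [0.7, 0.85, 1.0],
}

# grid search cost: one full model fit per combination
n_fits = len(list(product(*grid.values())))  # 4 * 3 * 3 * 3 = 108 fits
```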

11.  Continue to monitor

Monitoring is essential to detect any stability issues and ensure optimal performance. With the greater number of inputs in ML models, dashboards can be invaluable for pinpointing areas that may need adjustment. Whilst monitoring will give you a strong indication that your model is sub-optimal, it will not tell you whether it is optimal.
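A common dashboard metric for this is the Population Stability Index (PSI), comparing the recent score distribution against the development sample. A minimal numpy sketch (the 0.1/0.25 thresholds quoted in the comment are a widely used rule of thumb, not a formal standard):

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between development and recent score samples."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range recent scores
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(0)
dev = rng.normal(600, 50, 10_000)      # development-sample scores
recent = rng.normal(590, 55, 10_000)   # recent applications, slightly shifted
# rule of thumb: < 0.1 stable, 0.1-0.25 monitor, > 0.25 investigate
drift = psi(dev, recent)
```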

12.  Domain knowledge is essential

While automation in model development is possible, domain expertise remains crucial. Involving experienced credit practitioners ensures that the model inputs are sensible and aligned with business needs, avoiding features that may be counter-intuitive or problematic.

When selecting tooling, look for a software solution that provides access to cutting-edge neural network models without the need to code.

Whilst many of the early adopters of ML models were agile Fintechs, traditional banks and lenders are now showing increased interest. In a market where aggregators and brokers play such a key role, the alignment of risk and price is essential. Lenders with the most powerful models have a clear competitive advantage.

 

External

This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.

