The effects of data preprocessing on probability of default model fairness

Di Wu

doi:10.30574/wjaets.2024.12.2.0354

Naveen Jindal School of Management, The University of Texas at Dallas, 800 W Campbell Rd Richardson Texas 75080 USA.

Research Article

World Journal of Advanced Engineering Technology and Sciences, 2024, 12(02), 872–878.

Article DOI: 10.30574/wjaets.2024.12.2.0354

DOI url: https://doi.org/10.30574/wjaets.2024.12.2.0354

Publication history

Received on 08 July 2024; revised on 20 August 2024; accepted on 22 August 2024

Abstract

In financial credit risk evaluation, the fairness of machine learning models has become a critical concern, especially given the potential for biased predictions that disproportionately affect certain demographic groups. This study investigates the impact of data preprocessing, with a specific focus on Truncated Singular Value Decomposition (SVD), on the fairness and performance of probability of default models. Various preprocessing techniques, including SVD, were applied to a comprehensive dataset sourced from Kaggle to assess their effect on model accuracy, discriminatory power, and fairness.
The findings reveal that while SVD effectively reduces the dimensionality of the data, it does not necessarily enhance the fairness of the models. Specifically, the application of SVD resulted in a deterioration in the model’s ability to correctly classify loan defaults, particularly for minority classes. This outcome suggests that critical information pertinent to fair predictions may be lost during the dimensionality reduction process. Furthermore, the analysis of fairness across different demographic groups, such as age and marital status, indicated that SVD did not contribute positively to reducing disparate impacts or balancing error rates.
These results underscore the complexities of using dimensionality reduction techniques in fair lending applications and highlight the need for more tailored approaches to preprocessing that prioritize both accuracy and fairness. Future research should explore alternative methods that preserve the integrity of sensitive information while enhancing the equitable performance of credit risk models.
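The two fairness notions named in the abstract, disparate impact and balanced error rates, can be illustrated with a minimal sketch. This page does not state how the study computed them, so everything below is an assumption for illustration: the label convention (1 = default), the toy data, the group names `young` and `older`, and the helper names `group_rates` and `disparate_impact`.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group approval rate, false positive rate, and false negative rate.

    Assumed convention: label 1 = default, label 0 = no default.
    A prediction of 1 means "predicted default" (loan denied),
    a prediction of 0 means the loan is approved.
    """
    counts = defaultdict(
        lambda: {"n": 0, "approved": 0, "fp": 0, "fn": 0, "pos": 0, "neg": 0}
    )
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["approved"] += int(p == 0)
        if t == 1:
            c["pos"] += 1
            c["fn"] += int(p == 0)  # actual default predicted as non-default
        else:
            c["neg"] += 1
            c["fp"] += int(p == 1)  # non-default predicted as default
    rates = {}
    for g, c in counts.items():
        rates[g] = {
            "approval_rate": c["approved"] / c["n"],
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
            "fnr": c["fn"] / c["pos"] if c["pos"] else float("nan"),
        }
    return rates


def disparate_impact(rates, protected, reference):
    """Ratio of approval rates: protected group over reference group.

    Assumes the reference group has a non-zero approval rate.
    """
    return rates[protected]["approval_rate"] / rates[reference]["approval_rate"]


# Hypothetical toy data, not taken from the study.
y_true = [0, 0, 1, 1, 0, 0, 1, 0]
y_pred = [0, 1, 1, 0, 0, 0, 1, 0]
groups = ["young"] * 4 + ["older"] * 4

rates = group_rates(y_true, y_pred, groups)
di = disparate_impact(rates, protected="young", reference="older")
```

Running the same comparison on predictions from a model trained with and without the SVD step would show whether the ratio moves toward 1 and whether the per-group error rates converge, which is the kind of comparison the abstract describes.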

Keywords

Data preprocessing; Machine learning; Probability of default; Credit risk; Fair lending; Safe AI

Download Article PDF

https://wjaets.com/sites/default/files/fulltext_pdf/WJAETS-2024-0354.pdf

How to cite this article

Di Wu. The effects of data preprocessing on probability of default model fairness. World Journal of Advanced Engineering Technology and Sciences, 2024, 12(02), 872–878. Article DOI: https://doi.org/10.30574/wjaets.2024.12.2.0354
