Reliable prediction of software defects using Shapley interpretable machine learning models

Author name : Abdelaziz Ahmed Abdelaziz Eldamarany
Publication Date : 2023-07-20
Journal Name : Egyptian Informatics Journal

Abstract

Predicting defect-prone software components can play a significant role in allocating testing resources to fault-prone modules, and hence in increasing the business value of software projects. Most current software defect prediction studies use traditional supervised machine learning algorithms to predict defects in software applications. The software datasets used in such studies are imbalanced, so the reported results cannot be reliably used to judge model performance. Moreover, it is important to explain the output of the machine learning models employed in fault-prediction techniques in order to determine the contribution of each feature to the model output. In this paper, we propose a new framework for predicting software defects that uses eleven machine learning classifiers over twelve different datasets. For feature selection, we employ four nature-inspired search algorithms: particle swarm optimization, genetic algorithm, harmony algorithm, and ant colony optimization. We use the synthetic minority oversampling technique (SMOTE) to address the problem of data imbalance, and the Shapley additive explanation model to highlight the most determinative features. The results demonstrate that gradient boosting, stochastic gradient boosting, decision trees, and categorical boosting outperform the other tested models, with over 90% accuracy and ROC-AUC. Additionally, we found that ant colony optimization outperforms the other tested feature selection techniques.
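To make the idea of per-feature contributions concrete, the following is a minimal sketch of how Shapley values attribute a single prediction to its input features. It is not the paper's implementation and does not use the SHAP library: it computes exact Shapley values by brute force over all feature coalitions, which is only practical for a handful of features. The scoring function `toy_risk`, the three metrics (lines of code, complexity, churn), and the baseline values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only a teaching sketch.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with coalition features taken from x
        # and all remaining features taken from the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical "defect risk" score over three made-up code metrics.
def toy_risk(z):
    loc, complexity, churn = z
    return 0.002 * loc + 0.05 * complexity + 0.01 * churn * complexity


x = [300, 12, 5]         # module being explained (assumed values)
baseline = [100, 4, 1]   # reference module (assumed values)

phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(["loc", "complexity", "churn"], phi):
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the score for `x` and the score for `baseline`, which is the additivity property that Shapley-based explanations rely on. Libraries such as SHAP approximate these values efficiently for real classifiers rather than enumerating every coalition.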

Keywords

Software Defect Prediction; Feature importance; Machine learning; Model interpretation; Shapley Additive Explanation

Publication Link

https://doi.org/10.1016/j.eij.2023.05.011
