Exploring Consistent Feature Selection for Software Fault Prediction: An XAI-Based Model-Agnostic Approach

Khan, Adam and Ali, Asad and Khan, Jahangir and Ullah, Fasee and Faheem, Muhammad (2025) Exploring Consistent Feature Selection for Software Fault Prediction: An XAI-Based Model-Agnostic Approach. IEEE Access, 13. 75489–75520. ISSN 2169-3536

Full text not available from this repository.
Official URL: https://www.scopus.com/inward/record.uri?eid=2-s2....

Abstract

Numerous feature selection (FS) techniques have been widely applied in Software Engineering (SE) to improve the predictive performance of machine learning (ML) models. However, the consistency of these FS techniques, i.e., their ability to select stable features under various data changes, remains underexplored. While previous studies have examined the stability of traditional FS methods (e.g., Information Gain, Genetic Search), their findings are limited in scope. With the increasing use of eXplainable Artificial Intelligence (XAI) in SE, it is essential to assess the consistency of model-agnostic FS techniques to ensure their reliability in dynamic learning environments. In this study, we evaluated the consistency of two prominent XAI-based techniques, Permutation Feature Importance (PFI) and SHapley Additive exPlanations (SHAP), across five ML models: Linear Regression (LR), Multi-layer Perceptron (MLP), Random Forest (RF), Decision Trees (DT), and Support Vector Machines (SVM). Experiments were conducted on six Software Fault Prediction (SFP) datasets using various validation strategies (e.g., 3-fold cross-validation, bootstrap), normalization, and dataset modifications. The findings reveal that model-agnostic FS techniques exhibit higher consistency than traditional techniques across all scenarios. Under validation-based changes, SHAP with SVM and DT achieves the highest average consistency (100), while MLP records the lowest (74.27). For PFI, LR, DT, and SVM also reach a consistency of 100, with MLP again being the lowest (44.03). In data modification scenarios, SHAP with MLP shows the highest consistency (76.20), whereas SVM performs the lowest (70.98). Using PFI, RF achieves the highest (77.24) and SVM the lowest (62.84). Overall, SHAP outperforms PFI under most conditions, particularly under 5-fold CV, bootstrap, and LOO CV, while PFI is more stable when new instances are added to the training set. These findings confirm that both SHAP and PFI offer better consistency than traditional FS techniques, underscoring their reliability for real-world SFP tasks. © 2013 IEEE.
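As a rough illustration of the kind of consistency analysis the abstract describes, the sketch below measures how stable PFI-based feature selection remains across cross-validation folds. The synthetic dataset, the top-k cutoff, and the Jaccard-based consistency score are illustrative assumptions, not the paper's exact protocol, which also covers SHAP, additional models, normalization, and further data-modification scenarios.

# Hypothetical sketch: consistency of PFI-based feature selection across folds.
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
TOP_K = 5  # size of the "selected" feature subset (assumed cutoff)

def top_k_features(model, X_val, y_val, k=TOP_K):
    """Rank features by permutation importance and return the top-k index set."""
    result = permutation_importance(model, X_val, y_val,
                                    n_repeats=10, random_state=0)
    return set(np.argsort(result.importances_mean)[::-1][:k])

def jaccard(a, b):
    """Jaccard similarity between two feature subsets."""
    return len(a & b) / len(a | b)

# Collect the selected feature subset from each fold.
subsets = []
for train_idx, val_idx in KFold(n_splits=3, shuffle=True, random_state=0).split(X):
    model = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    subsets.append(top_k_features(model, X[val_idx], y[val_idx]))

# Average pairwise similarity of the selected subsets serves as the consistency score.
scores = [jaccard(a, b) for a, b in combinations(subsets, 2)]
print(f"PFI consistency across folds: {100 * np.mean(scores):.1f}")

The same loop can be repeated with mean absolute SHAP values in place of permutation importances to compare the two techniques under identical splits, or rerun after adding instances or normalizing features to mimic the data-modification scenarios.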

Item Type: Article
Citation Count: 0
Uncontrolled Keywords: Adversarial machine learning; Computer software selection and evaluation; Contrastive Learning; Decision trees; Prediction models; Software reliability; Empirical studies; Explainable artificial intelligence; Feature consistency; Features selection; Model agnostic technique; Permutation feature importance; Selection techniques; Shapley; Shapley additive explanation; Software fault prediction; Support vector machines
Depositing User: Mr Ahmad Suhairi Mohamed Lazim
Date Deposited: 08 Jul 2025 16:28
Last Modified: 08 Jul 2025 16:28
URI: http://scholars.utp.edu.my/id/eprint/38928
