Automatic Feature Selection by Regularization to Improve Bug Prediction Accuracy

Type of publication Peer-reviewed
Form of publication Proceedings (peer-reviewed)
Author Osman Haidar, Ghafari Mohammad, Nierstrasz Oscar
Project Agile Software Analysis

Proceedings (peer-reviewed)

Page(s) 27 - 32
Title of proceedings 1st International Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE 2017)
DOI 10.1109/maltesque.2017.7882013

Open Access


Bug prediction has been a hot research topic for the past two decades, during which different machine learning models based on a variety of software metrics have been proposed. Feature selection is a technique that removes noisy and redundant features to improve the accuracy and generalizability of a prediction model. Although feature selection is important, it adds yet another step to the process of building a bug prediction model and increases its complexity. Recent advances in machine learning have introduced embedded feature selection methods that allow a prediction model to carry out feature selection automatically as part of the training process. The effect of these methods on bug prediction is unknown. In this paper, we study regularization as an embedded feature selection method in bug prediction models. Specifically, we study the impact of three regularization methods (Ridge, Lasso, and ElasticNet) on linear and Poisson regression as bug predictors for five open source Java systems. Our results show that the three regularization methods reduce the prediction error of the regressors and improve their stability.
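
To illustrate how regularization can act as embedded feature selection, the following is a minimal sketch in pure Python and not the authors' implementation. It fits a regularized linear regressor by coordinate descent on synthetic data. The function names, the penalty parameterization (`alpha`, `l1_ratio`), and the data, in which only the first two of five hypothetical metrics influence the target, are all assumptions made for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha, l1_ratio, n_iter=100):
    """Coordinate descent for the objective
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    l1_ratio=1.0 is Lasso, l1_ratio=0.0 is Ridge, values in between are ElasticNet.
    No intercept is fitted (assumption: the data is roughly centered)."""
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = columns[j]
            z = sum(v * v for v in col) / n
            # Correlation of feature j with the residual, with its own contribution added back.
            rho = sum(col[i] * (residual[i] + w[j] * col[i]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1.0 - l1_ratio))
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * col[i]
            w[j] = new_wj
    return w


# Synthetic data (an assumption, not the paper's datasets):
# five metrics per sample, but only the first two drive the target.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]

ridge_w = fit_elastic_net(X, y, alpha=0.3, l1_ratio=0.0)
lasso_w = fit_elastic_net(X, y, alpha=0.3, l1_ratio=1.0)
enet_w = fit_elastic_net(X, y, alpha=0.3, l1_ratio=0.5)

for name, w in [("Ridge", ridge_w), ("Lasso", lasso_w), ("ElasticNet", enet_w)]:
    kept = [j for j, c in enumerate(w) if c != 0.0]
    print(name, "keeps features", kept, "with weights", [round(c, 3) for c in w])
```

In this sketch, the L1 term is what performs the selection: the soft-thresholding step sets a weight to exactly zero when the feature's correlation with the residual falls below the penalty, so the feature drops out during training. The L2 term used by Ridge only shrinks weights and does not remove features.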