TY - JOUR
T1 - Machine learning techniques for software vulnerability prediction
T2 - a comparative study
AU - Jabeen, Gul
AU - Rahim, Sabit
AU - Afzal, Wasif
AU - Khan, Dawar
AU - Khan, Aftab Ahmed
AU - Hussain, Zahid
AU - Bibi, Tehmina
N1 - Funding Information:
This work has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 957212 and the ECSEL Joint Undertaking (JU) under grant agreement No. 101007350. D. Khan was supported in part by NSFC (No. 62150410433), the Shenzhen Basic Research Program (JCYJ20180507182222355) and CAS-PIFI (No. 2020PT0013). We are thankful to the anonymous reviewers for their valuable comments and suggestions.
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2022/12
Y1 - 2022/12
AB - Software vulnerabilities are a major cause of security problems. Vulnerability discovery models (VDMs) attempt to model the rate at which vulnerabilities are discovered in software. Although several VDMs have been proposed, none is universally applicable, and most do not give accurate predictions for every type of vulnerability dataset. Machine learning (ML) techniques have generally been successful in a wide range of predictive tasks. In this paper, we therefore conducted an empirical study applying well-known ML techniques, as well as statistical techniques, to predict software vulnerabilities on a variety of datasets. The following ML techniques were evaluated: cascade-forward back propagation neural network, feed-forward back propagation neural network, adaptive neuro-fuzzy inference system, multi-layer perceptron, support vector machine, bagging, M5Rules, M5P and reduced error pruning tree. The following statistical techniques were evaluated: the Alhazmi-Malaiya model, linear regression and logistic regression. The applicability of the techniques is examined using two separate approaches: goodness-of-fit, to see how well each model tracks the data, and prediction capability, assessed using different criteria. The ML techniques are observed to predict software vulnerabilities markedly better than the statistical vulnerability prediction models.
KW - Machine learning
KW - Prediction models
KW - Software vulnerability
UR - http://www.scopus.com/inward/record.url?scp=85127586282&partnerID=8YFLogxK
U2 - 10.1007/s10489-022-03350-5
DO - 10.1007/s10489-022-03350-5
M3 - Article
AN - SCOPUS:85127586282
SN - 0924-669X
VL - 52
SP - 17614
EP - 17635
JO - Applied Intelligence
JF - Applied Intelligence
IS - 15
ER -