Early Outcome Prediction of Software Projects using Software Defects and Machine Learning
Abstract
The goal of this research is to help software stakeholders predict, early in a software project, whether or when the project is at risk of failure. If decision makers receive early notice of the likely outcome, they can make better choices about what is needed to make the project successful. In this paper, we explore using the trend of defect totals as a function of the relative completion of the project. We collected the data from a software company that ran multiple software projects under a defined and consistent process with consistent metric collection methods. We show how we used the defect totals from this data as features for a Support Vector Machine (SVM) that classifies a project as successful or unsuccessful early in the project’s lifecycle. We present our technique and methodology for developing the inputs to the proposed model, along with the results of testing. Further, we discuss the prediction model and analyze the SVM model. Finally, we compare our model's predictions against the labels in the company’s dataset and show a prediction accuracy of 88.7% on a set of 13 projects.