Predicting defects on the line level, article review

A lot has been written about defect prediction, and I’m pretty sure that a lot more will be written. It’s one of the research areas that is quite cool to work with, because it gives researchers quick results and is relatively quantitative in nature.

One could also say that this is a holy grail in software development – to predict the location of a defect and fix it before it becomes a problem. It’s a good goal, but it is also a goal that is more like quicksand than a gravel road. For one, not all defects are easy to recognize. Some are not even certain to be defects: sometimes it is not clear how to interpret a requirement, so it’s not easy to say whether a piece of code implements it correctly or not.

In this paper, the authors have done a great job of creating a system that predicts defect locations at the line level – DeepLineDP. The requirements for the system are partially based on a survey of developers conducted by the authors.

According to the authors: “DeepLineDP is 14%-24% more accurate than other file-level defect prediction approaches; is 50%-250% more cost-effective than other line-level defect prediction approaches; and achieves a reasonable performance when transferred to other software projects. These findings confirm that the surrounding tokens and surrounding lines should be considered to identify the fine-grained locations of defective files (i.e., defective lines).”
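To give a feeling for what “cost-effective” can mean in line-level prediction, here is a minimal sketch of one effort-aware measure: the share of defective lines a developer would find by inspecting only the riskiest 20% of lines. This is my own illustration, not the authors’ code. The function name `recall_at_top_loc`, the 20% budget and the scores are all assumptions made up for the example, and the paper’s exact evaluation may differ.

```python
from typing import List, Tuple


def recall_at_top_loc(scored_lines: List[Tuple[float, bool]],
                      fraction: float = 0.2) -> float:
    """Share of defective lines found when inspecting only the
    top `fraction` of lines, ranked by predicted risk score.

    scored_lines: (risk_score, is_defective) pairs, one per line of code.
    """
    total_defective = sum(1 for _, is_defective in scored_lines if is_defective)
    if total_defective == 0:
        return 0.0  # nothing to find, so recall is defined as 0 here

    # Rank lines from most to least risky according to the model.
    ranked = sorted(scored_lines, key=lambda pair: pair[0], reverse=True)

    # The inspection budget: how many lines the developer will read.
    budget = int(len(ranked) * fraction)

    found = sum(1 for _, is_defective in ranked[:budget] if is_defective)
    return found / total_defective


# Hypothetical scores for a 10-line file (invented for illustration only).
example_lines = [
    (0.90, True), (0.80, False), (0.70, True), (0.40, False), (0.30, False),
    (0.20, False), (0.20, True), (0.10, False), (0.10, False), (0.05, False),
]

if __name__ == "__main__":
    print(recall_at_top_loc(example_lines, fraction=0.2))
```

The better the ranking, the more defective lines end up inside the small inspection budget, which is why a model that uses the context of surrounding tokens and lines can pay off in reviewer effort.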

I like this work and I recommend it to everyone interested in how to use deep learning for code tasks.

Our team has done some of these investigations ourselves. You can watch them on YouTube.

Author: Miroslaw Staron

I’m a professor in Software Engineering at Computer Science and Engineering. I usually blog about articles I find interesting and my own reflections on the development of Software Engineering, AI, computer science and automotive software.