Who and when needs automated code reviews…

https://rdcu.be/caKsW

Image by Arek Socha from Pixabay

Having worked with code reviews for a while, I strongly sympathize with the thesis put forward by the authors of this paper – code review tools are still far from being supporting for software developers.

Yes, they do automate the process and organize it. Yes, they help in assuring that all code is reviewed and yes, they do help to capture problems in the code and help to spread the knowledge.

However, what I expect from such a tool is to help me to find problems in the code. I would like to have a tool that would help me, as a designer, get better: avoid mistakes, use cool programming constructs, make better design. None of the tools I know help with that.

This paper shows that my understanding is similar to the developers studied in the paper. Documentation – automatically fixing and suggesting were top priority. Renaming suggestions, commenting and explaining were some others.

Detection of duplicated code, architectural analysis and similar things were also mentioned as expectations. I cannot agree more! These things are priority 1 – I would also expect them to be there.

Now, some are more difficult that others – like analyzing the architecture. Not a trivial task at all, cause what is the architecture? Where are the patterns? How to find it from the code? How to rely on the tools that research provides? W’re not there yet.

Duplicate code, however, is something we should be able to fix. I’ve looked at some repository that had over 200 papers about code clones, duplicates and what have you. Are all these papers good? Probably not, but even if 10% is good, then here we have 20 tools we can try.

I agree, we do have SonarQube and similar tools, but they are not integrated with code review. I cannot just link to a report from SQ when writing a review comment. I cannot add a review comment to a detected technical debt in SQ. So, no integration then?

Maybe it’s just a friday afternoon thing, but I hope that we can get better in making the last mile with our tools. Hope that we can address the expectations that the developers have …

Data analytics in SE

https://www.sciencedirect.com/science/article/abs/pii/S0950584920301981

Image by Werner Weisser from Pixabay

A few years ago, data analytics and big data were super popular in software engineering. In fact, they were a bit too popular, as many authors quoted big data because they had a diagram in the paper.

Fast forward to today and the situation is a bit different. We are more mature in using data in software development. We know that Big data is about the 5 Vs and that we can reason about it. We also know what providing the diagrams is not the same as using them to direct software development.

I found this paper when looking for literature for our new work on communication in software metrics teams. My colleagues study the communication and found that there can be several sources of confusion. Now, this paper is NOT about the confusion, but about prevalence of data analytics in software engineering. The working definition of Big Data Analytics is as follows in the paper: “Big data analytics is the process of using analysis algorithms running on powerful supporting platforms to uncover potentials concealed in big data, such as hidden patterns or unknown correlations”.

The paper poses three main research questions about the studies conducted in Big Data Analytics, about the approaches used and when they are used. I’m mostly interested in the second – which approaches are used. There, the authors pose three sub-questions:

RQ2.1: What types of analytics have been used in the ASD domain?
RQ2.2: What sources of data have been used?
RQ2.3: What methods, models, or techniques have been utilized in the studies?

In particular, the second one is the most interesting one – sources of data. There, the authors found that there are plenty. The entire table (Table 7 in the paper) is actually too large to quote, but let me just quote one of the categories: Source code and data model:

  • Source code
  • Ruby programs & Ruby on Rails
  • Java programs
  • Function calls
  • Code metrics
  • Development repository
  • Test case
  • Code quality
  • Application data schema

I recommend this as a good reading into the current state-of-the-art in data analytics in software engineering. I think we’ve matured a lot in the last decade as a community and that brings a lot of benefit. Our software development gets better and thus our software gets better.

From the abstract: In total, 88 primary studies were selected and analyzed. Our results show that BDA is employed throughout the whole ASD lifecycle. The results reveal that data-driven software development is focused on the following areas: code repository analytics, defects/bug fixing, testing, project management analytics, and application usage analytics.

Is confusion a factor when reviewing a code?

https://www.win.tue.nl/~aserebre/EMSE2020Felipe.pdf

Image by Myriams-Fotos from Pixabay

Reviewing the code is an art. After working with the topic for a few years, we’ve realized that this is like reading a chat – one person responds to a piece of message sent by another person. The message often being the code and the response being the review comment. What we’ve discovered is that the context of the review is important as well as the possibility to ask questions. We even discuss having a taxonomy of these review comments to ease understanding of “where” in the review process one is at the moment.

This article caught my attention because it is about understading when a reviewer is actually confused when reading the code and making the comment. It’s a very nice piece of work as it combines code review comments analysis and surveys.

The results of the survey are interesting as they point out that the authors are confused much less than the reviewers – which is often caused by the fact that the comment is a response, while the code is the message. Quoting the paper: RQ1 Summary – Reasons for confusion: We found a total of 30 reasons for confusion. The most prevalent are missing rationale, discussion of the solution: non-functional, and lack of familiarity with existing code. We observe that tools (code review, issue tracker, and version control) and communication issues, such as disagreement or ambiguity in communicative intentions, may also cause confusion during code reviews.

Finally, I like the fact that the authors do a full systematic review on the topic and triangulate the results. This work will become a number one reading for my students in the programming course, which will teach them how important good code is!

From the abstract:

Results: From the first study, we build a framework with 30 reasons for confusion, 14 impacts, and 13 coping strategies. The results of the systematic mapping study shows 38 articles addressing the most frequent reasons for confusion. From those articles, we found 19 different solutions for confusion proposed in the literature, and nine impacts were established related to the most frequent reasons for confusion.

Is noise important in SE?

https://www.researchgate.net/profile/Khaled_Al-Sabbagh/publication/344190831_Improving_Data_Quality_for_Regression_Test_Selection_by_Reducing_Annotation_Noise/links/5f5a167aa6fdcc116404d72b/Improving-Data-Quality-for-Regression-Test-Selection-by-Reducing-Annotation-Noise.pdf

Image by F. Muhammad from Pixabay

Machine learning and deep learning are only as good as the data used to train them. However, even the best data sources can lead to data of non-optimal quality. Noise is one of the exampes of the data problems.

Our research team has studied the impact of noise on machine learning in software engineering – mostly on the testing data. In this paper we present one techniques to identify noise, measure it and reduce it. There are several techniques to do it, but we use one of the more robust ones – removal of noise.

I recommend to take a look at how the algorithms work and let us know if you find it interesting!

What’s going on in Deep Learning – a literature review (paper review)

https://arxiv.org/pdf/2009.06520.pdf

Image by Free-Photos from Pixabay

Deep learning in software engineering has been used extensively and there is a significant body of research about this topic. In this post, I would like to share my review of the recent systematic review on the use of DL in SE.

The interesting finding is the list of data sources for data for DL. Here, the source code data is prevalent. This is not surprising as we have GitHub with millions of repositories. The second largest is the repository metadata, again, for the same reason.

Although it is not surprising, it is really good to see this. I see it as a change in the research focus in the last 10 years. It shifted from the research on bugs and bug reports to the research on source code. I’m happy because helping out with the source code is the real improvement of the product, not an improvement of the process.

Another interesting finding is the use of natural language techniques as the most common ones, here I cite the paper: “Our analysis found that, while a number of different data pre-processing techniques have been utilized, tokenization and neural embeddings are by far the two most prevalent. We also found that data-preprocessing is tightly coupled to the DL model utilized, and that the SE task and publication venue were often strongly associated with specific types of pre-processing techniques.

I recommend to read the article to get more insight into DL models used, which are quite many – from the standard cNN to more advanced GANs and AutoEncoders. Really nice!

Finally, the paper ends with a recommendation on how to use DL in other contexts, kind of flowchart. I do not want to copy it here, so I recommend to take a look at it in the paper: https://arxiv.org/pdf/2009.06520.pdf

Classifying SE research – an article review

Image by Gerhard G. from Pixabay

https://link.springer.com/article/10.1007/s10664-020-09858-z?utm_source=toc&utm_medium=email&utm_campaign=toc_10664_25_5&utm_content=etoc_springer_20200904

Initially, I did not really think that I would put this article on the blog. I actually thought about using it in my writing advice page. However, I’ve read it and then I realized that it’s actually more suitable for this blog.

This article shows how we can classify software engineering research. It has a nice framework described in Figure 1. It organizes the framework around the concepts of who the main beneficiary is – e.g. human, system or researcher (yes, it is a different category!), type of research contribution and which research strategies are used.

It’s an article that complements the work of our colleagues from Lund University on the design of design science research studies and the construction of graphical abstracts.

Although the work seems to be obvious when you are a seasoned researcher, I need to be reminded sometimes about what kind of study that I want/need to conduct. Therefore I recommend this as a reading to both PhD candidates, master students and also advanced researchers. Using the classification scheme will definitely help us to understand each other better and to reduce the burden of paper reviews!

Action research in open source – a nice article

Image by Amílcar Vanden-Bouch from Pixabay

https://link.springer.com/article/10.1007/s10664-020-09849-0?utm_source=toc&utm_medium=email&utm_campaign=toc_10664_25_5&utm_content=etoc_springer_20200904

I’ve been looking out for good examples of articles about action research in software engineering for a while. There is a lot of those coming from the participatory design community and ethnography in software engineering.

This paper is an example of how one can conduct action research together with an open source community. It shows how to conduct the research while being part of the community and adds a new angle on the topic – how do we democratize the research design. In contrast to company-based development, an open source community is free to accept the new ways of working or not. Therefore, it can be challenging to make the action happen.

Figure 1 from the paper shows the process in more detail and I strongly recommend to take a look at it. It starts from the design of intervention, where community requirements, similar communities, best practices and problems are inputted. This similar communities precedence is new and important as it helps to leverage already adopted good practices.

The evaluation of the methodology was already done and it shows that it’s a valid and interesting new research method!

Abstract: Participatory Action Research (PAR) is an established method to implement change in organizations. However, it cannot be applied in the open source (FOSS) communities, without adaptation to their particularities, especially to the specific control mechanisms developed in FOSS. FOSS communities are self-managed, and rely on consensus to reach decisions. This study proposes a PAR framework specifically tailored to FOSS communities. We successfully applied the framework to implement a set of quality assurance interventions in the Robot Operating System community. The framework we proposed is composed of three components, interventions design, democratization, and execution. We believe that this process will work for other FOSS communities too. We have learned that changing a particular aspect of a FOSS community is arduous. To achieve success the change must rally the community around it for support and attract motivated volunteers to implement the interventions.

Technical debt from the perspective of practitioners – article review

Image by Steve Buissinne from Pixabay

https://link.springer.com/article/10.1007/s10664-020-09832-9?utm_source=toc&utm_medium=email&utm_campaign=toc_10664_25_5&utm_content=etoc_springer_20200904

Technical debt is a great metaphor in software engineering. It provides software engineers with the toolkit to communicate how bad design can affect the product in a long run, and how much it can cost to fix these problems. The metaphor has been implemented in many static analysis tools like SonarQube.

Despite its power in communicating, its not clear whether this metaphor is actually useful. It has some dark sides, which makes it a bit tricky to use it. For example, the “conversion” from a problem to the debt, e.g. lack of getter and setter methods to 0.5 days in debt, is one of these challenges. Its also not always clear which of the technical debt categories apply to which products.

What I like about this paper is that it presents a survey of technical debt. For example, it identifies the top causes of technical debt, such as:

  • deadlines,
  • inappropriate planning,
  • lack of knowledge, and
  • lack of well-defined process.

These challenges are present in most companies today, and the first two – deadlines and inappropriate planning – are often associated with start-ups and agile organizations. I recommend to take a closer look at the mindmap in the paper (Fig. 5) to dive deeper into the causes.

Quote from the abstract: We identified a total of 78 causes and 66 effects, which confirm and also extend the current knowledge on causes and effects of TD. Then, we organized the identified set of causes and effects in probabilistic cause-effect diagrams. The proposed diagrams highlight the causes that can most contribute to the occurrence of TD as well as the most common effects that occur as a result of debt.

Being wrong – book review

I’ve picked up this book to continue my reading into our ability to reason and understand the reality around us. I’ve described this journey in my previous posts, e.g. https://metrics.blogg.gu.se/?p=438

This book goes beyond the trust and explores how we can be wrong and how, sometimes, we should actually be wrong. It describes different models of being wrong and also whether it is good to be right or not.

What I find interesting is the systematic overview of the sources of errors and how we deal with it.

I recommend to take a look at the book as a bedtime reading to learn about our view on being wrong. It describes interesting situations and analyzes them from different perspectives.

Finding lines of code that require review – my 100 blog post!

Image by skeeze from Pixabay

Working with continuous integration is an exciting new filed. You get your code into the main branch directly. Well, that’s what the theory says. What you really get is feedback directly, at least the feedback from the automated checks for technical debt, testing and similar.

What you do not get quickly is the review of your code by your colleagues. In larger organizations, things like code reviews do not get prioritized. Therefore they tend to slow down software development rather than speed up!

In this paper, we set of to understand how to fix that. We used Gerrit as the tool to extract lines of code to review, instead of reviewing all of the lines. Here is a short video about this: https://play.gu.se/media/t/0_h7hx95d2

The abstract of the paper is included:

Code reviews are one of the first quality assurance tasks in continuous software integration and delivery. The goal of our work is to reduce the need for manual reviews by automatically identify which code fragments should be further reviewed manually. We conducted an action research study with two companies where we extracted code reviews and build machine learning classifiers (AdaBoost and Convolutional Neural Network — CNN). Our results show that the accuracy of recognizing code fragments that require manual review, measured with Matthews Correlation Coefficient, was 0.70 in the combination of our own feature extraction and CNN. We conclude that this way of combining automation with manual code reviews can improve the speed of reviews while providing organizations with the possibility to support knowledge transfer among the designers.