Merry X-mas and the next year with AI

Image by Peter Pieras from Pixabay

Sparse reward for reinforcement learning-based continuous integration testing – Yang – Journal of Software: Evolution and Process – Wiley Online Library

This is the last post that I want to write in 2021. The year has been hectic and full of surprises. First, we got the news that the vaccine against Covid-19 works. We all prepared for normalization – for being able to travel, visit friends and family, and attend conferences in person.

Then came the new variants, like Omicron, which seem to escape the vaccine, and countries are still not ready to open up. Conferences get postponed, trips are canceled. I hope this is just a temporary situation and that we will be able to get things under control again.

For the last post of 2021, I chose one of the articles that I’ve read recently – about the use of reinforcement learning in continuous integration testing. It is kind of a different approach from what we do in the Software Center project.

This paper tackles the problem of sparse rewards for fitness functions when using reinforcement learning for test selection. It proposes a combination of historical data and a function that assigns a higher reward to non-sparse data. The work looks very promising, as it has been tested on 14 different industrial data sets. I need to check it during the coming holidays. It’s a project to do for X-mas.

With that, I would like to thank all of you for being here with me during 2021 and hope that we can continue in 2022. I wish you all great holidays and the best of luck in the coming year!

From the abstract:

“Reinforcement learning (RL) has been used to optimize the continuous integration (CI) testing, where the reward plays a key role in directing the adjustment of the test case prioritization (TCP) strategy. In CI testing, the frequency of integration is usually very high, while the failure rate of test cases is low. Consequently, RL will get scarce rewards in CI testing, which may lead to low learning efficiency of RL and even difficulty in convergence. This paper introduces three rewards to tackle the issue of sparse rewards of RL in CI testing. First, the historical failure density-based reward (HFD) is defined, which objectively represents the sparse reward problem. Second, the average failure position-based reward (AFP) is proposed to increase the reward value and reduce the impact of sparse rewards. Furthermore, a technique based on additional reward is proposed, which extracts the test occurrence frequency of passed test cases for additional rewards. Empirical studies are conducted on 14 real industry data sets. The experiment results are promising, especially the reward with additional reward can improve NAPFD (Normalized Average Percentage of Faults Detected) by up to 21.97%, enhance Recall with a maximum of 21.87%, and increase TTF (Test to Fail) by an average of 9.99 positions.”
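The paper defines its rewards formally; purely as an illustration of the problem it attacks, here is a toy sketch in Python. With a plain pass/fail reward, almost every CI cycle returns zero and the agent has little to learn from; folding in each test’s historical failure density keeps the signal dense. The test names, histories, and weighting below are invented and are not the paper’s HFD or AFP definitions.

```python
# Toy sketch of the sparse-reward problem in RL-based test prioritization.
# Histories and weighting are invented; NOT the paper's HFD/AFP rewards.

# 1 = test case failed in that CI cycle, 0 = it passed.
history = {
    "test_a": [0, 0, 0, 0, 1, 0, 0, 0],
    "test_b": [0, 0, 0, 0, 0, 0, 0, 0],
    "test_c": [1, 0, 0, 1, 0, 0, 0, 0],
}

def sparse_reward(test: str, cycle: int) -> float:
    """Naive reward: 1 only when the test fails now - zero almost everywhere."""
    return float(history[test][cycle])

def history_based_reward(test: str, cycle: int) -> float:
    """Denser reward: current failure plus the test's historical failure density."""
    past = history[test][:cycle]
    density = sum(past) / len(past) if past else 0.0
    return float(history[test][cycle]) + density

cycle = 7
for test in history:
    print(test, sparse_reward(test, cycle), round(history_based_reward(test, cycle), 2))
```

With the sparse reward, all three tests look identical in cycle 7; the history-based reward still ranks test_c above test_a and test_b.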

A Friday research and pedagogy reflection post…

Image from Pixabay

It’s Friday again and I’m trying to pack things up for the weekend. While doing that, I reflected a bit on the week that passed. It started with meetings on research directions, but it ended with discussions and thoughts about pedagogy.

At the beginning of the week, I focused on preparing for an evaluation of a tool, read about VAEs (variational autoencoders) and the disentanglement problem, and looked at new datasets. It’s all cool, interesting, and kind of on the cutting edge. It is also at the stage where it works mostly for well-known, annotated datasets, and a bit worse for datasets that come from real life – e.g., from driving a car in a city, where there are tens of objects in each picture.

However, my week ended with talking about pedagogy. I had a chance to listen to our excellent teachers at the University of Gothenburg and hear their reflections on the year that passed. To be honest, I did not see that coming and did not expect what I heard. Many positive things, but also a confirmation that we, as a university, focus too little on pedagogy and teaching. It’s the third time I have had reason to reflect on this, so I need to do something about it.

Second, I also listened to, and reflected upon, the challenges of Ph.D. students today. They need to publish at an increasingly high tempo. As our discipline matures, the quality of publications increases, and so do the requirements on Ph.D. students. They also face an uncertain future, as research funding decreases, the number of positions shrinks, and tenure-track positions are no longer “forever”.

There were also highlights this week. We had a great discussion in one of our steering groups about the companies involved in our research (which is impressive). We also got a number of newly associated research projects and new research results and, finally, the ALC (Active Learning Classroom) has been finished.

With that, my friends, I’m off for the weekend.

Noisy data, biased data – book review

Image by Aaron J from Pixabay

Noise: A Flaw in Human Judgment: Kahneman, Daniel, Sibony, Olivier, Sunstein, Cass R.: 9780316451406: Amazon.com: Books

It’s been a while since I wrote my last post. Well, hectic times, I guess. Old friends leaving the spot, new friends entering it – the life of a researcher.

While working on my recent research projects, I have been wondering about one thing – is there a correlation between noise in data and noise in judgements/decisions?

Let me explain the problem first. In a perfect world, in a galaxy far, far away, all data is perfect. All pictures are labelled correctly, natural language has a formal meaning, and all data points are assigned to their classes perfectly. In this perfect world, the interpretation of the data is also unambiguous and independent of who does the interpretation. In that perfect world, machines can make all the decisions and we, as humans, can relax.

But we do not live in that perfect world. In our world, data is not always correct and language is imprecise. As humans, we are also biased by many factors. In this world of ours, a lot of things are a “judgement call”, which means that training a machine to make the decisions is not always the right choice.

So, I was thinking: if we clean up the noise, will the decisions be unbiased? If we train the people making decisions, will the decisions be more correct?

I’ve looked at one of the recent works of the Nobel Prize winner Daniel Kahneman and his colleagues. They describe what noise and bias are, where they come from, and how to find them. The book builds upon the principles of statistical error (and its measurement), as well as our ability to handle that error through the “wisdom of the crowd”. It also shows how using more processes reduces bias and introduces order into the chaos of our galaxy.
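As a toy illustration of the “wisdom of the crowd” effect the book builds on – and of the difference between noise and bias – here is a small simulation. All the numbers are invented; the only point is that averaging independent judgements shrinks the noise but leaves the shared bias untouched.

```python
import numpy as np

rng = np.random.default_rng(42)

true_value = 100.0   # the quantity being judged
noise_sd = 20.0      # judge-to-judge variability (noise)
bias = 10.0          # shared systematic error (bias)
n_judges = 50

# Each judgement = truth + shared bias + individual noise.
judgements = true_value + bias + rng.normal(0.0, noise_sd, n_judges)

print(f"one judge's error: {abs(judgements[0] - true_value):.1f}")
print(f"crowd-mean error:  {abs(judgements.mean() - true_value):.1f}")
# The crowd mean cancels noise roughly as noise_sd / sqrt(n_judges),
# but the shared bias of 10 never averages away.
```

The simulation mirrors the book’s point that noise and bias are different problems, and need different remedies.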

I would like to leave you with this thought – we have the whole Agile software development movement, focused on humans and products, not processes. But if it is the processes that actually bring some order, aren’t we just introducing more chaos by being more Agile?

“That will never work” – A book about Netflix

That Will Never Work: The Birth of Netflix by the first CEO and co-founder Marc Randolph : Randolph, Marc: Amazon.se: Böcker

Building a successful start-up seems like a really cool idea – from a distance. I used to teach a course about entrepreneurship, start-ups, business models, and the like. Although it was nice, I always felt that I was a person who knew absolutely nothing about this. At least not in practice…

In this book, the original founder of Netflix tells the story of how he took the idea and made it into a product. He tells how the idea hatched and how he and his team created a data-driven model for understanding their customers. The book is also about the struggles of start-ups – about taking on investments from the beginning and then being pushed out of the company. It’s about being able to understand what’s best for the company and what’s best for the individual.

I like the way the author describes the story and also shows a bit of himself: how he felt, how he wanted to build the company, and how he decided when to leave (with grace!). I also like the ending of the book – “Nobody knows anything!” – a saying that means you never really know what will and will not work in the end.

I recommend this as a Sunday reading to get inspired.

Are software architecture and code the same?

Image by Stefan Keller from Pixabay

Relationships between software architecture and source code in practice: An exploratory survey and interview – ScienceDirect

Software architecting is one of the crucial activities for the success of your product. There is the BAPO model, where B stands for Business and A for Architecture – and there is a good reason why architecture comes in second place. It should not dictate your business model, but it should support it.

Well, it is also good that architecture comes before processes and organization. If software is your product, then its architecture should dictate how you work and how you are organized.

But how about the software code? For many programmers and designers, the architecture is a set of diagrams that show logical blocks and software organization, but those diagrams are not the ACTUAL code, not the product itself. In one of our research projects we study exactly that kind of problem – how to ensure that we keep the two aligned or, more accurately, how we can use machine learning to keep the code and the architecture synchronized.

Note that I use the word synchronized, not aligned or updated. This is to avoid one of the many misconceptions about software architectures – that they are set once and for all. Such an assumption is true for the architecture of buildings, but not for software. We are, and should be, more flexible than that.
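To make the synchronization problem concrete, here is a toy, non-ML sketch of the kind of check involved: compare the dependencies the architecture allows with the dependencies the code actually contains. The module names and the allowed map are made up for illustration; the research question is how machine learning can keep such a table up to date instead of a human.

```python
from typing import Dict, Set

# Intended architecture: which modules MAY depend on which (made up).
allowed: Dict[str, Set[str]] = {
    "ui": {"services"},
    "services": {"domain"},
    "domain": set(),
}

# Dependencies actually extracted from the code, e.g. from import statements.
actual: Dict[str, Set[str]] = {
    "ui": {"services", "domain"},  # "ui" -> "domain" skips a layer
    "services": {"domain"},
    "domain": set(),
}

def violations(allowed: Dict[str, Set[str]], actual: Dict[str, Set[str]]):
    """Return code-level dependencies that the architecture does not permit."""
    return {
        (src, dst)
        for src, deps in actual.items()
        for dst in deps
        if dst not in allowed.get(src, set())
    }

print(violations(allowed, actual))  # {('ui', 'domain')}
```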

In one of the latest issues of Information and Software Technology, I found this interesting study about how architects and programmers perceive software architectures. It shows how architectures evolve and why they are often outdated. It is a survey, and I really like where it’s going. I strongly recommend reading it if you are into software architectures, programming, and the technical side of software engineering…

Cybersecurity, security, and safety…

Image by Robinraj Premchand from Pixabay

During the spring semester, my students did great work looking into the security of a car’s electrical system. They managed to decode signals, understand high-level data, and make small changes to the car’s functions.

It all sounds great as a thesis project. Both the students and the company loved it. It was challenging, it was new, it was useful. But I’m not writing this post about that. I want to write about what has happened, or not happened, after that.

In the months after the thesis, I decided to look into mechanisms for designing and implementing secure software. Being a programmer at heart, I turned to GitHub for help. I searched for tools and libraries for secure software design. I know I could have searched for something different, but let’s start there.

The results were mostly analysis frameworks. There were more of them, but most were of the same kind, and I was a bit amazed by how little there is outside of web design. I also looked at some of the research in this area (no systematic review – I promised myself not to do one). There I found all kinds of work, but mostly theoretical. The areas of interest:

  • Cryptography: how to encode/decode information, keys, and passwords.
  • Secure software design: mostly analysis of vulnerabilities.
  • Secure systems: mostly about passwords and vulnerabilities.
  • Privacy: how to keep private information hidden from third parties (kind of security, but mostly something else – I’m still waiting to understand what).
  • Legacy operations: how to make software long-lived and provide it with a secure infrastructure.
  • Infrastructure: security of cloud environments, end-to-end security.

Since I have worked with software safety, I thought that security would be very similar. However, it was not. The safety community discusses mostly standardization, hazards, and risks – very little about code analysis, finding unsafe code, and so on. So, mostly something different.

I’ll keep digging, and I will run a few experiments with some of my students to understand what the technology could look like. However, I’m not as optimistic as I was at the beginning of my search.

Open or closed – how we can leverage innovation through collaboration (book review)

Open: The Story of Human Progress : Norberg, Johan: Amazon.se: Böcker

Progress and innovation are very important for the development of our societies. Software engineers focus on progress in technology, software, frameworks, and the ways we develop software.

This book is about openness and closedness in modern society. It is a story showing how we benefit from being open and collaborative. I could not stop myself from drawing parallels to the original work about open software – “The Cathedral and the Bazaar” by Eric Raymond. Although a bit dated, that book opened my eyes to the open-source movement.

We take for granted that we have Linux, GitHub, StackOverflow and all other tools for open collaboration, but it wasn’t always like that. The world used to be full of proprietary software and software engineers were people who turned requirements into products. It was the mighty business analysts who provided the requirements.

Well, we know that it does not work like that. Software engineers often work on products – they take ownership of these products and feel proud to create them. It turns out that openness is the way to go here – when software engineers share code, they feel that they contribute to something bigger. When they keep the code to themselves, … well, I do not know what they feel. I like to create OSS products and Docker containers and distribute them. It kind of feels better that way!

How to get your pull request merged quickly… (article highlight)

Image source: pixabay.com

How Developers Modify Pull Requests in Code Review | IEEE Journals & Magazine | IEEE Xplore

I must admit that I’m not the greatest contributor to OSS projects. Yes, I did a few of those and contributed to projects, but it is more of a hobby than real work. My goal for 2022 is to do better and even put together some Docker containers to make my scripts more reusable. I even bought a book about Docker, which I’ve read, and (theoretically) I’m good to go.

Anyway, I stumbled upon this work about how developers make good pull requests. The paper examined OSS projects and found that if you make a clear change in the pull request, and classify that change clearly, you have a high chance that the pull request will be merged quickly.

Stay tuned for more on code reviews…

Predicting defects, but continuously (article highlight)

Continuous Software Bug Prediction (yorku.ca)

Although a lot has been written about predicting defects, the problem is still relevant. Some systems have more defects than others. In academia, we can do two things – educate young engineers in making better software, or construct models for predicting where and when to find defects.

A lot of the work on developing defect prediction models focuses on more-or-less randomly chosen releases. However, software development is not random, but structured, and often continuous. This means that it’s important to understand that not all defects are found in the same release/patch/commit as the one in which they were introduced (BTW: there is a lot of work on this aspect too).

In this work, the authors analyze 120 continuous releases of six software products and demonstrate the value of their prediction models. The novelty of the approach is a system that checks whether releases are similar to one another based on their distributional characteristics, so that the prediction models are tuned to each release based on those characteristics. The characteristics are mostly well-known metrics, like the average cyclomatic complexity of a file, the MaxInheritanceTree of a class, etc. – easy to collect and analyze, and a lot of tools can be used for that.
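This is not the authors’ exact algorithm, but the core idea can be sketched in a few lines: given a metric distribution per release, pick the historical release closest in distribution to the new one and train on its data. The data below is synthetic, and the Kolmogorov–Smirnov statistic is just one plausible choice of similarity measure.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# File-level cyclomatic complexity per historical release (synthetic data).
past_releases = {
    "r1": rng.gamma(2.0, 3.0, 200),
    "r2": rng.gamma(4.0, 2.0, 180),
    "r3": rng.gamma(2.2, 3.1, 220),
}
new_release = rng.gamma(2.1, 3.0, 210)

def most_similar(past: dict, new: np.ndarray) -> str:
    """Pick the release with the smallest Kolmogorov-Smirnov distance to `new`."""
    return min(past, key=lambda r: ks_2samp(past[r], new).statistic)

print("train the defect model on release:", most_similar(past_releases, new_release))
```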

The results, in short, show that the new method is better than randomly choosing a release or bagging releases. The results differ per project, but the approach is better than the other two, across the board.

I like the approach and will try it the next time I get my hands on software defects, issues, and challenges. Let’s see when that happens :)

Guiding the selection of research methodologies (article highlight)

Image by Gerd Altmann from Pixabay

Guiding the selection of research methodology in industry–academia collaboration in software engineering – ScienceDirect

Research methodology is something that we must follow when conducting research studies. Without a research methodology, we just search for something, and if we find it, we do not know whether the finding is universal, true, or even whether it really exists…

In my early works, I got really interested in empirical software engineering, in particular in experimentation. One of the authors of this article was one of my supervisors and I fell for his way of understanding and describing software engineering – as an applied area of research.

Over time, I realized that experimentation is great, but it was still not 100% what I wanted. I understood that I would like to see more collaboration with software engineers in industry – those who make their living by programming, architecting, testing, and modifying code. I did a study at one of the vehicle manufacturers in Sweden, where I studied the complexity of the entire car project. There I understood that software engineering needs to be studied and practiced in industry. Academia is the place where we shape young minds, where we can gather multiple companies to share their experiences, and where we can turn findings from individual cases into universal laws.

In this article, the authors discuss research methodologies applicable to industrial, or industry-close, research. They even discuss one of the technology transfer models as a way of research co-production and co-validation.

The authors conclude this great overview in the following way (from the conclusions):

When it comes to differences, the three methodologies differ in their primary objective: DSM on acquiring design knowledge through the design of artifacts, AR on change in socio-technical systems, and TTRM on the transfer of research to industry. The primary objective of one methodology may be a secondary objective in another. Thus, the differences between them are more in their focus than in which activities they include.

In our analysis and comparison of their feasibility for industry–academia collaboration in software engineering research, the selection depends on the primary objective and scope of the research (RQ3). We, therefore, advice researchers to consider the objectives of their software engineering research endeavor and select an appropriate methodological frame accordingly. Furthermore, we recommend studying different sources of information concerning, in particular, the chosen research methodology to better understand the methodology before using it when conducting industry–academia collaborative research.

I will include this article as mandatory reading in my AR Ph.D. course in the future.