Happy 2025! Let’s make it a great year full of fantastic research results and great products. How do we achieve that goal? Well, let’s take a look at this paper about guidelines for conducting action research.
These guidelines are based on my experience of working as a software engineer. I started my career in industry, and even after moving to academia I have stayed close to the action – where software gets done. Reflecting on the previous years, I looked at my GitHub profile and realized that only two repositories are used in industry. Both are used by my colleagues from Software Center, who claim that this software has provided them with new, cool possibilities. I need to create more of this kind of impact in 2025.
We often talk about GenAI as if it were going to replace us. Well, maybe it will, but given what I have seen in programming, it will not happen tomorrow. GenAI is good at supporting and co-piloting human programmers and software engineers, but it does not solve complex problems such as architectural design or algorithm design.
In this article, the authors pose an alternative thesis: GenAI should challenge humans to be better and to unleash their creativity. They identify several uses of AI as a provocateur – suggesting better headlines for articles, pointing out untested code or dead code, and posing other kinds of challenges.
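For the untested-code kind of challenge, here is a minimal sketch of how such a provocation could be wired up, assuming a project measured with coverage.py; the challenge_developer function is a hypothetical stand-in for the actual LLM call:

```python
import coverage

def untested_spans():
    """Collect the lines that the test suite never executed, so an
    AI challenger can provoke the developer about them."""
    cov = coverage.Coverage()
    cov.load()  # reads the .coverage data file produced by a test run
    report = {}
    for path in cov.get_data().measured_files():
        # analysis2 -> (filename, statements, excluded, missing, formatted)
        _, _, _, missing, _ = cov.analysis2(path)
        if missing:
            report[path] = missing
    return report

def challenge_developer(report):
    # Hypothetical hook: here one would prompt an LLM, e.g.
    # "These lines in {path} are never tested -- care to explain why?"
    for path, lines in report.items():
        print(f"Never executed in {path}: lines {lines}")

if __name__ == "__main__":
    challenge_developer(untested_spans())
```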
They finish the article with the claim that we, the universities, need to get better at teaching critical thinking. So, let’s do that from the new year!
In this time just before X-Mas, I sat down to read the latest issue of the Communications of the ACM. There are a few very interesting articles there, starting with a piece from Moshe Vardi on the concept of theoretical computer science, through an interesting text on artificial intelligence, to the very interesting article that I’m writing about now.
The starting point of this article is the fact that we, software engineers, are taught that we should talk to our customers, discover requirements together with them and validate our products together with them. At the same time, we design AI engineering software without this in mind. A lot of start-ups (I will not mention any, but there are many) rush into providing tools that use LLMs to support software development tasks such as programming. However, we do not really know what developers want.
In this article, the authors present a survey of almost 1,000 developers on what they want. Guess what – programming is NOT in the top three on this list. Testing, debugging, documentation and code analysis are the top requests. Developers enjoy creating code; what they do not enjoy is finding bugs or testing the software – it takes time and does not feel very productive. Yes, it feels great when you find your bug, and yes, it feels great when the tests finally pass, but it feels even greater when you work on a new feature or requirement.
We follow the same principle in Software Center. When creating new tools, we always ask the companies what they really need and how they need it. Right now, we are working on improving the process of debugging and defect analysis in CI/CD. We started with a survey. You can find it here. Please register if you want to see the results of the survey – and contribute!
With this, I would like to wish you all a Merry Christmas and a Happy New Year. Let’s make 2025 even better than 2024!
I’m a big fan of Yuval Noah Harari’s work. A professor who can write books like no one else, one of my role models. I’ve read Sapiens, Homo Deus and 21 Lessons… now it was time for Nexus.
The book is about information networks and AI. Well, mostly about information networks and storytelling. AI is there, but not as much as I wanted to see. Not that I complain – Harari is a humanist and social scientist, not a software engineer or computer scientist.
The book discusses what information really is and how it evolves over time. It focuses on storytelling and on providing meaning to data and information. It helps us understand the power of stories and the power of information – one could say that “the pen is mightier than the sword”, and this book delivers on that.
I recommend it as reading over X-Mas, now that the holidays are coming.
Quantum computing has been around for a while now. It’s been primarily a playground for physicists and for computer scientists close to mathematics. The major issue was that the error rates and instability of quantum bits prevented us from using this paradigm on a larger scale (at least as I understand it).
Now, it seems that we are getting close to the commercialization of this approach. Several companies are developing quantum chips that will allow us to use this technology in more fields.
The paper that I want to bring up today discusses which challenges we, software engineers, can solve in quantum computing – and it is not programming. We need to work more on requirements, architecture, software reuse and software quality. So, basically, the typical software engineering aspects.
I’ve just finished reading this great book about the way in which the tipping point tips to the wrong side. It’s mostly about the law of “the large effect of the few”, as Malcolm Gladwell puts it. In short, this law means that in certain situations it is a minority that is responsible for large effects. For example, a minority of old, badly maintained cars contributes over 55% of the pollution in one of the US cities. Or one person, a superspreader, ends up in the very specific conditions that allow them to spread the COVID virus at the beginning of the pandemic.
We see the same thing a lot in software engineering when we look at the tooling that we use. Let’s take the CI/CD tool Jenkins as an example. It was one of many tools on the market at that time. It was not even the major one – it was a sibling of a professional tool maintained by Oracle (if I recall correctly). Yet it became very popular and the other tools did not. Since they were siblings, they were not worse, not better either; maybe a little different. What made it tip was the adoption of the tool in the community. A few superspreaders started to use it and discovered how good the tool was for automating CI/CD tasks.
I see the same parallel to AI today. What was it that tipped the use of AI? IMHO it was a few things:
Google’s use of LSTMs in Search – since there was commercial value, it made sense to adopt the technology. Commercial adoption means business value, improvement and management focus (funding).
Big data – after almost a decade of talking about big data, collecting it and indexing it, we were ready to provide the data-hungry modules with the data they needed to do something useful.
HuggingFace – our ability to share models and use them without needing costly GPUs and large (and good) datasets (see the sketch after this list).
Access to competence – since we have so many skilled computer scientists and software engineers, it was easy to get hold of the competence needed to turn ideas into products. Google’s DeepMind is a perfect example of it; the people behind it got the Nobel Prize.
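To make the HuggingFace point concrete, here is a minimal sketch of pulling a small, community-shared model and running it on a plain CPU – the model name and prompt are just illustrative choices of mine:

```python
from transformers import pipeline

# Download a small, community-shared model and run it on the CPU;
# "distilgpt2" is just an illustrative lightweight choice.
generator = pipeline("text-generation", model="distilgpt2", device=-1)
print(generator("Continuous integration is", max_new_tokens=20)[0]["generated_text"])
```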
Well, the rest is history, as they say… But what will the next invention on the verge of the tipping point be?
During the holidays, I’ve had some time to read books outside of software engineering. This one caught my attention, not because I like conspiracy theories, but mostly because I got interested in how theories actually spread.
Now, while this book is about conspiracies – don’t get me wrong – it is also about creating communities and making things spread. Whether it describes a conspiracy or just a normal culture, it is interesting to read.
Anyways, if you have some time on your hands, instead of scrolling Instagram or Facebook feeds, take a look at this book. I guarantee it will be even more fun.
Engineering software is becoming a profession where AI can make the most difference. GitHub and the availability of open source code, as well as natural language texts, provided the best possible fuel for creating large and great models. Some predict that we will not need programmers any more, but I predict that we will. I also think that these programmers will be much better than they are today. I view the future of our profession as optimistic and bright.
To begin with, I see the profession changing in such a way that one single programmer will be more like a team leader for a number of LLMs or bots – hence the title. The programmer will use one model/bot for extracting requirements from standards, documents, tweets or other text sources. The requirements bot will help the programmer to find the user stories/requirements, to prioritize them and to set up a backlog.
Then, the programmer will use another bot to create a high-level design. The bot will provide an initial design and give the programmer some sort of conversational interface to reason about it – the patterns, the architectural styles and the typical non-functional characteristics to maintain for the software.
At the same time, the programmer will use either the same bot or another one to write the source code of the software. We already use tools like GitHub Copilot for these tasks, and these tools will only get better. When constructing the software, the programmer will also use testing bots to create test cases and improve the programs. Here, the work on program repair is definitely very interesting, as it provides the ability to automatically improve the low-level design of the software.
Finally, once the software is complete and released, the programmer will use bots to monitor it. Finding defects fast and monitoring the performance, availability and security of the software will be delegated to even more bots. They will do that work faster than any programmer anyway.
The job of the programmer will then be to instruct, integrate and monitor these bots – a little bit like a team leader who needs to make sure that all team members agree on certain principles. This means that the programmers will need to know more about the domains where they work, and they will need access to tooling that supports their workflows.
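Here is a minimal sketch of what such a bot team could look like in code; all names are hypothetical, and the ask() method stands in for whatever LLM API one would actually call:

```python
from dataclasses import dataclass, field

@dataclass
class Bot:
    """A hypothetical wrapper around one LLM with a fixed role."""
    role: str

    def ask(self, prompt: str) -> str:
        # Stand-in for a real LLM call (OpenAI, a local model, ...).
        return f"[{self.role}] response to: {prompt[:40]}..."

@dataclass
class BotTeam:
    """The programmer as team leader: instruct, integrate, monitor."""
    requirements: Bot = field(default_factory=lambda: Bot("requirements"))
    design: Bot = field(default_factory=lambda: Bot("design"))
    coder: Bot = field(default_factory=lambda: Bot("coding"))
    tester: Bot = field(default_factory=lambda: Bot("testing"))
    monitor: Bot = field(default_factory=lambda: Bot("monitoring"))

    def deliver(self, sources: str) -> dict:
        # Each step hands the previous bot's output to the next one.
        backlog = self.requirements.ask(f"Extract user stories from: {sources}")
        architecture = self.design.ask(f"Propose a high-level design for: {backlog}")
        code = self.coder.ask(f"Implement: {architecture}")
        tests = self.tester.ask(f"Write test cases for: {code}")
        health = self.monitor.ask(f"Watch performance and security of: {code}")
        return {"backlog": backlog, "design": architecture,
                "code": code, "tests": tests, "monitoring": health}

if __name__ == "__main__":
    print(BotTeam().deliver("standards, documents, tweets"))
```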
What will also change is our perception of software quality. If we can use these bots to make software construction much faster, then we probably will not need to write super-maintainable code – the bots will be able to decipher even the most obscure code and help us improve it. Hey, maybe these bots will even write a completely new version of our software once they learn how to optimize their design and once we learn how to link the bots to one another. Regardless, I do not think they will be able to write a Skynet kind of code…
In the rapidly evolving field of artificial intelligence (AI), deep learning (DL) has become a cornerstone, driving advancements based on transformers and diffusers. However, the security of AI-enabled systems, particularly those utilizing deep learning techniques, is still questioned.
The authors conducted an extensive study, analyzing 3,049 vulnerabilities from the Common Vulnerabilities and Exposures (CVE) database and other sources. They employed a two-stream data analysis framework to identify patterns and understand the nature of these vulnerabilities. Their findings reveal that the decentralized and fragmented nature of DL frameworks contributes significantly to the security challenges.
The empirical study uncovered several patterns in DL vulnerabilities. Many issues stem from improper input validation, insecure dependencies, and inadequate security configurations. Additionally, the complexity of DL models makes it harder to apply conventional security measures effectively. The decentralized development environment further exacerbates these issues, as it leads to inconsistent security practices and fragmented responsibility.
It does make sense, then, to put a bit of effort into securing such systems. At the end of the day, input validation is no rocket science.
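As a tiny illustration of that point, here is a minimal input-validation sketch for a DL inference endpoint; the expected shape and value range are my assumptions, not something from the paper:

```python
import numpy as np

EXPECTED_SHAPE = (1, 3, 224, 224)   # assumed model input: one RGB 224x224 image
VALUE_RANGE = (0.0, 1.0)            # assumed normalized pixel values

def validate_input(x: np.ndarray) -> np.ndarray:
    """Reject malformed tensors before they ever reach the model."""
    if not isinstance(x, np.ndarray):
        raise TypeError("input must be a numpy array")
    if x.dtype != np.float32:
        raise ValueError(f"expected float32, got {x.dtype}")
    if x.shape != EXPECTED_SHAPE:
        raise ValueError(f"expected shape {EXPECTED_SHAPE}, got {x.shape}")
    if not np.isfinite(x).all():
        raise ValueError("input contains NaN or Inf")
    lo, hi = VALUE_RANGE
    if x.min() < lo or x.max() > hi:
        raise ValueError(f"values outside [{lo}, {hi}]")
    return x

# Usage: model.predict(validate_input(request_tensor))
```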
So far, we have explored two different approaches to code summarization – either using a pre-trained model or training our own. However, both of them have severe limitations. The pre-trained models are often good, but too generic for the project at hand. The private models are good, but often require a lot of good data and processing power. In this article, the authors propose a third way – federated learning.
The results show that:
Fine-tuning LLMs on a small fraction of their parameters significantly improved code summarization capabilities. LoRA fine-tuning on 0.062% of the parameters showed substantial performance gains in metrics like C-BLEU, METEOR, and ROUGE-L (see the sketch after this list).
The federated model matched the performance of the centrally trained model within two federated rounds, indicating the viability of the federated approach for code summarization tasks.
The federated model achieved optimal performance at round 7, demonstrating that federated learning can be an effective method for training LLMs.
Federated fine-tuning on modest hardware (40GB GPU RAM) was feasible and efficient, with manageable run-times and memory consumption.
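To make the idea more tangible, here is a minimal sketch of the two ingredients – LoRA adapters and federated averaging of only those adapters – assuming the transformers and peft libraries; the base model and the hyper-parameters are illustrative choices of mine, not the paper’s exact setup:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

def make_client_model(base="Salesforce/codegen-350M-mono"):
    """One client's copy of the base model with small LoRA adapters;
    only the adapters are trained, the base weights stay frozen."""
    model = AutoModelForCausalLM.from_pretrained(base)
    cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                     task_type="CAUSAL_LM")
    return get_peft_model(model, cfg)

def fedavg_lora_round(client_models):
    """One federated round: average only the LoRA adapter tensors,
    so clients share tiny deltas instead of their private code."""
    states = [m.state_dict() for m in client_models]
    avg = {name: torch.stack([s[name].float() for s in states]).mean(0)
           for name in states[0] if "lora" in name}
    for m in client_models:
        m.load_state_dict(avg, strict=False)
    return client_models
```

The point of averaging only the adapter tensors is that each client keeps its code and its base model local, and only a few megabytes of deltas travel between the rounds.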
I need to take a look at this model a bit more since I like this idea. Maybe this is the beginning of the personalized bot-team that I always dreamt of?