I’m a professor of Software Engineering at the IT faculty. I usually blog about articles that I find interesting and about my own reflections on the development of Software Engineering, AI, computer science and automotive software.
In academia, the motto is “publish or perish”, with the emphasis on publishing. It’s for a good reason – we, academics, scholars, researchers, exist in a complex network of dependencies. We need others for inspiration, for understanding, and for help when we get stuck.
If you look at the Nobel Prize winners, most of them work together. Listening to them, I get the impression that you cannot become great by sitting in your own room and hatching ideas. But, at the same time, we are often introverts; at least I am.
This book is a great example of how we can build our networks and make meaningful connections. It helped me to realize what meaningful networking is: not the kind where you focus on meeting as many people as possible, or people who are as important as possible, but the kind where you meet all kinds of people and learn from them. It’s about how to identify even a single item of information that you can use in your own work and for your own benefit.
I recommend this as reading for one of those dark autumn evenings that are inevitably coming now.
As a software engineer, I take hardware for granted. Moore’s law has taught me that all kinds of computing power grow. My experience has taught me that all of that computing power is then consumed by frameworks and clouds, and eventually is not enough.
This great book sheds a really interesting light on the way in which materials like lithium and silicon shape our society. We think of TSMC as an isolated company that excelled in chip-making. The reality is that this company is great, but it is also only one in a long chain of suppliers in the chip industry. We learn that the sand which is used to make chips comes from the US, not from Taiwan. We learn that the lithium used in our batteries often comes from the Andes in Chile, not from China. We also learn that the ONLY way for humanity to progress is to collaborate across nations, because no single country in the world has the machinery, the know-how and the competence to develop our modern technology on its own.
It belongs in a series of great reads for software engineers who are starting their studies today.
Although this book is a bit dated – in the sense that we now call everything pre-pandemic dated – it is a great read. It takes us on a journey of inventions in Silicon Valley, although it starts with Ada Byron and her work on computing machines.
I recommend this book because it goes against established theories in academia about innovation – that we innovate individually or in teams. Instead, it takes us on a journey of connections, with research and innovations building on one another. It tells a great story of how the world’s technology evolves by taking one innovation and making another one from it. It is a story of global collaborations and how these collaborations are entangled with one another and support one another.
If it were up to me, this would be mandatory reading for all new students in university software engineering programs.
During the holidays, I’ve had some time to read books outside of software engineering. This one caught my attention, not because I like conspiracy theories, but mostly because I got interested in how theories actually spread.
Now, while this book is about conspiracies, don’t get me wrong, it is also about creating communities and making things spread. Whether this is a conspiracy or maybe just a normal culture, it is interesting to read.
Anyway, if you have some time on your hands, instead of scrolling Instagram or Facebook feeds, take a look at this book. I guarantee that it will be even more fun.
I have written a lot about code reviews. Not because this is my favourite activity, but because I think that there is a need for improvement there. Reading others’ code is not as fun as we may think, and therefore making it a bit more interesting is highly desirable.
This paper caught my attention because of its practical focus. In particular, the abstract caught my attention – the authors claim that changing the order of files presented for review makes a lot of difference. Up to 23% more comments are written when the files are arranged in the right order. Not only that, the quality of the comments seems to increase too. More tips:
Re-order Files Based on Hot-Spot Prediction: The study found that reordering the files changed by a patch to prioritize hot-spots—files that are likely to require comments or revisions—improves the quality of code reviews. Implementing a system that automatically reorders files based on predicted hot-spots could make the review process more efficient, as it leads to more targeted comments and a better focus on critical areas.
Focus on Size-Based Features: The study highlighted that size-based features (like the number of lines added or removed) are the most important when predicting review activities. Emphasizing these features when prioritizing files or creating models for review could further streamline the process.
Utilize Large Language Models (LLMs): LLMs, such as those used for encoding text, have shown potential in capturing the essence of code changes more effectively than simpler models like Bag-of-Words. Incorporating LLMs into the review tools could improve the detection of complex or nuanced issues in the code.
Automate Hot-Spot Detection and Highlighting: The positive impact of automatically identifying and prioritizing hot-spots suggests that integrating such automation into existing code review tools could significantly enhance the efficiency and effectiveness of the review process.
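The re-ordering idea from the tips above can be sketched in a few lines of Python. This is only an illustration, not the model from the paper: the `ChangedFile` class and the `hotspot_score` function are hypothetical names, and the score is simply the number of added plus removed lines, used here as a stand-in for a trained hot-spot predictor because size-based features were reported as the most important ones.

```python
from dataclasses import dataclass


@dataclass
class ChangedFile:
    # Hypothetical record for one file touched by a patch.
    path: str
    lines_added: int
    lines_removed: int


def hotspot_score(f: ChangedFile) -> int:
    # Assumption: churn (added + removed lines) stands in for a trained model.
    return f.lines_added + f.lines_removed


def reorder_for_review(files: list[ChangedFile]) -> list[ChangedFile]:
    # Highest predicted hot-spot first; ties are broken by path for stable output.
    return sorted(files, key=lambda f: (-hotspot_score(f), f.path))


patch = [
    ChangedFile("README.md", 2, 0),
    ChangedFile("src/engine/render.py", 120, 45),
    ChangedFile("src/utils/strings.py", 10, 3),
]

for f in reorder_for_review(patch):
    print(f.path, hotspot_score(f))
```

In a real review tool, `hotspot_score` would be replaced by the prediction of a model trained on past review comments and revisions.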
Sounds like this is one of the examples where we can see the large benefits of using LLMs in code reviews. I hope that this will make it into more companies than Ubisoft (partner on that paper).
I’ve asked ChatGPT to provide me with an example of how to create such a hot-spot model, and it seems that this can be implemented in practice very easily. I will not paste its answer here, but please try for yourself.
Software engineering is becoming a profession where AI can make the most difference. GitHub and the availability of open source code, as well as natural language texts, provided the best possible fuel for creating large and great models. Some predict that we will not need programmers any more, but I predict that we will need them. I also think that these programmers will be much better than they are today. I view the future as quite optimistic and bright for our profession.
To begin with, I see the profession changing so that a single programmer will be more like a team leader for a number of LLMs or bots – hence the title. The programmer will use one model/bot for extracting requirements from standards, documents, tweets or other text sources. The requirement bot will help the programmer to find the user stories/requirements, to prioritize them and to set up a backlog.
Then, the programmer will use another bot to create high-level design. The bot will provide an initial design and will provide the programmer with some sort of conversational interface to reason about the design – the patterns, the architectural styles and the typical non-functional characteristics to maintain for the software.
At the same time, the programmer will use either the same bot or another one to write the source code of the software. We already use tools like GitHub Copilot for these tasks and these tools will become even better. When constructing the software, the programmer will also use testing bots to create test cases and to improve the programs. Here, the work on program repair is definitely very interesting as it provides the ability to automatically improve the low-level design of the software.
Finally, when the software is complete and released, the programmer will use bots to monitor it. Finding defects fast and monitoring the performance, availability and security of the software will be delegated to even more bots. They will do the work faster than any programmer anyway.
The job of the programmer will then be to instruct, integrate and monitor these bots. A little bit like a team leader who needs to make sure that all team members agree on certain principles. This means that programmers will need to know more about the domains where they work, and that they will need access to tooling that supports their workflows.
What will also change is our perception of quality of software. If we can use these bots and make the software construction much faster, then we probably will not need to write super-maintainable code – the bots will be able to decipher even the most obscure code and help us to improve it. Hey, maybe these bots will even write a completely new version of our software once they learn how to optimize their design and when we learn how to link these bots to one another. Regardless, I do not think they will be able to write a Skynet kind of code….
In the rapidly evolving field of artificial intelligence (AI), deep learning (DL) has become a cornerstone, driving advancements based on transformers and diffusers. However, the security of AI-enabled systems, particularly those utilizing deep learning techniques, is still questioned.
The authors conducted an extensive study, analyzing 3,049 vulnerabilities from the Common Vulnerabilities and Exposures (CVE) database and other sources. They employed a two-stream data analysis framework to identify patterns and understand the nature of these vulnerabilities. Their findings reveal that the decentralized and fragmented nature of DL frameworks contributes significantly to the security challenges.
The empirical study uncovered several patterns in DL vulnerabilities. Many issues stem from improper input validation, insecure dependencies, and inadequate security configurations. Additionally, the complexity of DL models makes it harder to apply conventional security measures effectively. The decentralized development environment further exacerbates these issues, as it leads to inconsistent security practices and fragmented responsibility.
It does make sense, then, to put a bit of effort into securing such systems. At the end of the day, input validation is no rocket science.
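To show how little code basic input validation takes, here is a minimal sketch in plain Python. It is not taken from the paper: the function name `validate_image_input`, the 28x28 shape and the [0, 1] value range are all assumptions chosen for illustration, standing in for whatever a real model expects.

```python
def validate_image_input(pixels, height=28, width=28):
    """Reject malformed input before it reaches a (hypothetical) DL model."""
    if not isinstance(pixels, list) or len(pixels) != height:
        raise ValueError(f"expected {height} rows")
    for row in pixels:
        if not isinstance(row, list) or len(row) != width:
            raise ValueError(f"expected rows of {width} values")
        for value in row:
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError("pixel values must be numbers")
            # NaN and infinities fail this range check too.
            if not 0.0 <= value <= 1.0:
                raise ValueError("pixel values must be in [0, 1]")
    return pixels
```

The point is to check shape, type and range at the boundary of the system, so that the model and the framework underneath it never see data they were not designed for.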
So far, we have explored two different kinds of code summarization – either using a pre-trained model or training our own. However, both of them have severe limitations. The pre-trained models are often good, but too generic for the project at hand. The private models are good, but often require a lot of good data and processing power. In this article, the authors propose to use a third way – federated learning.
The results show that:
Fine-tuning LLMs with few parameters significantly improved code summarization capabilities. LoRA fine-tuning on 0.062% of parameters showed substantial performance gains in metrics like C-BLEU, METEOR, and ROUGE-L.
The federated model matched the performance of the centrally trained model within two federated rounds, indicating the viability of the federated approach for code summarization tasks.
The federated model achieved optimal performance at round 7, demonstrating that federated learning can be an effective method for training LLMs.
Federated fine-tuning on modest hardware (40GB GPU RAM) was feasible and efficient, with manageable run-times and memory consumption.
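To make the idea of a federated round more concrete, here is a minimal sketch of the aggregation step. I am assuming a FedAvg-style weighted average here; the summary above does not say which aggregation rule the authors used. The function name `federated_average` and the parameter name `lora_A` are hypothetical, and the parameters are plain lists of floats standing in for the small set of fine-tuned adapter weights.

```python
def federated_average(client_updates):
    """Weighted average of client parameters (FedAvg-style aggregation).

    client_updates: list of (num_examples, params) pairs, where params maps
    a parameter name to a list of floats (e.g. flattened adapter weights).
    """
    total = sum(n for n, _ in client_updates)
    if total <= 0:
        raise ValueError("need at least one training example")
    names = client_updates[0][1].keys()
    averaged = {}
    for name in names:
        size = len(client_updates[0][1][name])
        # Each client contributes in proportion to its amount of data.
        averaged[name] = [
            sum(n * params[name][i] for n, params in client_updates) / total
            for i in range(size)
        ]
    return averaged


# Two hypothetical clients holding different amounts of private code.
updates = [
    (100, {"lora_A": [1.0, 2.0]}),
    (300, {"lora_A": [3.0, 6.0]}),
]
print(federated_average(updates))  # {'lora_A': [2.5, 5.0]}
```

Only the adapter weights travel between the clients and the server in such a setup; the private source code stays where it is, which is what makes the approach attractive for companies.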
I need to take a look at this model a bit more since I like this idea. Maybe this is the beginning of the personalized bot-team that I always dreamt of?
Copilot and other tools are being used increasingly often, but mostly for testing and for programming. Now, the question is whether these kinds of tools help much in tasks such as code review.
This paper explores how developers utilize ChatGPT in the code review process and their reactions to the AI-generated feedback. This research analyzed 229 review comments from 205 pull requests across 179 projects to understand the purposes and effectiveness of ChatGPT in code review.
They found that:
Developers primarily use ChatGPT for two main purposes: referencing and outsourcing. These purposes were further categorized into several sub-categories:
Referencing: Developers used ChatGPT to gain understanding and support their opinions, including tasks such as refactoring, implementation, design, non-programming tasks, testing, documentation, and others.
Outsourcing: Developers directly asked ChatGPT to resolve specific issues. This included implementation, refactoring, bug-fixing, reviewing, testing, design, documentation, and other tasks.
The study found a mixed reaction to ChatGPT-generated reviews, which is not really surprising given that it is a new technology and that code reviews are not only for review, but also for learning:
Positive Reactions (64%): A majority of the responses were positive, indicating that developers found the AI’s suggestions helpful.
Negative Reactions (30.7%): A significant portion of responses were negative. The primary reasons for dissatisfaction included the solutions not bringing extra benefits, containing bugs, or not aligning with developers’ preferred coding styles.
However, what I find really interesting are the following three points:
Enhanced Prompt Strategies: Effective use of ChatGPT requires well-crafted prompts to maximize the AI’s potential in generating useful code reviews.
Tool Integration: Integrating ChatGPT with existing development tools can streamline the review process.
Continuous Monitoring: Regular assessment and refinement of ChatGPT’s outputs are necessary to ensure high-quality code reviews.
These three points are kind of cool, because they mean that we need to learn how to instruct and use these tools. That means that we lose something (knowledge about our products), while we need to learn more general skills about prompting…
Again on the topic of generative AI for programming: I’ve found an interesting article that reviews the state of its adoption. It examines the use of large language models (LLMs) like ChatGPT and GitHub Copilot in software development. In short, the authors find that:
ChatGPT and Copilot dominate code generation on GitHub, primarily for small projects led by individuals or small teams.
These tools are mainly used for Python, Java, and TypeScript, generating short, low-complexity code snippets.
Projects with LLM-generated code evolve continuously but exhibit fewer bug-related modifications.
So, although so many LLMs exist, it is still ChatGPT and Copilot that have the largest share of the market. IMHO this is because of the ecosystem. It’s not enough to have an LLM; we also need to be able to access the internet, interact with the model and get it trained on our own examples.