A lot of software engineering research studies use open source data and mine software repositories. It’s a common practice, since it allows us to test our hypotheses before asking for precious resources from our collaborating companies. By mining open source data we can also learn whether our study makes sense; we can see it as a pilot study of sorts.
Mining software repositories has evolved into a popular activity since we got access to platforms like Github. There are even guidelines for assessing these kinds of studies, e.g., https://sigsoft.org/EmpiricalStandards/docs/, and there are regulations on what we can do with open source data – these can be in the form of a license, a law (like the GDPR or the CCPA), or the need to ask an ethical board for approval. However, there is also common sense – not everything that is legal is appropriate or ethical. We always need to ensure that no individual can be subjected to any harm as a result of our actions.
In the article that I want to bring up today, the authors discuss ethical frameworks for software engineering studies based on open source repositories. We need to make sure that we uphold:
Respect for persons, which stresses the need for approval and consent.
Beneficence, which means that we need to minimize harm and maximize benefit.
Justice, which means that we need to consider each individual equally.
Respect for law and public interest, which entails conducting due diligence on which data we can use and in which way.
The most interesting part of this article is the analysis of different cases of mining software repositories – for example, analyzing the code, reviews, commit messages, and other types of data in the repositories.
I recommend this article to everyone who is considering working with mining software repositories.
GPT technology, exemplified by Github Copilot and its likes, is changing software engineering from the ground up. There is no doubt that the technology places a new tool in our engineering toolshed. It allows us to create software with a completely different set-up than what we are used to.
Now, it really changes only a few things, but these are very big ones.
Programmers —> designers and architects. GPT can write source code like no other tool on the market. And it only gets better at it. A quick glimpse at the Github Next website makes it clear that this team has only just got started. This changes everything we know about engineering software. Bad programmers will disappear over time. Good software designers, architects and software engineers will take their place. They will be fewer in number, but better in quality.
Software development —> software engineering. Designers will no longer get stuck solving a small bit of the puzzle. GPT will do it for them. Instead of thinking about how to write a test case, designers will think about how to test the software in the best possible way. They will focus on the engineering part of software engineering – something that I have been teaching my students from day one.
Consultancy —> knowledge hubs. Since programming will become easier and more approachable, we will need people who know how to solve a problem, not how to write a program. This big chunk of the consultancy companies’ business will disappear. The consultancy companies will instead specialize in their domains and in problem-solving.
Other things will change as well. Requirements will not stay the same as they are today. Testing will be different, architecting will be smarter, and management more effective. Knowledge will be more valued, and critical thinking will be needed even more.
Well, this is my end of the academic year blog post. More to come after the summer. Stay safe!
Interestingly, this is a paper from colleagues of ours from the department. The paper presents how one company – Ericsson – works with continuous deployment of their large software system in 3G RAN (Radio Access Networks). The highlights from the article are as follows:
New software field testing and validation activities become continuous.
Software deployment should be orchestrated between the constituent systems.
A pilot customer to partner with is key for success.
Companywide awareness and top management support are important.
Documentation and active monitoring are critical for continuous deployment.
I like this paper because it presents a practical approach and a good set of practices that can be taken up by other companies.
In the era of ChatGPT and increasingly large language models, it is important to understand how these models reason. Not only because we want to put them in safety-critical systems, but mostly because we need to know why they make things up.
In this paper, the authors draw conclusions regarding how to increase the transparency of AI models. In particular, they highlight that:
The AI ethical guidelines of 16 organizations emphasize explainability as the core of transparency.
When defining explainability requirements, it is important to use multi-disciplinary teams.
They define a four-quadrant model for the explainability of requirements and AI systems. The model links four key questions to a number of aspects:
What to explain (e.g., roles and capabilities of AI).
In what kind of situation (e.g., when testing).
Who explains (e.g., AI explains itself).
To whom to explain (e.g., customers).
It’s an interesting read that takes AI systems to a more practical level and provides the ability to turn explainability into software requirements.
Wow, when I look at the last entry, it was two months ago. Well, somewhere between the course in embedded systems for my students, a delegation to Silicon Valley, and all kinds of challenges, time seemed to slip through my fingers.
Nevertheless, I would like to highlight an article from our colleagues who specialize in defect prediction and systematic reviews. The article describes how companies use defect prediction models and when they do it.
It’s a nice Sunday read for those of you who are interested in the topic. It is a good source of best practices as well as a solid place to look for datasets for defect prediction.
So, this week’s post is my reflection on the seminar that we hosted last week (the recording is above). It was an eye-opener for me in a few aspects.
The first was the question of ownership. Since AI is not a legal subject, it cannot really own anything. I know, AI and computational models are not the same, but for the sake of the argument let’s assume that they are. At the end of the day, it is still a human being who presses the button and generates new source code or comments or what have you. So, the responsibility is still very much on us when we use these tools.
The second was the question about the community and why we have open-source software. We certainly do not put our source code out in the open for someone else to profit from it. Attribution and recognition are very important (if not the most important) aspects of any open-source community. So, using that code to create commercial models requires at least some attribution. Why not show which code was used to train these models and show how good these communities really are?
Finally, my main point still stands – we should use these models to become better. They make us so much more productive that we should not go back to the old ways of writing software. Providing suggestions and ideas to programmers can make our software better, shipped faster and potentially more reliable.
However, we need to make sure that we change the way we attribute the software. I myself will start adding “co-created by Github Co-pilot and the OSS communities” to my work when I use the tool. Maybe you can do that too? At least to give some attribution back to our countless colleagues who deserve it…
I think that almost everyone in the field of SE has heard about the new bot – ChatGPT from OpenAI. The bot is capable of writing programs, holding conversations and even creating virtual machines.
So, this year I wanted to check what ChatGPT can do. Here is my conversation with it:
M: What are the most important machine learning papers from 2022?
C: I’m sorry, but I don’t have access to papers or other information from 2022 because my knowledge was cut off in 2021. I can tell you about some important machine learning papers from the past, or provide general information about machine learning and how it has evolved over time. Is there something specific you would like to know about machine learning?
M: Ok, how about the most important papers from 2021?
C: I’m sorry, but I am not able to browse the internet and do not have access to information about specific papers or other developments that have occurred since my knowledge cutoff in 2021. However, there are many resources available online that can help you learn about the latest developments in the field of machine learning. Some places you might look for information about the most important machine learning papers from 2021 include scientific journals and conference proceedings, online news sources, and blogs or websites that cover developments in the field of artificial intelligence and machine learning. Additionally, you can check out the proceedings of major machine learning conferences, such as the International Conference on Machine Learning (ICML) and the Conference on Neural Information Processing Systems (NeurIPS), which are typically held each year and showcase the latest research in the field.
M: What is your prediction for machine learning field for 2023?
C: One trend that has been especially prominent in the field of machine learning in recent years is the increasing use of deep learning techniques, which have led to significant advances in a wide range of applications, including image and speech recognition, natural language processing, and autonomous systems. It is likely that this trend will continue in the coming years, as researchers continue to develop and refine deep learning algorithms and apply them to new and challenging problems.
Another trend that is likely to continue in the field of machine learning is the increasing importance of interpretability and explainability. As machine learning models become more sophisticated and are applied to more complex and sensitive tasks, it will be important to understand how they make their predictions and decisions, and to be able to explain these to users and stakeholders. Researchers and practitioners will therefore continue to focus on developing techniques for interpreting and explaining the behavior of machine learning models.
Finally, the field of machine learning will continue to be driven by the availability of large amounts of data and the increasing power and efficiency of computing hardware. As data continues to be generated at an ever-increasing rate, and as hardware capabilities continue to improve, it is likely that machine learning techniques will continue to advance and be applied to a wide range of problems and domains.
M: Ok, then write a program that draws a Christmas tree.
C:
Here is a Python program that uses ASCII art to draw a simple Christmas tree:
This code will output the following tree to the console:
   /\
  /  \
 /    \
/      \
--------
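For reference, a minimal Python program along these lines prints exactly that tree (this is my own reconstruction, not necessarily the code the bot produced):

```python
# My reconstruction of a minimal tree-printing snippet (not necessarily the bot's exact code).
rows = 4
for i in range(rows):
    # Each row gets one less leading space and two more inner spaces than the previous one.
    print(" " * (rows - 1 - i) + "/" + "  " * i + "\\")
print("-" * (2 * rows))  # the base of the tree
```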
You can customize the appearance of the tree by modifying the print statements. For example, you could add more branches to the tree by adding additional print statements, or you could use different characters to create a more detailed or decorative tree.
Here is an example of a more detailed tree that uses a combination of ASCII art and Unicode characters to create a more realistic-looking tree:
Summary: well, the conversation is not very deep, and not great either, which makes me sleep better at night, feeling that we (software engineers) are still needed in 2023. At least for the time being.
Understanding programming languages is an important topic in research on programming language models. I’ve written before that there are ca. 50 programming language models that we can use in software engineering. Ok, not all of them are equivalent and they are specific to their tasks, but they are available, so we can use and customize them.
Now, whether 50 models is a lot or not is debatable. Compared to natural language models this is a small number. Even compared to the number of programming languages this number is not impressive. However, how many languages are used widely – 10-15? Java, C, C++, Python, JavaScript, Rust, Go, and derivatives are the most common ones.
This article is a study done by our colleagues from the department. It’s too long to quote in detail, but there are a few things that I like. First, it’s a good overview of the types of language models:
Tree-based representation: when the program code is seen from the perspective of its abstract syntax tree; an example is the code2vec model: code2vec
Graph-based models: when the program code is seen as a directed graph, e.g., a control flow graph
Although I like this classification, I see that it misses one of the most prominent and most popular ones – the NLP-based models. This is a type of model where the program code is seen as a set of sentences that carry meaning of some sort. It is a derivative of the token-based representation, but it is much more than that. CodeX from OpenAI is an example of such a model.
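To make the distinction concrete, here is a small illustration of my own (not taken from the paper): the same snippet seen as a flat token sequence, which is roughly what the NLP-style models consume, and as an abstract syntax tree, which is what tree-based models like code2vec start from.

```python
# My own illustration of two code representations: token sequence vs. abstract syntax tree.
import ast
import io
import tokenize

snippet = "def add(a, b):\n    return a + b\n"

# Token-based (NLP-style) view: the code as a flat sequence of tokens.
tokens = [tok.string
          for tok in tokenize.generate_tokens(io.StringIO(snippet).readline)
          if tok.string.strip()]
print(tokens)  # ['def', 'add', '(', 'a', ',', 'b', ')', ':', 'return', 'a', '+', 'b']

# Tree-based view: the same code as its abstract syntax tree,
# the structure that models like code2vec build their paths from.
print(ast.dump(ast.parse(snippet), indent=2))
```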
Nevertheless, this study provides a very interesting set of examples of models and their applications. I sincerely suggest taking a look at this paper to understand how the models work and where they are used best.
Some of you may not know, but I started my career as a software tester, so I’ve done my share of defect tracking and fixing. Although it was a while ago (well, over 20 years ago to be frank), I still remember a thing or two. I guess it is like riding a bike. One thing that I remember is that we did not really need more tests, but smarter testing.
This paper, nevertheless, proposes a new type of testing – inline testing – which is supposed to replace using printf(…) in code. Instead of printing the values of variables for debugging purposes, we can use the new framework to create small inline tests and execute them. The idea is simple and contributes to the maturity of our discipline.
By using inline tests, we can track the progress of our software development and its quality evolution. Since we can generate reports and use asserts, we could communicate our progress to quality management in a much better way.
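To illustrate the idea (this is my own sketch, not the actual syntax of the framework from the paper): instead of printing an intermediate value, a tiny check is placed right next to the statement it exercises.

```python
# My own sketch of the inline-test idea; not the framework's actual API.
def inline_check(condition: bool, message: str = "") -> None:
    """A minimal stand-in helper: fail loudly right where the code lives."""
    if not condition:
        raise AssertionError(f"inline test failed: {message}")

def slugify(title: str) -> str:
    slug = title.lower().strip().replace(" ", "-")
    # Instead of: print("slug =", slug)
    # we check the statement above against a concrete example, in place.
    inline_check("Hello World".lower().strip().replace(" ", "-") == "hello-world",
                 "slug transformation")
    return slug

print(slugify("Inline Tests Are Neat"))  # -> inline-tests-are-neat
```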
I need to test this framework, especially since it works with Python, my new language of choice…
I’ve written about programming language models before, and it is no secret that I am very much into this topic. I like the way in which software engineering evolves – we become a more mature discipline and our tools become smarter by the hour (at least that’s how it feels).
This paper presents a new language model that is capable of doing code edits, i.e., such things as bug fixes. The model is essentially a transformer with an architecture that has been published before. However, the strength of this model is in the way in which it is trained. It uses so-called edit plans to train the model to change the input code, rather than to complete it.
The difference may not sound like much, but it is significant. The existing models are trained to complete code sequences and therefore they are very good at generating code. However, when given code that does not require any generation, they tend to copy the input sequence to the output sequence. Well, that is not very useful.
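A toy example of my own (not the paper's actual data format) to show what that difference means in practice:

```python
# Toy illustration (not the paper's data format) of "edit" versus "complete".
# A completion model would simply continue the text of `before`;
# an edit-style training pair asks the model to transform `before` into `after`.
before = """def mean(xs):
    return sum(xs) / len(xs)  # crashes on an empty list
"""

after = """def mean(xs):
    if not xs:
        return 0.0
    return sum(xs) / len(xs)
"""

edit_example = {
    "before": before,
    "instruction": "guard against an empty input list",
    "after": after,
}
print(edit_example["after"])
```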
Thanks to this new way of training, the model is able to edit code, remove defects, address review comments and so on. Yes, address review comments, this is not a joke. I sincerely believe that we can use this in practice in our tools one day.