Sketches to models…

Image by 127071 from Pixabay

It’s been a while since I worked with models and I looked a bit at how things have evolved. As I remember, one of the major problems with modelling was one of its broken promises – simplicity.

The whole idea with modelling was to be able to sketch things, discuss candidate solutions and then transfer them to paper. In practice, however, it never worked like that: the sheer process of transferring a solution from the whiteboard to a computer took time. Maybe even so much time that it was not really worth the effort of making informal sketches.

Now, we have CNNs and all kinds of ML algorithms, so why not use that? This paper studies exactly this.

The paper “SkeMo: Sketch Modeling for Real-Time Model Component Generation” by Alisha Sharma Chapai and Eric J. Rapos presents an approach for automated, real-time generation of model components from sketches. The approach is based on a convolutional neural network that classifies the sketches into model components and is integrated into a web-based model editor with a touch interface. The tool, SkeMo, has been validated both by calculating the accuracy of the classifier (the convolutional neural network) and through a user study with human participants. At the moment, the tool supports classes and their properties (including methods and attributes) and the relationships between them. The prototype also allows updating models via non-sketch interactions. During the evaluation, the classifier performed with an average precision of over 97%. The user study indicated an average accuracy of 94%, with six subjects reaching the maximum accuracy of 100%. This study shows how we can successfully employ machine learning in the process of modeling to make it more natural and agile for the users.

Modelling digital twins…

Image by 652234 from Pixabay

Digital twins are becoming increasingly important. They make it possible to monitor their real twin without costly measurements and without sending technicians to the site where the real twin is located. However, developing them is not easy and is almost a one-off effort for every twin pair.

The paper “A Model-driven Approach for Knowledge-based Engineering of Industrial Digital Twins” presents a new approach to constructing digital twins for factories. Authored by Sushant Vale, Sreedhar Reddy, Sivakumar Subramanian, Subhrojyoti Roy Chaudhuri, Sri Harsha Nistala, Anirudh Deodhar, and Venkataramana Runkana, it introduces a method that enhances efficiency of monitoring and predictive maintenance of industrial plants.

Typically, digital twins are created manually for each plant, which is a labor-intensive process. This paper proposes a model-driven method, structured on three levels of abstraction: the meta-level, plant-type level, and plant-instance level. The meta-level outlines universal structures and vocabulary, the plant-type level focuses on knowledge specific to various plant types, and the plant-instance level details a digital twin for a specific plant. These levels correspond to different user roles: platform builders, plant type experts, and plant experts, respectively. This hierarchical structure enables element reuse across different plants and types, streamlining the digital twin development process. The effectiveness of this method is exemplified in a case study of an iron ore sinter plant.
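To make the three levels a bit more tangible, here is a minimal sketch in Python. It is only my illustration, not the authors' metamodel: the class names `MetaElement`, `PlantType` and `PlantInstance`, the equipment kinds and the plant names are all assumptions made up for the example; only the two KPI names come from the paper's case study.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetaElement:
    """Meta-level: generic vocabulary defined by platform builders (hypothetical)."""
    name: str


@dataclass
class PlantType:
    """Plant-type level: knowledge shared by all plants of one kind."""
    name: str
    equipment_kinds: dict[str, MetaElement] = field(default_factory=dict)
    kpis: list[str] = field(default_factory=list)


@dataclass
class PlantInstance:
    """Plant-instance level: one concrete plant, reusing its type's knowledge."""
    name: str
    plant_type: PlantType
    equipment: dict[str, str] = field(default_factory=dict)  # tag -> equipment kind

    def add_equipment(self, tag: str, kind: str) -> None:
        # Only kinds known at the plant-type level may be instantiated.
        if kind not in self.plant_type.equipment_kinds:
            raise ValueError(f"{kind!r} is not defined for {self.plant_type.name}")
        self.equipment[tag] = kind


# Invented example data: equipment kinds and plant names are placeholders.
EQUIPMENT = MetaElement("Equipment")
sinter_type = PlantType(
    name="SinterPlant",
    equipment_kinds={"IgnitionFurnace": EQUIPMENT, "SinterStrand": EQUIPMENT},
    kpis=["sinter throughput", "reduction degradation index"],
)
plant_a = PlantInstance("Plant A", sinter_type)
plant_a.add_equipment("IF-01", "IgnitionFurnace")
plant_b = PlantInstance("Plant B", sinter_type)  # reuses the same type-level knowledge
```

The point of the sketch is the reuse: both plant instances share one `PlantType` object, so knowledge captured once by a plant type expert does not have to be re-entered for every plant.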

The process begins with establishing high-level Key Performance Indicators (KPIs) such as sinter throughput or reduction degradation index. These KPIs are then translated into a mathematical model, followed by a causal graph, and finally, a digital twin design/model. Remarkably, this approach significantly reduced the time required to formulate the quality optimization problem to approximately one week, down from two months, marking a substantial improvement in efficiency. In conclusion, this paper demonstrates the substantial advantages of a multi-level modeling approach in designing digital twins, offering a more efficient, standardized, and scalable solution.

Generating documentation from notebooks

Understanding code is the same regardless of whether it is in a Jupyter notebook or in another editor. Comments and documentation are the key. I try to teach that to my students and, some of them at least, appreciate it. Here is a paper that can change this for the better without adding more effort.

This paper introduces a machine learning pipeline that automatically generates documentation for Python code cells in data science notebooks. Here’s a more casual summary of what they did and found:

  1. The Solution – Cell2Doc: The team whipped up a new tool called Cell2Doc. It’s a smart pipeline that breaks down code cells into logical parts and documents each bit separately. This way, it gets more details and can explain complex code better than other tools.
  2. How It Works: Cell2Doc has two main parts. First, a Code Segmentation Model (CoSEG) chops up the code into chunks that make sense on their own. Then, a Code Documentation Model (CoDoc) writes up explanations for each chunk. In the end, you get a full set of docs that covers everything the code is doing.
  3. The Cool Part: This isn’t just about slapping together existing models. Cell2Doc actually makes them better at writing docs for code. It’s like giving a turbo boost to the models so they can catch more details and write clearer explanations.
  4. Testing It Out: They didn’t just build this and hope for the best. They tested it with real data from Kaggle, a place where data scientists hang out and compete. They even made a new dataset for this kind of task because the old ones weren’t cutting it.
  5. The Results: When they put Cell2Doc to the test, it did a bang-up job. It scored way higher on automated tests than other methods, and real humans liked it better too. It was better at being correct, informative, and easy to read.
  6. Sharing Is Caring: They’re not keeping this to themselves. They’ve shared Cell2Doc so anyone can use it to make their code easier to understand.
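The segment-then-document idea can be sketched in a few lines of Python. This is a toy under loud assumptions and not Cell2Doc itself: `segment_cell` simply splits a cell into top-level statements with the standard `ast` module (the real CoSEG is a learned model), and `document_segment` is a crude rule-based stand-in for CoDoc. All function names and the example cell are made up.

```python
import ast


def segment_cell(source: str) -> list[str]:
    """Stand-in for a segmentation model: one segment per top-level statement."""
    tree = ast.parse(source)
    return [ast.get_source_segment(source, node) for node in tree.body]


def document_segment(segment: str) -> str:
    """Stand-in for a documentation model: a crude rule-based description."""
    node = ast.parse(segment).body[0]
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        return "Import required libraries."
    if isinstance(node, ast.Assign):
        targets = ", ".join(ast.unparse(t) for t in node.targets)
        return f"Compute and store {targets}."
    if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
        return f"Call {ast.unparse(node.value.func)}."
    return "Run a statement."


def document_cell(source: str) -> list[tuple[str, str]]:
    """Pair every segment with its own documentation line."""
    return [(document_segment(seg), seg) for seg in segment_cell(source)]


# Invented notebook cell; it is only parsed, never executed.
cell = (
    "import pandas as pd\n"
    "df = pd.read_csv('train.csv')\n"
    "df = df.dropna()\n"
    "print(df.shape)\n"
)
for doc, code in document_cell(cell):
    print(f"# {doc}\n{code}")
```

Even this naive version shows why documenting each chunk separately helps: every statement gets its own comment instead of one vague summary for the whole cell.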

In a nutshell, Cell2Doc is like a super-smart assistant that takes the headache out of writing docs for your code. It understands the code deeply and explains it in a way that’s easy to get, which is pretty awesome for keeping things clear and making sure your work can be used by others.

I am considering giving this tool to my students in the spring, when they learn how to program embedded systems in C.

Log files and anomalies, once again…

I wrote about log files a while back, and I think I’m getting hooked on the topic. It is actually quite interesting to see how they can be used in practice. So, here is one more paper from the ASE 2023 conference.

This paper presents a new way to create log data that can help spot problems in software systems. Here’s a more casual rundown of what the paper is about:

  1. The Problem: Keeping software reliable is tough, especially when you don’t have enough good examples of system logs to train your anomaly detection tools. The logs you can get your hands on either have privacy issues or are too simple and don’t reflect real-world complexity.
  2. The Solution – AutoLog: The researchers have cooked up AutoLog, a clever method that doesn’t need to run the actual system to generate logs. It’s like a simulation game that creates realistic log data by analyzing the code of an application.
  3. How AutoLog Rolls: It works in three steps. First, it digs through the code to find all the spots where logs might happen. Then, it figures out which parts of the code could lead to those logs. Finally, it walks through these paths, creating log data that looks like it came from a real running system.
  4. The Cool Bits: AutoLog can make a lot more log events than other methods, and it does it super fast. It’s like having a log event factory that can churn out thousands of messages a minute.
  5. Flexibility for the Win: You can tweak AutoLog to simulate different scenarios, like changing the amount of data, the mix of normal and weird events, or focusing on specific parts of the system.
  6. Real-World Ready: When tested on 50 Java projects, AutoLog’s logs helped anomaly detection tools perform a bit better. It’s like giving a detective better clues to solve a case.
  7. Sharing is Caring: The team has shared AutoLog for others to use, hoping it’ll help make software more reliable by giving developers better tools for testing and benchmarking.
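The three steps can be mimicked on a toy example. The sketch below is not AutoLog (which analyzes real Java code); it assumes a made-up program representation where each method is just a list of log statements and calls, with no recursion, and all method names and messages are invented.

```python
# Toy "program": each method is a list of steps, either a log statement or a call.
# The method names and messages are invented for the example.
PROGRAM = {
    "main": [("call", "connect"), ("call", "serve")],
    "connect": [("log", "connecting to db"), ("call", "helper")],
    "serve": [("log", "request served")],
    "helper": [],  # contains and reaches no logging statement
}


def logging_methods(program):
    """Step 1: methods that directly contain a logging statement."""
    return {m for m, steps in program.items() if any(k == "log" for k, _ in steps)}


def log_related_methods(program):
    """Step 2: methods that log themselves or can reach a logging method."""
    related = logging_methods(program)
    changed = True
    while changed:
        changed = False
        for method, steps in program.items():
            if method not in related and any(
                k == "call" and target in related for k, target in steps
            ):
                related.add(method)
                changed = True
    return related


def walk(program, method, related):
    """Step 3: follow the execution path and emit the log events in order."""
    events = []
    for kind, value in program[method]:
        if kind == "log":
            events.append(value)
        elif value in related:  # skip calls that can never produce a log
            events.extend(walk(program, value, related))
    return events


related = log_related_methods(PROGRAM)
print(walk(PROGRAM, "main", related))
```

No system is ever run here: the log sequence comes purely from walking the structure of the code, which is the core idea behind generating logs without executing the application.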

In short, AutoLog is a new tool for creating fake but realistic logs that can help find bugs in software without the need to mess with privacy or oversimplified data. It’s a game-changer for making sure software runs smoothly.

I need to take this tool for a spin during the upcoming break.

Vulnerability detection – addressing the #1 problem

One of the major issues with vulnerability detection in source code is unbalanced data. Although there are many known vulnerabilities, examples of them are rather scarce. SonarQube, as a tool, can detect only ca. 30 vulnerabilities out of over 200,000 existing ones. This paper is about making the job of finding security holes in software code easier and more reliable, even when there are not many clear-cut examples of what’s bad and what’s not. The main points of the paper are:

  1. The PILOT model: The researchers came up with a smart model named PILOT that only needs examples of risky code and a bunch of other code where we don’t know if it’s safe or risky. It’s like having a detective who’s really good at spotting something fishy with just a few clues.
  2. How PILOT Works: PILOT has two cool tricks up its sleeve. First, it’s got a keen eye for picking out which pieces of the “unknown” code are probably safe. Second, it learns to tell the difference between safe and risky code in a way that’s not thrown off by a few mistakes in the data.
  3. The Proof is in the Pudding: They tested PILOT with real-world data and found it did a better job than other methods, even when those methods had more information to go on. PILOT was also pretty good at catching mistakes in the data where something was labeled as safe but was actually risky.
  4. Why It Matters: This approach is a game-changer because it means you can still get good at finding security risks even if you don’t have a ton of well-labeled data. It’s like being able to train a super sniffer dog with only a few scents rather than the whole scent library.
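Learning from only risky examples plus unlabeled code is usually called positive-unlabeled learning. The sketch below shows the general two-step flavour with the Python standard library only. It is not PILOT: the distance-based selection of "probably safe" samples, the centroid classifier and the 2-D feature vectors are all simplifying assumptions of mine.

```python
import math


def centroid(points):
    """Mean of a list of equally sized feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def pick_reliable_negatives(positives, unlabeled, keep=0.5):
    """Trick 1 (simplified): treat the unlabeled samples farthest from the positives as safe."""
    center = centroid(positives)
    ranked = sorted(unlabeled, key=lambda x: math.dist(x, center), reverse=True)
    return ranked[: max(1, int(len(ranked) * keep))]


def train(positives, negatives):
    """Trick 2 (simplified): class centroids dampen the effect of a few mislabeled samples."""
    return centroid(positives), centroid(negatives)


def is_risky(model, x):
    """Classify a sample by its nearest class centroid."""
    pos_center, neg_center = model
    return math.dist(x, pos_center) < math.dist(x, neg_center)


# Hypothetical 2-D feature vectors standing in for code embeddings.
risky = [[0.9, 0.8], [0.8, 0.9], [1.0, 1.0]]
unknown = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.0], [0.85, 0.9]]  # the last one is actually risky

model = train(risky, pick_reliable_negatives(risky, unknown))
print([is_risky(model, x) for x in unknown])
```

Note that the hidden risky sample among the unlabeled ones is never selected as a reliable negative, and the final classifier flags it as risky, which is the kind of mislabel-catching behaviour described above.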

In essence, PILOT is like a detective that doesn’t need the whole story to solve the case. It can make do with just the good bits and still crack the code on what’s a security risk and what’s not.

Libraries and security

I often use Python because of its large ecosystem of libraries. Thanks to these libraries, I do not have to focus on the details of the implementation, but can focus on the task at hand. However, not all libraries are good, and therefore this paper captured my attention. The study aims to understand the characteristics and lifecycle of malicious code in PyPI by building an automated data collection framework and analyzing a dataset of malicious package files.

Key findings and contributions of the paper include:

  1. Empirical Analysis: The authors conducted an empirical study to understand the characteristics and lifecycle of malicious code in the PyPI ecosystem.
  2. Automated Data Collection: They built an automated data collection framework to gather a high-quality dataset of malicious code from PyPI mirrors and other sources.
  3. Dataset Construction: The dataset includes 4,669 malicious package files, making it one of the largest publicly available datasets of PyPI malicious packages.
  4. Classification Framework: An automated classification framework was developed to categorize the collected malicious code into different types based on their behavior characteristics.
  5. Malicious Behavior: The research found that over 50% of the malicious code exhibits multiple malicious behaviors, with information stealing and command execution being particularly prevalent.
  6. Novel Attack Vectors and Anti-Detection Techniques: The study observed several novel attack vectors and anti-detection techniques used by malicious code.
  7. Impact on End-User Projects: It was revealed that 74.81% of all malicious packages successfully entered end-user projects through source code installation, increasing security risks.
  8. Persistence in Mirror Servers: Many reported malicious packages persist in PyPI mirror servers globally, with over 72% remaining for an extended period after being discovered.
  9. Lifecycle Portrait: The paper sketches a portrait of the malicious code lifecycle in the PyPI ecosystem, reflecting the characteristics of malicious code at different stages.
  10. Suggested Mitigations: The authors present some suggested mitigations to improve the security of the Python open-source ecosystem.

The study is significant as it provides a systematic understanding of the propagation patterns, influencing factors, and potential hazards of malicious code in the PyPI ecosystem. It also offers a foundation for developing more efficient detection methods and improving the security practices within the software supply chain.

Understanding log files…

Debugging and testing often require analyzing log files. This means that we need to read many lines of information that can be useful but are, at the same time, difficult to parse. Therefore, this paper is of interest to those who must read these files once in a while.

This paper investigates the readability of log messages in software logging. The authors conducted a comprehensive study involving interviews with industrial practitioners, manual investigation of log messages in open-source systems, online surveys, and the exploration of automatic classification of log message readability using machine learning.

Key findings and contributions of the paper include:

  1. Practitioners’ Expectations (RQ1): Through interviews, the authors identified three aspects related to log message readability: Structure, Information, and Wording. They also derived specific practices to improve each aspect. Survey participants acknowledged the importance of these aspects, with Information being considered the most critical.
  2. Readability in Open Source Systems (RQ2): A manual investigation of log messages from nine large-scale open-source systems revealed that 38.1% of log messages have inadequate readability, particularly in the aspect of Information.
  3. Automatic Classification (RQ3): The study explored the use of deep learning and machine learning models to automatically classify the readability of log messages. The models achieved a balanced accuracy above 80% on average, indicating their effectiveness.
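A short note on the metric: balanced accuracy is the mean of the per-class recalls, which matters here because readable and unreadable messages are not equally frequent. The sketch below computes it by hand; the labels are invented and have nothing to do with the paper's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        indices = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


# Invented labels: 1 = readable log message, 0 = inadequate readability.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

Plain accuracy looks good because the majority class dominates, while balanced accuracy exposes that half of the minority class was missed.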

The paper’s contributions are significant as it is one of the first studies to investigate log message readability through interviews with industrial practitioners. It highlights the prevalence of inadequate readability in log messages within large-scale open-source systems and demonstrates the potential of machine learning models to classify log message readability automatically.

The study provides systematic comprehension of log message readability and offers empirically-derived guidelines to improve developers’ logging practices. It also opens avenues for future research to establish standards for composing log messages.

The authors conclude that their study sheds light on the importance of log message readability and provides a foundation for future work to improve logging practices in software development.

Ethics in data mining

Image by Tumisu from Pixabay

A lot of software engineering research studies use open source data and mine software repositories. It’s a common practice since it allows us to test our hypotheses before asking for precious resources from our collaborating companies. By mining open source data we can also learn whether our study makes sense; we can see it as a pilot study of sorts.

Mining software repositories has evolved into a popular activity since we got access to repositories like GitHub. There are even guidelines for assessing this kind of study, and we have regulations on what we can do with open source data. These can take the form of a license, a law (like GDPR or the CCPA) or the need to ask an ethical board for approval. However, there is also common sense: not everything that is legal is appropriate or ethical. We always need to ensure that no individual comes to any harm as a result of our actions.

In the article that I want to bring up today, the authors discuss ethical frameworks for software engineering studies based on open source repositories. We need to make sure that we follow four principles:

  1. Respect for persons, which stresses the need for approval and consent.
  2. Beneficence, which means that we need to minimize the harm and maximize the benefit.
  3. Justice, which means that we need to consider each individual equally.
  4. Respect for law and public interest, which entails conducting due diligence on which data we can use and in which way.

The most interesting part of this article is the analysis of different cases of mining software repositories, for example, analyzing the code, reviews, commit messages and other types of data in the repositories.

I recommend this article to everyone who is considering mining software repositories.

What the GPT technology really changes in SE

Image source: pixabay

GPT technology, exemplified by GitHub Copilot and the like, changes software engineering from the ground up. There is no doubt that the technology places a new tool in our engineering shed. It allows us to create software with a completely different set-up than what we are used to.

Now, what it really changes is only a few things, but these are very big ones.

  1. Programmers -> designers and architects. GPT can write source code like no other tool on the market. And it only gets better at this. A quick glimpse at the GitHub Next website gives us a good understanding that this team has only just got started. This changes everything we know about engineering software. Bad programmers will disappear over time. Good software designers, architects and software engineers will take their place. They will be fewer in number, but better in quality.
  2. Software development -> software engineering. Designers will no longer get stuck in solving a small bit of a puzzle. GPT will do it for them. Instead of thinking how to write a test case, the designers will think how to test the software in the best possible way. They will focus on the engineering part of the software engineering. Something that I’m teaching my students from day one.
  3. Consultancy -> knowledge hubs. Since programming will become easier and more approachable, we will need people who know how to solve a problem, not how to write a program. This big chunk of business of the consultancy companies will disappear. The consultancy companies will specialize in their domains and in problem-solving.

There will also be other things that will happen. Requirements will not be the same as they are. Testing will be different, architecting will be smarter and management more optimal. Knowledge will be more valued and critical thinking will be needed even more.

Well, this is my end of the academic year blog post. More to come after the summer. Stay safe!

Continuous deployment in systems of systems…

Continuous deployment in software-intensive system-of-systems – ScienceDirect

Interestingly, this is a paper from colleagues of ours from the department. The paper presents how one company – Ericsson – works with continuous deployment of their large software system in 3G RAN (Radio Access Networks). The highlights from the article are as follows:

  • New software field testing and validation activities become continuous.
  • Software deployment should be orchestrated between the constituent systems.
  • A pilot customer to partner with is key for success.
  • Companywide awareness and top management support are important.
  • Documentation and active monitoring are critical for continuous deployment.

I like this paper because it presents a practical approach and a good set of practices that can be taken up by other companies.