December 2023 – SE metrics (Software Engineering)

Generating documentation from notebooks

https://github.com/jyothivedurada/jyothivedurada.github.io/blob/main/papers/Cell2Doc.pdf

Understanding code is the same whether it is in a Jupyter notebook or in another editor. Comments and documentation are the key. I try to teach that to my students and some of them, at least, appreciate it. Here is a paper that can change this for the better without adding more effort.

This paper introduces a machine learning pipeline that automatically generates documentation for Python code cells in data science notebooks. Here’s a more casual summary of what they did and found:

The Solution – Cell2Doc: The team whipped up a new tool called Cell2Doc. It’s a smart pipeline that breaks down code cells into logical parts and documents each bit separately. This way, it gets more details and can explain complex code better than other tools.
How It Works: Cell2Doc has two main parts. First, a Code Segmentation Model (CoSEG) chops up the code into chunks that make sense on their own. Then, a Code Documentation Model (CoDoc) writes up explanations for each chunk. In the end, you get a full set of docs that covers everything the code is doing.
The Cool Part: This isn’t just about slapping together existing models. Cell2Doc actually makes them better at writing docs for code. It’s like giving a turbo boost to the models so they can catch more details and write clearer explanations.
Testing It Out: They didn’t just build this and hope for the best. They tested it with real data from Kaggle, a place where data scientists hang out and compete. They even made a new dataset for this kind of task because the old ones weren’t cutting it.
The Results: When they put Cell2Doc to the test, it did a bang-up job. It scored way higher on automated tests than other methods, and real humans liked it better too. It was better at being correct, informative, and easy to read.
Sharing Is Caring: They’re not keeping this to themselves. They’ve shared Cell2Doc so anyone can use it to make their code easier to understand.
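
To make the two-stage idea concrete, here is a minimal sketch in Python. It is not the paper’s code: CoSEG and CoDoc are learned models, while the hypothetical functions segment_cell and document_segment below are simple rule-based stand-ins that I made up for illustration. The only point is the shape of the pipeline: split the cell first, then document each piece on its own.

```python
import ast


def segment_cell(source: str) -> list[str]:
    """Stand-in for CoSEG (assumption): one segment per top-level statement,
    with consecutive imports merged into a single segment."""
    tree = ast.parse(source)
    segments: list[str] = []
    prev_was_import = False
    for node in tree.body:
        text = ast.get_source_segment(source, node)
        is_import = isinstance(node, (ast.Import, ast.ImportFrom))
        if is_import and prev_was_import:
            segments[-1] += "\n" + text
        else:
            segments.append(text)
        prev_was_import = is_import
    return segments


def document_segment(segment: str) -> str:
    """Stand-in for CoDoc (assumption): a canned sentence chosen from the
    type of the segment's first statement."""
    node = ast.parse(segment).body[0]
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        return "Import the required libraries."
    if isinstance(node, ast.Assign):
        targets = ", ".join(ast.unparse(t) for t in node.targets)
        return f"Compute and store {targets}."
    if isinstance(node, ast.FunctionDef):
        return f"Define the function {node.name}."
    return "Run a statement."


def document_cell(source: str) -> list[tuple[str, str]]:
    """Pipeline: segment the cell, then document every segment separately."""
    return [(seg, document_segment(seg)) for seg in segment_cell(source)]


# Made-up notebook cell, only for demonstration.
cell = (
    "import pandas as pd\n"
    "import numpy as np\n"
    "df = pd.read_csv('train.csv')\n"
    "df = df.dropna()\n"
)

for code, doc in document_cell(cell):
    print(f"# {doc}\n{code}\n")
```

The cell is only parsed, never executed, so pandas does not need to be installed to try this out. A real system would replace both stand-ins with trained models, which is where the actual contribution of the paper lies.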

In a nutshell, Cell2Doc is like a super-smart assistant that takes the headache out of writing docs for your code. It understands the code deeply and explains it in a way that’s easy to get, which is pretty awesome for keeping things clear and making sure your work can be used by others.

I am considering giving this tool to my students in the spring, when they learn how to program embedded systems in C.

Log files and anomalies, once again…

https://arxiv.org/pdf/2308.09324.pdf

I wrote about log files a while back, but I think I’m getting hooked on the topic. It is actually quite interesting to see how to use them in practice. So, here is one more paper from the ASE 2023 conference.

This paper presents a new way to create log data that can help spot problems in software systems. Here’s a more casual rundown of what the paper is about:

The Problem: Keeping software reliable is tough, especially when you don’t have enough good examples of system logs to train your anomaly detection tools. The logs you can get your hands on either have privacy issues or are too simple and don’t reflect real-world complexity.
The Solution – AutoLog: The researchers have cooked up AutoLog, a clever method that doesn’t need to run the actual system to generate logs. It’s like a simulation game that creates realistic log data by analyzing the code of an application.
How AutoLog Rolls: It works in three steps. First, it digs through the code to find all the spots where logs might happen. Then, it figures out which parts of the code could lead to those logs. Finally, it walks through these paths, creating log data that looks like it came from a real running system.
The Cool Bits: AutoLog can make a lot more log events than other methods, and it does it super fast. It’s like having a log event factory that can churn out thousands of messages a minute.
Flexibility for the Win: You can tweak AutoLog to simulate different scenarios, like changing the amount of data, the mix of normal and weird events, or focusing on specific parts of the system.
Real-World Ready: When tested on 50 Java projects, AutoLog’s logs helped anomaly detection tools perform a bit better. It’s like giving a detective better clues to solve a case.
Sharing is Caring: The team has shared AutoLog for others to use, hoping it’ll help make software more reliable by giving developers better tools for testing and benchmarking.
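
The three steps can be illustrated with a toy example in Python. This is only a sketch under heavy assumptions: AutoLog analyzes real Java code, while here the call graph, the log templates, and the function names (log_related, walk) are all hand-written and hypothetical. The sketch also assumes the call graph has no cycles.

```python
import random

# Hypothetical call graph: method -> methods it may call (assumed acyclic).
CALL_GRAPH = {
    "main": ["load_config", "serve"],
    "load_config": [],
    "serve": ["handle_request", "shutdown"],
    "handle_request": ["query_db"],
    "query_db": [],
    "shutdown": [],
    "helper": [],
}

# Step 1 (assumed given here): the methods that contain a logging statement.
LOG_TEMPLATES = {
    "load_config": "INFO config loaded",
    "handle_request": "INFO request received",
    "query_db": "ERROR database timeout",
    "shutdown": "INFO server stopped",
}


def log_related(graph, templates):
    """Step 2: keep methods that log or can reach a method that logs."""
    related = set(templates)
    changed = True
    while changed:
        changed = False
        for method, callees in graph.items():
            if method not in related and any(c in related for c in callees):
                related.add(method)
                changed = True
    return related


def walk(graph, templates, related, method, rng, p_call=0.8, out=None):
    """Step 3: walk the pruned graph and emit the log lines met on the way.
    Each call is followed with probability p_call, so runs differ."""
    if out is None:
        out = []
    if method in templates:
        out.append(templates[method])
    for callee in graph[method]:
        if callee in related and rng.random() < p_call:
            walk(graph, templates, related, callee, rng, p_call, out)
    return out


related = log_related(CALL_GRAPH, LOG_TEMPLATES)
rng = random.Random(0)
for run in range(3):
    print(run, walk(CALL_GRAPH, LOG_TEMPLATES, related, "main", rng))
```

Changing p_call, or giving different probabilities to different branches, is a crude way to mimic the knobs described above, such as the mix of normal and unusual events.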

In short, AutoLog is a new tool for creating fake but realistic logs that can help find bugs in software without the need to mess with privacy or oversimplified data. It’s a game-changer for making sure software runs smoothly.

I need to take this tool for a spin during the upcoming break.

Vulnerability detection – addressing the #1 problem

https://arxiv.org/pdf/2308.10523.pdf

One of the major issues with vulnerability detection in source code is unbalanced data. Although there are a lot of known vulnerabilities, examples of them are rather scarce. SonarQube, as a tool, can detect only ca. 30 vulnerabilities out of over 200,000 existing ones. This paper is about making the job of finding security holes in software code easier and more reliable, even when there are not many clear-cut examples of what’s bad and what’s not. The main parts of the paper are:

The PILOT model: The researchers came up with a smart model named PILOT that only needs examples of risky code and a bunch of other code where we don’t know if it’s safe or risky. It’s like having a detective who’s really good at spotting something fishy with just a few clues.
How PILOT Works: PILOT has two cool tricks up its sleeve. First, it’s got a keen eye for picking out which pieces of the “unknown” code are probably safe. Second, it learns to tell the difference between safe and risky code in a way that’s not thrown off by a few mistakes in the data.
The Proof is in the Pudding: They tested PILOT with real-world data and found it did a better job than other methods, even when those methods had more information to go on. PILOT was also pretty good at catching mistakes in the data where something was labeled as safe but was actually risky.
Why It Matters: This approach is a game-changer because it means you can still get good at finding security risks even if you don’t have a ton of well-labeled data. It’s like being able to train a super sniffer dog with only a few scents rather than the whole scent library.
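
The general idea of learning from risky examples plus unlabeled code (positive-unlabeled learning) can be sketched in a few lines of Python. This is a generic toy and not PILOT itself: the two-dimensional feature vectors are made up, picking the unlabeled points farthest from the risky centroid is just one simple selection rule that I assume for illustration, and the nearest-centroid classifier in the second step does not reproduce the noise-robust training from the paper.

```python
import math


def centroid(points):
    """Mean point of a list of equally sized tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def select_reliable_negatives(positives, unlabeled, fraction=0.75):
    """Step 1 (assumed rule): treat the unlabeled samples that lie farthest
    from the centroid of the risky samples as probably safe."""
    c = centroid(positives)
    ranked = sorted(unlabeled, key=lambda p: math.dist(p, c), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def train(positives, negatives):
    """Step 2 (simplified): a nearest-centroid model, one centroid per class."""
    return centroid(positives), centroid(negatives)


def predict(model, point):
    pos_c, neg_c = model
    return "risky" if math.dist(point, pos_c) < math.dist(point, neg_c) else "safe"


# Made-up feature vectors; a real system would use learned code embeddings.
positives = [(0.9, 0.8), (0.8, 0.9), (1.0, 1.0)]
unlabeled = [(0.1, 0.2), (0.2, 0.1), (0.85, 0.9), (0.0, 0.1)]

negatives = select_reliable_negatives(positives, unlabeled)
model = train(positives, negatives)
for sample in [(0.95, 0.9), (0.1, 0.1)]:
    print(sample, predict(model, sample))
```

Note that the unlabeled point (0.85, 0.9) sits close to the risky samples, so it is not picked as a negative. That is the whole trick: unlabeled does not mean safe.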

In essence, PILOT is like a detective that doesn’t need the whole story to solve the case. It can make do with just the good bits and still crack the code on what’s a security risk and what’s not.