The speed of light and laws of software engineering…

Image by ParallelVision from Pixabay

While on vacation, I managed to watch a number of sci-fi movies that I had wanted to see but had no time for during the academic year. This got me thinking about certain laws of physics and the laws of software engineering. I think there are many similarities, so let me start by considering the speed of light.

First of all, what we know about the speed of light is that it is the fastest speed we know of, and Einstein’s theory of relativity says that we cannot travel faster than it. Even if we could, traveling at such a speed would be tremendously difficult.

  • How would we know where we are traveling if we cannot see ahead of us (we would be moving, literally, as fast as the light we see by)? So, the travel would be very fast and, probably, very short.
  • If we could somehow see where we are going, or detect an obstacle (e.g. by knowing its predicted position based on prior observations), how could we steer? We would be going so fast that, for the majority of the travel time, we would be moving in straight lines. These straight lines would be similar to either the hyperjumps from Star Wars or the clicks from Guardians of the Galaxy.
  • If we are literally the fastest objects, how can others see us and avoid us? Is it possible to avoid light? No, it is not. This means that we would be super-chaotic for everyone else.

Therefore, I think that, even if we could, traveling at the speed of light is probably not the best idea – at least not as we conceive of travel today, which may change as we change.

How does this relate to the laws of software engineering, then? Well, I would start with the laws of complexity.

For a while now, I’ve been broadcasting the opinion that the complexity of software cannot be reduced; it can only be hidden. Complex problems need complex software, and complex software cannot be simple. If our algorithms have many conditions, we cannot take them away; we can hide them in functions, but never get rid of them. That’s the first parallel – we cannot travel faster than the speed of light.
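To make this concrete, here is a small, purely hypothetical Python sketch (not from any real code base): extracting the conditions into a function makes the call site look simple, but every condition survives inside the function.

```python
# Hypothetical example: a discount rule with many conditions.
# Extracting them into is_eligible_for_discount() simplifies the
# call site, but the conditions are only hidden, not gone.

def is_eligible_for_discount(customer):
    # All of the original conditions still live here.
    if customer["age"] >= 65:
        return True
    if customer["is_student"] and customer["orders"] < 5:
        return True
    if customer["loyalty_years"] > 10 and not customer["has_debt"]:
        return True
    return False

def checkout(customer, total):
    # The complexity is hidden behind one call, not removed.
    if is_eligible_for_discount(customer):
        total *= 0.9
    return total

print(checkout({"age": 70, "is_student": False, "orders": 0,
                "loyalty_years": 2, "has_debt": False}, 100.0))  # 90.0
```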

We can hide complexity, and thus make the program/software easier to understand and maintain, but the better we hide it, the harder it becomes to predict its behavior. Packaging complex algorithms in simple blocks makes it difficult to modify them – or rather, not to make the modifications themselves, but to foresee the consequences of those modifications.

If we simplify the program/algorithm too much, we should expect it to produce erroneous results for some cases – again, complex problems require complex programs. An example of such an issue is approximating continuous functions – since our computers are discrete, there is always some degree of error in such an approximation.
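As a small illustration of that error (my own sketch, not from the argument above): approximating the derivative of a continuous function with a finite difference always leaves a residual, no matter how small the step.

```python
import math

# Approximate the derivative of sin(x) at x = 1.0 with a forward
# difference. The true value is cos(1.0); the discrete approximation
# always carries an error that depends on the step size h.
x = 1.0
for h in (1e-1, 1e-3, 1e-6):
    approx = (math.sin(x + h) - math.sin(x)) / h
    error = abs(approx - math.cos(x))
    print(f"h = {h:g}: error = {error:.2e}")
```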

Finally, interconnectivity and modularity as means of handling complexity have their limits. I do not think we can develop increasingly complex software simply by increasing its size; I believe that will become difficult in the long run. We need to make sure that we have the competence to handle complexity, and we need to be able to make the complexity apparent.

Comparing different security vulnerability detection techniques (article review)

Image by Reimund Bertrams from Pixabay

An Empirical Study of Rule-Based and Learning-Based Approaches for Static Application Security Testing (arxiv.org)

In recent weeks I’ve turned to a specific part of my work, i.e. security vulnerability detection. In many areas, work on security has focused on the entire chain. And that’s a good thing – we need to understand when and where we have a vulnerability. However, that’s not what I can help with – which has never really stopped me before.

So, I was looking for a more programmatic view on security. To be more exact, I wanted to know what we, as software engineers, need to focus on when it comes to cyber security. We can, naturally, measure it, but that’s probably not the only thing. We can analyze libraries from OSS communities to find which ones could be exploited. We can even program in a specific way to minimize the risk of exploitation.

In this paper, the authors compare two different techniques for software vulnerability prediction – static software analysis and vulnerability prediction models. They have identified 12 different findings, of which the following are the most interesting ones:

  • SVP models are generally a bit better when it comes to precision and overall performance.
  • SVP models provide fewer files to inspect as their output, which saves cost.
  • The two approaches lack synergy, and it’s difficult to use them together to increase their performance.

Since they have compared only a few tools, I believe it’s important to do more experiments. It is also important to understand whether it is good or bad to have fewer files to inspect – I mean, one undetected vulnerability can be very costly…

autoML – let’s talk about it…

Image from Pixabay

AutoML – a promise of green pastures, less work, and optimal results. So, is it really like that? In this post I share my view, and my experience from running a first test with it.

First of all, let’s be honest: there is no such thing as a free lunch. In the case of autoML (auto-sklearn), the price tag comes first in the effort, skills and time needed to install it and make it work. The second is the performance… It’s painfully slow compared to your own models, simply because it tests a lot of models here and there. It also takes a lot of time to download and set up.

But, first things first, let me tell you where I started. I used the data from the MicroHRV project ( 3. MicroHRV: Recognizing Rare Events in Microwave Radio Links and Intensive Care Units using Machine Learning – Software Center (software-center.se)). The data is from patients being operated on to remove blood clots from the brain (as dangerous as it may sound, the actual procedure is planned and calm). I wanted to check whether autoML could do better than what we have at the moment.

What we have at the moment (for that particular dataset) is: Accuracy: 0.98, Precision: 0.98, Recall: 0.98 – using a Random Forest classifier. So, this is already very good. For the medical domain, it is actually in a class of its own, given that our previous studies ended up with ca. 0.7 accuracy at best.
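For reference, here is a minimal sketch of that kind of baseline – not the actual MicroHRV pipeline (that data is not public), so a synthetic dataset stands in for the real features and labels:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the MicroHRV features and labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy: ", round(accuracy_score(y_test, y_pred), 2))
print("Precision:", round(precision_score(y_test, y_pred, average="weighted"), 2))
print("Recall:   ", round(recall_score(y_test, y_pred, average="weighted"), 2))
```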

When it comes to installing autoML – if you like Stack Overflow, downgrading, upgrading, compiling, etc., and you run Windows 10, then it’s your heaven. If you run Linux – no problem. Otherwise – stick to manual analyses :)

After two days (and nights) of trying, the best configuration was:

  • WSL – Windows Subsystem for Linux
  • Ubuntu 20, and
  • countless OSS libraries

It takes a while to get it to work; the question is whether the results are good enough…
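Roughly, the run itself looked like the sketch below (the exact settings are from memory, and it reuses the train/test split from the Random Forest sketch above):

```python
import autosklearn.classification

# Give auto-sklearn a three-hour budget and cap each candidate model,
# then let it search - this is where the laptop heat comes from.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=3 * 3600,
    per_run_time_limit=300,
)
automl.fit(X_train, y_train)
y_pred = automl.predict(X_test)
print(automl.leaderboard())  # the long list of models it tried
```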

After three hours of waiting, a lot of heat from my laptop, and over 1,000 models tested, the result was: Accuracy: 0.91, Precision: 0.94, Recall: 0.91.

So, worse than my manual selection of models. I include the confusion matrices.

[Confusion matrices: AutoML vs. Random Forest]

The matrices are not that different, as the validation sets are not that large either. However, it seems that the RF is still better than the best model from autoML.

I need to work more on this and see if I did something wrong. However, I take this as a success – I’m better than autoML (there is still some use for an old professor) – rather than a let-down at not getting better results.

At the end of the day, 0.98 in accuracy is still very good!

Reproducing AI models – a guideline

Image by Pete Linforth from Pixabay

2107.00821.pdf (arxiv.org)

Machine learning has become a great tool in software engineering, for both research and development. The fact that we have access to TensorFlow, PyTorch, and other toolkits provides almost endless possibilities. Combine that with the hundreds (if not thousands) of datasets from Zenodo and co., and you can train a model for almost anything.

So far, so good, I would say. Problems (yes, there are always some problems) appear when we want to reproduce the results of others. Training a model on your own dataset and making it available is easy. Trusting such a model in a new context is not.

Imagine an ML model trained on data from Company X. We have probably tuned the parameters a lot, so the model works great there, but does it work for Company Y? Most probably it will not. Well, it will run, but the performance of its predictions is not going to be great.

So, Google has partnered with academic institutions to set up SIGMODELS and the TensorFlow Garden, initiatives aimed at making ML models more portable, experiments more replicable, and all the other goodies.

In this paper, the authors provide a set of checks that we can use to make models more transparent, which is the first step towards reproducibility. In these guidelines, the authors advocate reporting the models’ architecture, their input and output structure, building blocks, loss functions, etc.

Naturally, they also recommend reporting the metrics that were used to optimize the models, e.g. accuracy, F1-score, MCC or others. I know, these are probably essentials, but you would be surprised how many authors do not actually report them. If they are omitted, how do we know whether the metrics were simply too poor to show (low performance of the model) or not relevant (low relevance of the metrics – which would be a good thing)?
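Reporting those essentials is only a few lines away; here is a minimal sketch with scikit-learn (my example, not code from the paper):

```python
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

def report_metrics(y_true, y_pred):
    """Collect the optimization metrics the guidelines ask us to report."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred, average="weighted"),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }

print(report_metrics([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]))
```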

For now, these guidelines are only a draft, but I hope that they will become mainstream, just like the empirical guidelines from ACM (GitHub – acmsigsoft/EmpiricalStandards: Empirical standards for conducting and evaluating research in software engineering).

Towards a brighter tomorrow…

Image by Reimund Bertrams from Pixabay

https://www.amazon.com/Enlightenment-Now-Steven-Pinker-audiobook/dp/B075F8M2MC/ref=sr_1_1?crid=2AEV0AL40X4JI&dchild=1&keywords=enlightenment+now&qid=1625822982&sprefix=Enlight%2Caps%2C245&sr=8-1

I read an article the other day about the fact that we, as human beings, will only be able to extend our lives so much. I don’t remember the exact source – it could have been CNN or something like that – but the gist was that we will never be able to stop aging, or death.

I also looked at one of the modern positive thinkers – Steven Pinker – and his book “Enlightenment Now”. The book is similar, in its tone, to the work of the late Hans Rosling, providing a positive view of the development of humanity. I like this positive way of thinking, but, at the same time, I wonder about the potential new threats.

For example, new software technology requires more supervision. We need to be able to understand the risks that come with connectivity, e.g. cyber security, as well as be prepared for when the software stops working. And it will stop working at some point in time. The technology that we used in the 1990s is no longer functional. Well, yes, we do have cars that are kept alive by enthusiasts, but all the 1990s computers are in museums. Many kids do not even recognize that technology.

So, is progress always a good thing? I would say that it is good in 80% of cases; of the remaining 20%, 10% is neutral and 10% is negative. The negative 10% is the price we pay for the new things. New cars are electric, but we need more energy, or energy stored in a different way. No more liquid energy, which is relatively easy to store, but the new, fast electron energy, which is volatile. It is fast, so we can quickly transfer it from a desert solar farm, but we cannot really store it – at least not as much as we need to power the entire society.

Nevertheless, I strongly recommend Pinker’s book about the progress of humanity. I believe that we are living in a progressive and cool world – a better world than our ancestors’ – and I believe that our kids will live in an even better one.

My personal take-away from this book is to be a better teacher, mentor and advisor: to make sure that my students enjoy the courses that I give and that these courses are of value to them and to society. I hope that my course in embedded software development will evolve and prepare the students to write better software for cars, telecom networks, water pumps, wind turbines and health equipment.

What will the future bring…

https://www.amazon.se/Brief-Answers-Big-Questions-Stephen/dp/1473695996/ref=sr_1_1?dchild=1&keywords=Hawking&qid=1625726909&sr=8-1

As my summer goes on, I’ve decided to take a look at a book by one of my favorite authors and scientists – Prof. Stephen Hawking. I loved his books when I was younger, and I like the way he could bring difficult theories to the masses, like his famous “gray holes” as opposed to the black ones.

This book allowed me to reflect on some of the most common questions that people ask, even of me – like whether AI will take over, or whether we should even invest in AI. Since I’m not a physicist, I cannot answer most of the questions in the book, but the AI question is one I can at least attempt.

So, will AI take over? Is GPT-3 something to worry about? Will we be out of work as programmers? Well, not really. I think that we live in a world that is very diverse and that we need human judgement to make sure that we can live on. Take the recent cyber attack on Kaseya, a US-based company with thousands of clients. The attack affected a minority of their clients, some 40 or so (if I remember the article correctly). However, it left the entire Coop grocery chain in Sweden stranded. Food was given away for free because there was no way to take payments. Other grocery shops bought stock from the affected chain to make sure people had enough food. So, what would AI do?

Let’s think statistically for a moment. 40 customers out of ca. 40,000 is about 0.1 percent. So, ignoring this event would still give the AI 99.9% accuracy. Is this good? Statistically, it is great – almost perfect. For an AI, therefore, ignoring it would look like a great optimization: do not pay the ransom, and treat the 0.1% as a negligible error somewhere.
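The arithmetic is trivial, which is exactly the problem:

```python
# Ignoring the 40 affected customers out of ~40,000 still looks like
# an excellent "model" - statistically great, societally a disaster.
affected, total = 40, 40_000
accuracy_if_ignored = 1 - affected / total
print(f"{accuracy_if_ignored:.1%}")  # 99.9%
```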

Now, let’s think about the social value. Without knowing the rest of the affected customers, or even the ones that were not affected, I would say that the value of having food on your table trumps many other kinds of value. Well, maybe not your health, but definitely something like a car or a computer game. So, the societal impact is large. We could model that in the AI, but there is a programmatic problem: how do we calculate the value of diverse things like a car or food? There is the monetary value, of course, but it is not constant in time. For someone who has been hungry for days, the value of a sandwich is infinitely larger than the value of the same sandwich for someone who has just eaten a delicious steak. Another problem is that the value depends on the location (is there another grocery shop close by?), your own stock at home (which is individual and hard for an AI to find out), or even the ability to use another system of payment (can I just get my groceries and pay later?).

This example shows an inherent problem in finding the right data for AI. I believe this is a problem that will never really be solved. And if it cannot be solved, I think I would like to pay a few cents extra to have a human in the loop. I would like to know that there is an option, in the event of a hacker attack, to talk to a person who understands my needs and can help me – give me food without paying, knowing where I live and that I will pay later.

Until there are models which understand us humans, we need to stick to having humans in the loop. Given that there are nearly 8 billion people in the world, all potentially different and with conflicting needs, I do not think AI will be able to help us in the critical parts of society.

Thinking machines – summer reading

Tänkande maskiner: Den artificiella intelligensens genombrott (Thinking Machines: The Breakthrough of Artificial Intelligence): Häggström, Olle, Pettersson, Rasmus: Amazon.se: Books

Summer is around the corner – in some places it has actually arrived. This usually makes for relaxation and time for reflection.

I would like to recommend a book by fellow professors. The book is about how thinking machines are viewed today and about the potential of general AI (AGI). It is written in a popular-science style, but the examples and the research behind them are solid. The authors discuss the technology, but also society – the legislative aspects of AI and what it means to think, both in general and in some specific cases.

Have a great summer and stay tuned for new posts after the summer.

News from ML?

I seldom write about films and events – well, maybe actually never – but this year a lot has happened online.

What’s new in Machine Learning | Keynote – YouTube

The video above covers the news from Google about their TensorFlow library, which includes new ways of training models, compression and performance tuning, and more.

TensorFlow Lite and TensorFlow.js allow us to use the same models as on desktops, but on mobile devices. Really impressive. I’ve caught myself wondering whether I’m more impressed by the hardware capabilities of small devices or by the capabilities of the software. Either way – super cool.
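To show what I mean, here is a sketch of moving a (toy, made-up) Keras model to a mobile device with the TensorFlow Lite converter:

```python
import tensorflow as tf

# A toy model standing in for whatever was trained on the desktop.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Convert the same model into a form that runs on mobile devices.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```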

Google is not the only company announcing things. NVIDIA is also showing a lot of cool features for enterprises, with cloud access for rapid prototyping, model testing and deployment at the center.

NVIDIA Executive Keynote for Enterprise AI at COMPUTEX 2021 – YouTube

I like gaming, so this is impressive, but even more impressive is last year’s DLSS technology, which still cannot be beaten by the competition. Really nice.

Assessing the maturity of a metric team

MeTeaM – Metrics Team Maturity Model

https://doi.org/10.1016/j.jss.2021.111006

Modern software development organizations work a lot with metrics – they use them to focus their development effort, to understand their products, and, recently, to guide their online experimentation.

We’ve studied software measurement programs before, mostly from the perspective of what’s in the program – what is measured, how it is measured, who is responsible for the measurement and when the measurement is conducted.

In this article, we focus on the “softer” aspects of measurement programs – how mature the metric team is, and how mature the organization where the team works is. Studying four different companies, i.e. four different teams, we observed that the maturity of the team goes hand-in-hand with the maturity of the organization. Mature organizations can take metrics and indicators into their workflows, and thereby increase the metric team’s motivation to learn.

We’ve also found a few key factors for the success of a metric team:

  1. A formal, official mandate – which prevents the metric team from being questioned, and is also important for making the team an actual team, not a group of individuals.
  2. A link to the organization – the team needs to be part of a product or service development and to understand what is required of it.
  3. Longevity – the team needs to be there for the long run, as short-term improvements will not last very long (by definition), and therefore neither will the motivation.

I hope that this article, which is open access, will help others to understand how to evolve their teams and measurement programs.

New programming tools?

Glinda: Supporting Data Science with Live Programming, GUIs and a Domain-specific Language (acm.org)

I’m not going to add a picture here, because the actual paper contains a great one, which is copyrighted. But do we need another tool (I thought)? And if you think like that… well, think again.

Once I looked at the paper, I really liked the idea. This is a tool that combines the programming tasks of software engineers with tasks like data exploration, labelling and cleaning. It is similar in kind to a Jupyter Notebook, but it allows you to interact with the data in a deeper way.

I strongly recommend taking a look at the tool. I’ve done a quick check and it looks really nice.