Quality of Deep Learning code – article review

A (deep) Staircase in Vatican, Image by JEROME CLARYSSE from Pixabay


Deep learning models are often designed, trained and tested in Python. It is a language with a nice structure, quite straigtforward syntax and a lot of libraries. However, very few tutorials about deep learning (or any Python programming tutorials) discuss the quality of the code, e.g. its modularization, encapsulation, naming consistency.

As a result, a lot of code for machine learning, written in Python, often is hard to read and hard to grasp. Even if used as part of jupyter notebooks, the code is not really commented (often).

The study behind the link above is a study that supports my long gut feeling about this. The findings show that (from the abstract): First, long lambda expression, long ternary conditional expression, and complex container comprehension smells are frequently found in deep learning projects. That is, deep learning code involves more complex or longer expressions than the traditional code does. Second, the number of code smells increases across the releases of deep learning applications. Third, we found that there is a co-existence between code smells and software bugs in the studied deep learning code, which confirms our conjecture on the degraded code quality of deep learning applications.

The second finding, about the constant increase of the number of code smells, is similar to the studies we did in proprietary software about complexity – the complexity “never” decreases ( http://web.student.chalmers.se/~vard/files/Monitoring%20Complexity%20Evolution.pdf ).

The study compares 59 deep learning systems with 59 non-ML systems from GitHub. One could argue that the sample is not representative (no propprietary systems), but it is a fair sample.

To sum up, a very nice reading, showing that we need to think about quality, not only models, but also code quality.

Evidence of improvement using Agile…

Towards the end of the year I’d like to make a small reflection on Agile software development. It’s been discussed for a number of years now, yet the evidence of bringing measurable results is rather scarce. Here is one article from Åby Academy in Finland which studies a transformation of a large company to Agile: https://www.researchgate.net/profile/Marta_Olszewska_Plaska/publication/280711876_Did_it_actually_go_this_well_a_Large-Scale_Case_Study_on_an_Agile_Transformation/links/55c1d7ea08aeb28645819d3f.pdf

Studied case: Ericsson

Size: ca. 350 people

Product: roughly 10 years old

Languages: RoseRT, C++, Java

Summary of results: Agile software development provided more features (5x) and faster (60%).

What I like about the paper is that it provides the measurement before the transformation, DURING the transformation and after. Very interesting reading!

Do SysML requirement diagrams help?

Today I’ve had a privilege to present a paper at EASE 2014 done in collaboration with University of Basilicata in Italy.

Link to presentation

The paper is an experimental validation of whether requirement diagrams speed up the understanding of requirement specifications or whether they increase/decrease comprehension. The results show that the comprehension is increased while there is no change in time.

Evolution of Long-Term Industrial – new paper

Darko Durisic has done an interesting work on the evolution of industrial-class meta-models. The work has been accepted as full paper at SEAA (Software Engineering for Advanced Applications) Euromicro Conference.

Title: Evolution of Long-Term Industrial Meta-Models – A Case Study

Abstract: Meta-models in software engineering are used to define properties of models. Therefore the evolution of the metamodels influences the evolution of the models and the software instantiated from them. The evolution of the meta-models is particularly problematic if the software has to instantiate two versions of the same meta-model – a situation common for longterm software development projects such as car development projects. In this paper, we present a case study of the evolution of the standardized meta-model used in the development of the automotive software systems – the AUTOSAR meta-model – at Volvo Car Corporation. The objective of this study is to assist the automotive software designers in planning long term development projects based on multiple AUTOSAR meta-model versions. We achieve this by visualizing the size and complexity increase between different versions of the AUTOSAR meta-model and by calculating the number of changes which need to be implemented in order to adopt a newer AUTOSAR meta-model version. The analysis is done for each major role in the Automotive development process affected by the changes.

Stay tuned for the full version of the paper and congrats to Darko!

Choosing reliability growth models…

In our recent research we’ve looked at a number of ways on how to support software development companies in working with reliability modelling.

I have come across this article on how to choose a model – a sys rev. They authors look at a number of criteria and evaluate which ones are the most used in choosing models. Nice and interesting reading.

Link to full text

Article highlight: Measures and external quality…

Article highlight: Empirical evidence on the link between object-oriented measures and external quality attributes: a systematic literature review
Link to full text

This article presents an interesting systematic review where the authors set off to look for evidence of correlation between OO metrics and quality. What I like about this paper:

  • nice overview of which metrics suites exist for OO programs
  • nice overview which external quality metrics are used
  • essentially only 99 studies exist which have the right scope and quality
  • the number of studies seem to be growing – even in the past 2-3 years
  • the 20-year old CK suite is still the most popular one

Recommended reading for those who want to see which metrics are the best predictors, when and why.

How much does it cost to be ready with testing? (review)

How much does it cost to be ready with testing?

Link to full text

In our research work we stumbled upon a question of monitoring whether the product is ready to release (Staron et al, “Release Readiness Indicator for Mature Agile and Lean Software Development Projects”, XP 2012). We could identify indicators which could show how many weeks to release the organization have given their testing and development speed.

In this article we could see a complement to our work since it presents a cost model for how much testing is needed to achieve a specific release pace. Interesting work, waiting to be validated in industrial contexts.

Predicting risk of pre-release code changes…

Predicting risk of pre-release code changes with Checkinmentor

This recently published paper shows a very nice approach of monitoring of what kind of patterns in pre-release code changes can be risky w.r.t. fault proneness of software components. The paper shows experiences of analyzing Windows Phone software at Microsoft done in one of my favorite places – Microsoft Research.

In the context of this work the module is risky if it can cause a bug fix after the release and the metrics used are both those of a source code and of the organization behind the product development. They have found that the change size metrics are the most prominent ones. This means that the more code one checks in, the higher the risk of having a bug…

Link to full text

Choosing reliability growth model for open source software (new article review)

Choosing reliability growth model for open source software, online first from IEEE Computer

Link to full text at IEEE

Predicting the number of unknown defects has always been an important problem to solve. A lot has been done in the area and a lot will be done before the problem is solved.

This paper highlights different types of reliability models (e.g. Convex, Concave) and how to choose between them for open source projects. It’s a magazine article so it reads nicely and gives useful pointers. Recommended as Friday evening reading:)

Measures for the Lean Start-ups

During my development of course material for DIT595 (Industrial Best Practice) for the Bachelor Program in Software Engineering and Management I got inspired by the Lean Start-up by Eric Ries (Crown Business Publishing, theleanstartup.com). The book is a very good material for entrepreneurs willing to start their own businesses. It is also very good for our students who want to do their bachelor and master theses in industry.

I also did some extra search for more articles on how measuring should be done in the lean start-ups. I have found the following article with tips for creating metrics – 8 tips…. The article proposes the following:

    • Be actionable
    • Be understandable and trustworthy
    • Measure results
    • Understand the downside
    • Understand the upside

Since the article is free I will not quote more – I recommend reading it and reflecting upon the metrics that we create.