Quality of Deep Learning code – article review

A (deep) Staircase in Vatican, Image by JEROME CLARYSSE from Pixabay

http://swat.polymtl.ca/~foutsekh/docs/hadhemi-MSR2020.pdf

Deep learning models are often designed, trained and tested in Python. It is a language with a nice structure, quite straightforward syntax and a lot of libraries. However, very few tutorials about deep learning (or any Python programming tutorials) discuss the quality of the code, e.g. its modularization, encapsulation or naming consistency.

As a result, a lot of machine learning code written in Python is hard to read and hard to grasp. Even when it is part of Jupyter notebooks, the code is often poorly commented.

The study linked above supports my long-standing gut feeling about this. The findings show that (from the abstract): First, long lambda expression, long ternary conditional expression, and complex container comprehension smells are frequently found in deep learning projects. That is, deep learning code involves more complex or longer expressions than the traditional code does. Second, the number of code smells increases across the releases of deep learning applications. Third, we found that there is a co-existence between code smells and software bugs in the studied deep learning code, which confirms our conjecture on the degraded code quality of deep learning applications.
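
The three smell categories from the first finding are easiest to grasp with an example. The snippet below is my own hypothetical illustration of what these smells can look like in deep learning code (with a more readable alternative for one of them); it is not taken from the studied projects.

    # Hypothetical illustrations of the smell categories the paper reports;
    # the data and logic are made up for this example.
    epoch = 42
    labels = [0, 1, 0, 1]
    scores = [0.9, 0.2, 0.7, 0.6]

    # Long lambda expression: preprocessing squeezed into one anonymous function.
    normalize = lambda batch, mean, std: [(x - mean) / std if std != 0 else x - mean
                                          for x in batch]

    # Long ternary conditional expression: a learning-rate schedule in one line.
    lr = 0.1 if epoch < 10 else (0.01 if epoch < 50 else (0.001 if epoch < 90 else 0.0001))

    # Complex container comprehension: nested loops and conditions in one expression.
    pairs = [(i, j)
             for i in range(len(labels))
             for j in range(len(labels))
             if labels[i] == labels[j] and i != j and scores[i] > 0.5]

    # The same schedule as a named function is longer, but much easier to read.
    def learning_rate(epoch):
        if epoch < 10:
            return 0.1
        if epoch < 50:
            return 0.01
        if epoch < 90:
            return 0.001
        return 0.0001

    print(lr, learning_rate(epoch), normalize([1.0, 2.0, 3.0], 2.0, 1.0), pairs)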

The second finding, about the constant increase in the number of code smells, is similar to the studies we did on complexity in proprietary software – the complexity “never” decreases ( http://web.student.chalmers.se/~vard/files/Monitoring%20Complexity%20Evolution.pdf ).

The study compares 59 deep learning systems with 59 non-ML systems from GitHub. One could argue that the sample is not representative (it contains no proprietary systems), but it is a fair sample.

To sum up, a very nice read, showing that we need to think about quality: not only the quality of the models, but also the quality of the code.

Choosing reliability growth model for open source software (new article review)

Choosing reliability growth model for open source software, online first from IEEE Computer

Link to full text at IEEE

Predicting the number of unknown defects has always been an important problem to solve. A lot has been done in this area, and a lot more will be done before the problem is solved.

This paper highlights different types of reliability growth models (e.g. convex, concave) and how to choose between them for open source projects. It’s a magazine article, so it reads nicely and gives useful pointers. Recommended as Friday evening reading :)
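
As a reminder of what such models look like: the Goel-Okumoto model is a classic example of a concave reliability growth model, with mean value function m(t) = a(1 - e^(-b*t)), where a is the expected total number of defects and b the detection rate. Below is a minimal sketch of fitting it to made-up weekly defect data with scipy; the data, parameter values and choice of model are my own illustration, not taken from the article.

    # Fit the Goel-Okumoto (concave) reliability growth model to
    # hypothetical cumulative defect counts; all data is made up.
    import numpy as np
    from scipy.optimize import curve_fit

    def goel_okumoto(t, a, b):
        # a: expected total number of defects, b: defect detection rate
        return a * (1.0 - np.exp(-b * t))

    weeks = np.arange(1, 11)
    cumulative_defects = np.array([5, 9, 13, 15, 18, 19, 21, 22, 22, 23])

    (a_hat, b_hat), _ = curve_fit(goel_okumoto, weeks, cumulative_defects, p0=[30.0, 0.1])

    # Residual defects = estimated total minus what has already been found.
    print(f"estimated total defects: {a_hat:.1f}")
    print(f"estimated remaining defects: {a_hat - cumulative_defects[-1]:.1f}")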

Agile metrics, Software Engineering Institute, review

Agile metrics in technology acquisition
Link to full text

Recently, the Software Engineering Institute published an interesting report on the use of Agile metrics in DoD contracts. It defines a few metrics of interest:

    • Velocity – the volume of work accomplished in a specified period of time
    • Sprint Burn-Down – the development team’s progress during a sprint
    • Release Burn-Up – a measure of release readiness

The report also recognizes a number of more advanced metrics and discusses their use and their relation to DoD standards, which makes it a nice read.
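
To make the three basic metrics concrete, here is a small sketch of how they could be computed from made-up sprint data; this is my own illustration, not an implementation prescribed by the report.

    # Hypothetical sprint data; all numbers are made up for illustration.
    completed_per_sprint = [21, 18, 25, 23]   # story points finished per sprint
    release_scope = 120                       # total story points in the release

    # Velocity: volume of work accomplished in a specified period of time,
    # here the average story points completed per sprint.
    velocity = sum(completed_per_sprint) / len(completed_per_sprint)

    # Sprint burn-down: remaining work for the team during one sprint.
    sprint_scope = 25
    burned_per_day = [3, 5, 2, 4, 0, 6, 3, 2]  # points finished each day
    remaining = [sprint_scope]
    for burned in burned_per_day:
        remaining.append(remaining[-1] - burned)

    # Release burn-up: cumulative completed points approaching the release scope.
    burn_up = []
    total = 0
    for points in completed_per_sprint:
        total += points
        burn_up.append(total)

    print(f"velocity: {velocity:.1f} points/sprint")
    print(f"sprint burn-down: {remaining}")
    print(f"release burn-up: {burn_up} (scope: {release_scope})")
    print(f"estimated sprints to release: {(release_scope - total) / velocity:.1f}")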

@MiroslawStaron