Engineering machine learning systems is much more than train-evaluate cycles. It means that we need to systematically integrate these ML systems with the rest of the component. We need to build safety-cages to ensure that the decisions are not out-of-bounds and we need to make sure that we can maintain these systems.
In this paper, the authors studied an example of automated driving vehicles, not fully autonomous (but still) and shown the challenges that we need to solve before AI and ML becomes one of our “fellow drivers” on the roads.
The findings of the paper show that it’s not going to happen soon. As the authors say in the abstract: “Our results show that machine learning models are characterized by a lack of requirements specification, lack of design specification, lack of interpretability, and lack of robustness. We also perform a gap analysis on a conventional system quality standard SQuaRE with the characteristics of machine learning models to study quality models for machine learning systems. We find that a lack of requirements specification and lack of robustness have the greatest impact on conventional quality models. “
The authors provide a process for machine learning models as part of safety critical software, where the designing of the system and its real-scenario validation are a bit more apart than traditionally.
What I really like about the paper is the gap analysis of ML systems and ISO 25000 quality model. For example, as shown in this table: https://link-springer-com.ezproxy.ub.gu.se/article/10.1007/s10994-020-05872-w/tables/2