Image by F. Muhammad from Pixabay
Machine learning and deep learning are only as good as the data used to train them. However, even the best data sources can lead to data of non-optimal quality. Noise is one of the exampes of the data problems.
Our research team has studied the impact of noise on machine learning in software engineering – mostly on the testing data. In this paper we present one techniques to identify noise, measure it and reduce it. There are several techniques to do it, but we use one of the more robust ones – removal of noise.
I recommend to take a look at how the algorithms work and let us know if you find it interesting!