Publications

Saimple: detecting a measurement bias

16 December 2021

This use case highlights the detection of a measurement bias and shows how data augmentation can provide an answer to it.

Measurement biases can occur in any dataset and are not easy to detect. Saimple helps to detect these biases and data augmentation can be a solution to resolve them.

Indeed, data augmentation is a key technique for artificially enriching datasets. However, one should be aware that not every data augmentation is beneficial to the network, and telling good data augmentation from bad is itself a challenge. Saimple helps to tell the difference and to understand the impact of each data augmentation on deep learning models.

1. Use case presentation:

An initial objective of this use case was to compare the performance of a model on generated data against real-world data. To do this, the study used images of mechanical parts as data. The images are grouped into 4 categories (which will be the classes of the model):
1 - 'nut': nuts, in various shapes
2 - 'bolt': screws of different shapes and lengths
3 - 'washer': washers
4 - 'locatingpin': locating pins of different shapes and lengths

All images of the initial dataset are 3D representations of mechanical parts; there are no images of real parts. The dataset contains 1904 images of each class.

Example of some representations of mechanical parts constituting the dataset

Using this dataset, we intended to highlight a sample bias. However, when analysing the results, we realised that there was also a measurement bias in the data. Here we will show how we found and resolved it with Saimple.

Definitions:

- Measurement bias: this bias is related to the measurement tools used to collect the data. Acquiring data with a single tool, or the way the data is stored, can introduce a measurement bias. For example, if photos are taken with a single type of camera, and that camera adds a watermark to each photo, a measurement bias may occur.

- Sample bias: sample bias can occur when the training data does not reflect the reality or the environment in which the model will be evaluated. The most obvious example of this type of bias is a face detection model trained only on images of white males; the model would be biased because women and other ethnicities are not represented.

Model construction:

In this use case, we will try to classify the images into the 4 classes using a convolutional neural network.

Training on the initial dataset seems to be efficient: after 10 epochs, the model reaches almost 95% accuracy on the validation data.
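
As an illustration, the sketch below shows one possible way of setting up such a classifier. The framework, the architecture, the input size and the folder layout are not given in the article, so Keras/TensorFlow, the layer sizes, the 128×128 input and the "dataset/" directory are assumptions.

    # Minimal sketch of a 4-class CNN classifier. The framework (Keras/TensorFlow),
    # the layer sizes, the 128x128 input resolution and the "dataset/" folder layout
    # are assumptions: the article does not specify them.
    import tensorflow as tf
    from tensorflow.keras import layers, models

    IMG_SIZE = (128, 128)   # assumed input resolution
    NUM_CLASSES = 4         # nut, bolt, washer, locatingpin

    model = models.Sequential([
        layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Assumed layout: one sub-folder per class under "dataset/".
    train_ds = tf.keras.utils.image_dataset_from_directory(
        "dataset/", validation_split=0.2, subset="training",
        seed=42, image_size=IMG_SIZE)
    val_ds = tf.keras.utils.image_dataset_from_directory(
        "dataset/", validation_split=0.2, subset="validation",
        seed=42, image_size=IMG_SIZE)

    model.fit(train_ds, validation_data=val_ds, epochs=10)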

2. Analysis and results:

First, we test the performance of the model on images from the dataset. Testing on 6 images per class, we obtain the following result: the accuracy on these 24 images is 87.5%. The network therefore performs rather well. However, we note that it has difficulty distinguishing between similar classes (locatingpin and bolt, for example).

We will now look at the results of the model on real-world data, retrieving images by scraping Google Images.

The accuracy on 24 internet images is 20.8%. Since the model has 4 classes, it can be said that the model is not performing at all: an algorithm classifying the images at random would achieve a comparable score (25%). The model is therefore not adapted to "real" images. Let's try to determine the cause with Saimple.

Relevance study with Saimple:

Thanks to Saimple we can observe relevance masks and understand which pixels are important for the neural network's decision. In doing so, we notice strange patterns of relevance.

Relevance masks for nuts

On every type of image (real images as well as 3D images, whether from the dataset or not), we notice that the relevance forms a kind of spot pattern on the image and does not take the mechanical part into account at all. Moreover, the predictions of the model are only good on images coming from the dataset. One can therefore assume that there is a measurement bias in the images: they may have been watermarked for copyright purposes.

Bias Detection:

We first perform two checks that could reveal the presence of a watermark:

  • Searching for watermarks in the LSBs:

LSB (Least Significant Bit) watermarking is an invisible form of watermarking, so we check that the images do not carry this type of mark. To do this, we create Python functions that decompose each image into its 8 bit planes, from the least significant to the most significant. Visually, we notice nothing abnormal in the different planes, so we assume that there is no watermarking in the LSBs.
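
The article's Python functions are not reproduced here; the following is a minimal sketch of such a bit-plane decomposition, assuming NumPy and Pillow ("bolt_001.png" is a placeholder file name).

    # Sketch of a bit-plane decomposition used to look for LSB watermarks.
    # NumPy and Pillow are assumed; "bolt_001.png" is a placeholder file name.
    import numpy as np
    from PIL import Image

    def bit_planes(path):
        """Return the 8 bit planes of a greyscale version of the image,
        from least significant (index 0) to most significant (index 7)."""
        img = np.array(Image.open(path).convert("L"))
        return [((img >> bit) & 1) * 255 for bit in range(8)]

    # Save each plane for visual inspection.
    for i, plane in enumerate(bit_planes("bolt_001.png")):
        Image.fromarray(plane.astype(np.uint8)).save(f"plane_{i}.png")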

  • Applying filters to the images:

We then vary the different components of the image (contrast, brightness, colours, etc.) to see if any marks stand out. Here again, nothing is detected with the naked eye.
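
Assuming Pillow again, one simple way to run these checks is to sweep the enhancement factors and inspect the results by eye; the factor values and the file name below are purely illustrative.

    # Sketch: vary contrast, brightness and colour saturation with Pillow to see
    # whether a hidden mark becomes visible. Factors and file name are illustrative.
    from PIL import Image, ImageEnhance

    img = Image.open("bolt_001.png").convert("RGB")   # placeholder file name

    for factor in (0.25, 0.5, 2.0, 4.0):
        ImageEnhance.Contrast(img).enhance(factor).save(f"contrast_{factor}.png")
        ImageEnhance.Brightness(img).enhance(factor).save(f"brightness_{factor}.png")
        ImageEnhance.Color(img).enhance(factor).save(f"colour_{factor}.png")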


We find nothing so we try to adapt the dataset in order to bypass this measurement bias.

  • Transformation of the dataset into greyscale:

We first assume that the bias comes from the RGB components of the image (a watermark on one or more of the RGB channels). We therefore convert the images to greyscale in order to remove the colour information, creating a second dataset containing the same images in shades of grey. With Saimple, we again check the relevance masks to see whether the spots are still there or whether the mechanical part is finally analysed correctly.

Screw relevance mask – greyscale dataset

The results obtained are slightly better (30% accuracy on real data), and the first relevance points seem to be correct. On the other hand, we notice that the measurement bias is still present: the more points we take, the more spots appear. Switching to greyscale slightly improved the model's performance and reduced the dataset bias, but did not completely eliminate it. One likely reason is that the greyscale conversion is only a linear operation (grey = 0.299 R + 0.587 G + 0.114 B).
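
For reference, this is exactly the conversion that Pillow applies when an image is converted to "L" mode, so a greyscale copy of the dataset can be produced with one call per image (sketch with a placeholder file name):

    # Greyscale conversion: grey = 0.299*R + 0.587*G + 0.114*B.
    # Pillow's "L" mode applies this linear combination (ITU-R 601-2 luma).
    from PIL import Image

    rgb = Image.open("bolt_001.png")          # placeholder file name
    rgb.convert("L").save("bolt_001_grey.png")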

  • Transformation of the dataset into black and white:

An attempt is then made to convert the dataset into black and white images. This conversion completely removes the bias in the RGB channels.
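
A minimal binarisation sketch, assuming Pillow; the threshold of 128 is an assumption, as the article does not state which threshold was used:

    # Black-and-white conversion: every pixel becomes 0 or 255.
    # The threshold of 128 is an assumption; the article does not specify one.
    from PIL import Image

    grey = Image.open("bolt_001_grey.png")    # placeholder file name
    bw = grey.point(lambda p: 255 if p >= 128 else 0)
    bw.save("bolt_001_bw.png")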

Nut relevance mask – black and white dataset

This conversion again improves the performance of the model on real-world data (45% accuracy), but the accuracy on 3D images is lower (about 80%, compared to 90% previously). The prediction results on real images are therefore better, but it is difficult to imagine the network achieving much better performance than this.

Indeed, we lose a lot of information with this conversion: each pixel goes from 3 channels with 256 possible values each (0 to 255) to a single channel with only two possible values (0 or 255). This representation is therefore not viable and not representative of the real world.


However, the relevance results obtained via Saimple show that the measurement bias initially present in the data has been completely removed: the spots no longer appear.

Use of data augmentation:

Some invisible image watermarks can be removed by the operations used for data augmentation (rotations, flips, etc.). We therefore try to train our model on a data-augmented dataset, applying the following transformations (a sketch of this pipeline follows the list):
- we flip the images randomly;
- we apply a rotation between 0° and 20° to the images;
- we shift the images in height and width over an interval of [-0.2 × image size; 0.2 × image size].
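
The article does not name the augmentation tool. Assuming the same Keras setup as in the earlier sketch, an ImageDataGenerator configured as follows is one plausible equivalent of the transformations listed above:

    # Sketch of the augmentation pipeline described above, assuming Keras.
    # The article does not name the tool used; ImageDataGenerator is one possibility.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(
        horizontal_flip=True,    # random flips (the flip axis is not specified,
        vertical_flip=True,      # so both are enabled here)
        rotation_range=20,       # rotations between 0 and 20 degrees
        width_shift_range=0.2,   # shifts of up to 0.2 x image width
        height_shift_range=0.2,  # shifts of up to 0.2 x image height
    )

    # Assumed layout: one sub-folder per class under "dataset/".
    train_gen = datagen.flow_from_directory(
        "dataset/", target_size=(128, 128), class_mode="sparse", batch_size=32)

    # The model sketched earlier can then be trained with, e.g.:
    # model.fit(train_gen, epochs=15)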

Training the network takes longer because there is more data to process due to the data augmentation. After 15 epochs the model reaches an accuracy of 86% on the validation data; the performance is good, but not as good as with the original data. This is because the model is now closer to reality: the previous model had a better accuracy, but it was being misled by the bias, producing false positives for example. And by analysing the relevance in Saimple, we notice that the measurement bias has been completely removed.

Nut relevance mask – data augmented dataset

The relevance of the model trained with data augmentation seems much more consistent. Saimple reveals the benefit of the data augmentation performed: this model is much more usable than the one trained on black-and-white images. On the other hand, the model still has difficulty classifying real images, due to another bias: the sample bias.

3. Conclusion:

The model trained with data augmentation seems to be the most interesting, because it allows the measurement bias to be removed without losing information. On the other hand, these models are still subject to the sample bias problem. This issue will also be dealt with, using Saimple, in another use case.

In this use case, Saimple made it possible to detect a measurement bias and to ensure that the model was no longer subject to it, by providing relevance analyses throughout the training process.

It also made it possible to check whether the data augmentation was beneficial or not, by showing its direct impact on the model.

If you are interested in Saimple, want to know more about the use case, or would like access to a demo environment of Saimple:

Contact us: support@numalis.com

Annexes

Picture credit: fabio (Unsplash)

Numalis

We are an innovative French software company providing tools and services to make your neural networks reliable and explainable.
