Numalis

- Calibrating your training dataset rapidly
- Defining the kinds of perturbations and risks expected on your system
- Optimizing your data augmentation effort

1- Saimple helps: during the specification phase

How Saimple can help you during the specification phase, in 3 steps:

1- ASSEMBLE YOUR TRAINING SET, TRAIN ON IT, SEE WHAT THE NETWORK IS LEARNING

Starting from an off-the-shelf architecture and a raw training set is every AI engineer's first step. Before any customization, the engineer needs to select a promising architecture and see how it performs on the dataset. However, training sets can be incomplete or biased, which can distort the performance assessment. Usually the engineer only sees the network's performance score on the dataset, for example through a confusion matrix (see Fig. 1). At first the engineer is blind to the causes behind that score, because the artificial neural network does not explain what is affecting its performance. Each guess the engineer makes has to be tested by training the network again, and this iterative process can take a very long time.
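To make the starting point concrete, here is a minimal sketch of what a confusion matrix gives the engineer, written in plain Python. The class names and labels are hypothetical and chosen only for illustration; this is not Saimple's output. It shows which classes are confused with each other, but says nothing about why.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, classes):
    """Return a nested dict: matrix[true_label][predicted_label] -> count."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in classes} for t in classes}

def most_confused_pairs(matrix):
    """List off-diagonal (true, predicted, count) entries, largest count first."""
    pairs = [
        (t, p, n)
        for t, row in matrix.items()
        for p, n in row.items()
        if t != p and n > 0
    ]
    return sorted(pairs, key=lambda entry: -entry[2])

# Hypothetical annotations and predictions, for illustration only.
classes = ["Street", "Truck", "Helicopter"]
y_true = ["Street", "Street", "Street", "Truck", "Truck", "Helicopter"]
y_pred = ["Truck", "Truck", "Street", "Truck", "Truck", "Street"]

matrix = confusion_matrix(y_true, y_pred, classes)
top = most_confused_pairs(matrix)
print(top[0])  # the (true, predicted, count) pair with the most errors
```

In this toy data the most frequent error is Street images predicted as Truck. The matrix tells the engineer where to look, not what the network has actually learned.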

Here is how Saimple can help your engineer adapt the neural network training dataset.

Using its unique explainability process for neural networks, Saimple can show, for every tested image, why it is correctly or wrongly classified. In the image processing example below, the engineer knows that the image is wrongly classified as Truck despite being annotated as Street (Fig. 2). With Saimple he can see that some features of Truck images are recognized on this image (Fig. 3). He then checks correctly classified images of Trucks and discovers that the main feature recognized as Truck seems to be the brown boxes loaded on them, which is not what he intended (Fig. 4). He therefore updates the dataset so that the boxes appear less often in the images of Trucks. After retraining, the image is correctly classified and the features motivating this classification are the correct ones.
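The idea of asking "which part of the image drives this classification?" can be illustrated with a generic technique called occlusion sensitivity. The sketch below is not Saimple's method, which the text does not detail; it is a simple stand-in under stated assumptions. The function `toy_truck_score` is a hypothetical replacement for a trained classifier's Truck score, built to react only to a bright corner that plays the role of the brown boxes.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Score drop when each patch x patch block is masked.

    A larger drop means the masked region mattered more for the score.
    """
    height, width = len(image), len(image[0])
    base = score_fn(image)
    drops = {}
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            masked = [row[:] for row in image]  # copy so the input is untouched
            for r in range(top, min(top + patch, height)):
                for c in range(left, min(left + patch, width)):
                    masked[r][c] = fill
            drops[(top, left)] = base - score_fn(masked)
    return drops

# Hypothetical stand-in for a trained network's "Truck" score (assumption):
# it only looks at the brightness of the lower-right 2x2 corner.
def toy_truck_score(image):
    return sum(image[r][c] for r in (2, 3) for c in (2, 3)) / 4.0

image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
]

drops = occlusion_map(image, toy_truck_score)
hottest = max(drops, key=drops.get)  # top-left corner of the most influential patch
print(hottest, drops[hottest])
```

Masking the lower-right patch removes the whole score, while masking any other patch changes nothing. A result like this would suggest that the score depends on the boxes rather than on the truck itself.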

By doing so, your engineer discovers which classes are most likely to cause confusion in the classification process. He also learns why some classes overlap with others. He can therefore take appropriate action to correct the dataset and quickly improve the performance of your network, and thus of the image analysis.

(Fig. 1 - A confusion matrix, which is good to measure performance but not very helpful to interpret the results)


(Fig. 2 - A street image wrongly classified as "Truck")


(Fig. 3 - Features of Truck mostly recognized by the AI system on the original image)


(Fig. 4 - Example of image of Truck in training set with the features learned by the AI system.)

2- IMPROVE THE ARCHITECTURE TO ENSURE BETTER COMPUTER VISION PERFORMANCE

Once the training set is stable enough, you can try to optimize the neural network architecture to improve the overall image analysis performance on the test dataset. Normally this optimization is done manually, using intuition and experience. Each optimization is a trial-and-error attempt, and much time can be invested without any certainty of a significant improvement.

Here is how Saimple can help your engineer find a better-performing architecture. After training the neural network, it appears that the computer vision system does not always perform well on certain images of the class Helicopter. The engineer suspects that, inside the network, some layers fail to capture the features of the landing gear. Usually he has to change the layers iteratively, modifying their size and their number until an adequate solution is found. Some use automated processes to search for this solution, but every candidate still has to be trained and evaluated, which can be long and tedious. With Saimple your engineer can visualize, at each layer, how the network handles the input information (Fig. 1 below). By doing this, the engineer can discover that the landing gear features he is interested in are lost at the second hidden layer. Therefore any attempt to preserve these features should be made on this layer, or at least near it.


Using this process, your engineer quickly discovers where to change the network to improve its performance.
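The notion of a feature being "lost at a given layer" can be sketched generically: run two inputs that differ only by the feature of interest through the network, record the activations of every layer, and look for the first layer where the two traces become identical. This is an illustrative assumption about how such a check could be done, not a description of Saimple's internals. The tiny network and its weights below are hypothetical.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def dense(weights, values):
    """weights is a list of rows; returns the matrix-vector product."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]

def forward_with_trace(layers, x):
    """Run x through ReLU dense layers, keeping every layer's activations."""
    trace = []
    for weights in layers:
        x = relu(dense(weights, x))
        trace.append(x)
    return trace

def first_layer_losing_feature(layers, with_feature, without_feature, tol=1e-9):
    """1-based index of the first layer whose activations no longer differ."""
    trace_a = forward_with_trace(layers, with_feature)
    trace_b = forward_with_trace(layers, without_feature)
    for index, (a, b) in enumerate(zip(trace_a, trace_b), start=1):
        if max(abs(p - q) for p, q in zip(a, b)) <= tol:
            return index
    return None  # the feature still influences the output

# Hypothetical toy network; the input is [body, landing_gear].
layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # hidden layer 1 passes both inputs through
    [[1.0, 0.0], [1.0, 0.0]],  # hidden layer 2 ignores the second unit
    [[1.0, 1.0]],              # output layer
]

lost_at = first_layer_losing_feature(layers, [1.0, 1.0], [1.0, 0.0])
print(lost_at)
```

In this toy setup the landing gear signal survives the first hidden layer and disappears at the second, so that is where an architectural change would be worth trying first.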

(Fig. 1 - A view of what is discovered at each layer by the neural network)

3- PERFORM AN OPTIMIZED DATA AUGMENTATION

To improve a dataset, engineers can use data augmentation techniques. Although these techniques are well understood, they are sometimes used without any real need. They can greatly improve the robustness of a network, but they rapidly increase the cost of training.

Applying them only when necessary, and only to the extent needed, is essential to avoid costly delays. Saimple can help determine how much data augmentation is actually needed. For further information please refer to the page on Data Augmentation.
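As a rough illustration of augmenting "only to the extent needed", the sketch below adds a horizontally flipped copy only for classes whose measured accuracy falls under a threshold. The selection rule, the threshold value and the accuracy figures are all assumptions made for this example; they are not how Saimple decides where augmentation is required.

```python
def hflip(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]

def augment_where_needed(dataset, per_class_accuracy, threshold=0.9):
    """Append a flipped copy only for samples of classes scoring under threshold."""
    weak = {label for label, acc in per_class_accuracy.items() if acc < threshold}
    extra = [(hflip(image), label) for image, label in dataset if label in weak]
    return dataset + extra, weak

# Hypothetical dataset and per-class accuracies, for illustration only.
dataset = [
    ([[1, 2], [3, 4]], "Truck"),
    ([[5, 6], [7, 8]], "Helicopter"),
    ([[0, 1], [1, 0]], "Truck"),
]
per_class_accuracy = {"Truck": 0.95, "Helicopter": 0.70}

augmented, weak = augment_where_needed(dataset, per_class_accuracy)
print(len(dataset), "->", len(augmented), "samples; augmented classes:", weak)
```

Flipping every sample would double the training set from 3 to 6 images, whereas the targeted version grows it to 4. On a real dataset the same difference translates directly into training time.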

Image credit: Christopher Gower (Unsplash)
