Assignment 9: Introspection

Deadline: December 13th, 9am

In this assignment, we will implement gradient-based model analysis both for creating saliency maps (local) and for feature visualization (global). You can also take inspiration from the DeepDream tutorial. It is recommended that you work on image data as this makes visual inspection of the results simple and intuitive.

You are welcome to use pre-trained ImageNet models from the tf.keras.applications module. The tutorial linked above uses an Inception model, for example. Note that these models generally expect rather large inputs, for example 224x224 pixels. However, you can generally take arbitrary images and resize them (e.g. tf.image.resize) to the necessary dimensions. Also note that these models generally require specific pre-processing of the input; each module has its own functions for this, see the API.
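For concreteness, here is a minimal sketch of this workflow. The choice of MobileNetV2 and the 224x224 input size are just assumptions; any other model from tf.keras.applications works analogously with its own preprocess_input and decode_predictions helpers, and the random tensor merely stands in for a real image.

```python
import tensorflow as tf

# Load a pre-trained ImageNet classifier (the model choice is an arbitrary example).
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Stand-in for a real image: any (H, W, 3) tensor with values in [0, 255].
raw_image = tf.random.uniform((500, 375, 3), maxval=255.0)

# Resize to the resolution the model expects and apply its specific pre-processing.
resized = tf.image.resize(raw_image, (224, 224))
batch = tf.keras.applications.mobilenet_v2.preprocess_input(resized[tf.newaxis, ...])

# Forward pass and human-readable top-3 predictions.
preds = model(batch)
print(tf.keras.applications.mobilenet_v2.decode_predictions(preds.numpy(), top=3))
```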

You can of course also train your own models on CIFAR or something similar. Smaller data and models usually make everything much faster. :) You can also prototype on CIFAR and then try to generalize to bigger images/models once everything works.

Gradient-based saliency map (sensitivity analysis)

Run a batch of inputs through the trained model. Wrap this in a GradientTape in which you watch the input batch (the batch size can be 1 if you'd like to produce just a single saliency map) and compute the gradient of a particular logit, or its softmax output, with respect to the input. This tells us how a change in each input pixel would affect the class output. This already gives you a batch of gradient-based saliency maps! Plot the saliency map next to the original image or superimpose it. Do the saliency maps seem to make sense? How would you interpret them?
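A minimal sketch of this procedure, assuming `model` is a trained classifier and `images` is an already pre-processed float batch; the helper name and the reduction over colour channels are my own choices, not prescribed:

```python
import tensorflow as tf

def saliency_maps(model, images, class_index):
    """Gradient of one class score w.r.t. the input pixels, per image."""
    images = tf.convert_to_tensor(images)
    with tf.GradientTape() as tape:
        tape.watch(images)                 # inputs are not Variables, so watch them explicitly
        scores = model(images)             # logits or softmax outputs, shape (B, num_classes)
        target = scores[:, class_index]    # score of the class of interest
    grads = tape.gradient(target, images)  # shape (B, H, W, C)
    # One common way to get a single 2D map: max of absolute gradients over channels.
    return tf.reduce_max(tf.abs(grads), axis=-1)
```

You can then display the result with matplotlib next to the original image (e.g. plt.imshow(maps[0], cmap="hot")) or blend it over the input.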

Note that you will get very different results when using logits or softmax probabilities! The softmax couples all classes and saturates for confident predictions, so its gradients are typically much smaller and distributed differently than those of the raw logits.

Further notes:

Saliency maps can be especially interesting for wrongly classified inputs. Here, you could compute saliency maps for either the correct class, or the predicted one. How do the two differ?
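For example, reusing the hypothetical saliency_maps helper sketched above, a comparison for a single misclassified image could look like the snippet below; the ground-truth label is of course something you have to supply yourself.

```python
probs = model(images)                       # images: a batch containing one misclassified example
predicted_class = int(tf.argmax(probs[0]))  # class the model actually predicts
true_class = 3                              # assumed ground-truth label for that example
map_predicted = saliency_maps(model, images, predicted_class)
map_true = saliency_maps(model, images, true_class)
```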

Activation Maximization

Extend the code from the previous part to create an optimal input for a particular class.

Note: You need to take care that the optimized inputs actually stay valid images throughout the process, e.g. by clipping to [0, 1] after each gradient step, or by using a sigmoid function to produce the images.
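A possible gradient-ascent sketch, assuming the model takes float images in [0, 1] and returns class scores; the step size, number of steps, and image shape are arbitrary values you will want to adapt:

```python
import tensorflow as tf

def maximize_class(model, class_index, steps=200, step_size=0.01,
                   start_image=None, shape=(1, 224, 224, 3)):
    """Gradient ascent on the input to maximize one class score."""
    if start_image is None:
        start_image = tf.random.uniform(shape)   # initialize from random noise
    image = tf.Variable(start_image)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            score = model(image)[:, class_index]  # logit or (log) softmax of the target class
        grads = tape.gradient(score, image)
        image.assign_add(step_size * grads)       # ascent step: increase the class score
        image.assign(tf.clip_by_value(image, 0.0, 1.0))  # keep the optimized input a valid image
    return image.numpy()
```

Initializing start_image with a real example instead of noise lets you directly compare the two settings asked about below.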

Does the resulting input look natural? How do the inputs change when applying many steps of optimization? How do the optimal inputs differ when initializing the optimization with random noise instead of real examples? Can you see differences between optimizing a logit or a (log) softmax probability?

Bonus: Apply regularization strategies to make the optimal input more natural-looking. You can also optimize for hidden features of the network (instead of outputs) assuming you can “extract” them from the model you built. Distill has an article that can provide some inspiration.
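As one possible example of such a regularizer, you could subtract a total-variation penalty from the objective inside the gradient-ascent loop sketched above; the penalty weight is an assumption that needs tuning:

```python
# Inside the gradient-ascent loop, replace the plain score with a regularized objective.
tv_weight = 1e-4                                   # assumed penalty weight, tune as needed
with tf.GradientTape() as tape:
    score = model(image)[:, class_index]
    objective = tf.reduce_sum(score) - tv_weight * tf.reduce_sum(tf.image.total_variation(image))
grads = tape.gradient(objective, image)
```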

Bonus: Unmasking “Clever Hans” Predictors

Creating saliency maps, and then not doing anything with them, might seem slightly pointless. If you have time, you can try the following experiment:

Submission

Include code for creating saliency maps as well as for activation maximization. Show some comparisons/experiments for various inputs, e.g. concerning different ways of presenting saliency maps (smoothed or not, thresholded or not, etc.). Also show examples of activation maximization for various classes (e.g. CIFAR cars vs. dogs vs. ships, etc.). If you applied regularization techniques, also document how they influenced the results!