Assignment 1: First Steps

Deadline: October 17th, 9am

In this assignment, you will implement and train your first deep model on a well-known image classification task. You will also familiarize yourself with the Tensorflow library we will be using for this course.

General Assignment Notes

You will need to do some extra reading for these assignments. Sorry, but there is no way around this.
Assignments are posed in a very open-ended manner. Often you only “need” to complete a rather basic task. However, you will get far more out of this class by going beyond these basics. Some suggestions for further explorations are usually contained in the assignment description. Ideally though, you should really see what interests you and explore those directions further. Share and discuss any interesting findings on Mattermost!
Please don’t stop reading at “Bonus”; see above. Don’t be intimidated by all the text; pick something that interests you/lies within your capabilities and spend some time on it.

Setting Up

Install Python (3.x is recommended) if you haven’t done so, and install Tensorflow. This should be as simple as writing

pip install tensorflow

in your console. This will install the CPU version; for now, there is no need to bother with the GPU version since you will usually use your own machine only for development and small tests.

Download the raw MNIST files from Yann LeCun’s website. MNIST is a collection of handwritten digits and a popular (albeit by now trivialized) benchmark for image classification models. Download our conversion script. Unpack the data, put the script in the same folder and run it as

python conversions.py -c -n

This will create both CSV tables and numpy arrays of the data (you don’t need the CSV files, but the arrays are created from them). If you want to, you can also append the flag -p to create folders with the actual pictures to get an impression of the data (this will take a bit longer).
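Once the arrays exist, a quick sanity check is worthwhile. The following is a minimal sketch, assuming the script saved images and labels as .npy files readable with np.load and that pixel values lie in the range 0 to 255. The function name load_and_check and its path arguments are hypothetical; substitute whatever file names conversions.py actually produced on your machine.

```python
import numpy as np

def load_and_check(images_path, labels_path):
    """Load image/label arrays (hypothetical paths) and run basic checks."""
    images = np.load(images_path)
    labels = np.load(labels_path).reshape(-1)
    # Every image needs exactly one label.
    assert images.shape[0] == labels.shape[0], "image/label count mismatch"
    # Flatten each image to a vector and scale pixels from [0, 255] to [0, 1]
    # (assumes the raw 8-bit pixel range was kept by the conversion).
    images = images.reshape(images.shape[0], -1).astype(np.float32) / 255.0
    return images, labels.astype(np.int64)
```

Printing images.shape and a few labels after loading is usually enough to spot a mix-up between training and test files.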

Note: For some reason, the MNIST file names seem to differ slightly between operating systems. You might need to adjust conversions.py accordingly if you get some “file not found” error.

Google Colab

Google Colab is a platform to facilitate teaching of machine learning/deep learning. There are tutorials available on-site. Essentially, it is a Jupyter notebook environment with GPU-supported Tensorflow available.

If you want to, you can develop your assignments within this environment. See below for some notes. However, we ask that all of you hand in your assignments via Colab. To hand in your assignment, paste your code/write-up into a notebook, create a link via the “SHARE” function and send it via email to jens.johannsmeier@ovgu.de. The subject line should be “IDL 2018 Assignment X”, where X is the assignment number, and the mail should include your group members’ names and matriculation numbers. Only one group member needs to submit, but all members need to be able to present by themselves in the exercise sessions. Notebooks support Markdown, so you can also write some text about what your code does, your observations etc.

Running code on Colab should be fairly straightforward; there are tutorials available in case you are not familiar with notebooks. There are just some caveats:

You will need to get external code (like datasets.py) in there somehow. One option would be to simply copy and paste the code into the notebook so that you have it locally available. Another would be to run a cell with from google.colab import files; files.upload() and choose the corresponding file; this loads it “into the runtime” so that you can import datasets.py. Unfortunately, you will need to redo this every time the runtime is restarted.
You will need to make the data available as well. Since the above method results in temporary files, the best option seems to be to upload them to Google Drive and use from google.colab import drive; drive.mount('/content/drive'). You might need to “authenticate” which can be a bit fiddly. After you succeed, you have your drive files available like a “normal” file system. If you find better ways to do this (or the above point), please share them with the class!
The “persistent” nature of notebooks can be at odds with the graph execution model of Tensorflow. If you run into problems, use tf.reset_default_graph() before re-running code that is related to building the graph. This should fix most problems.
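The Drive-based approach from the second caveat may also help with the first one: if datasets.py lives in a Drive folder, that folder can be added to Python's import path after mounting. This is only a sketch. The helper name mount_drive_and_add_to_path and the code_dir argument are hypothetical, and the exact location of your files below the mount point depends on your Drive layout.

```python
import sys

def mount_drive_and_add_to_path(code_dir, mount_point="/content/drive"):
    """Mount Google Drive on Colab and make code_dir importable.

    Returns True on Colab and False elsewhere (where nothing is changed).
    """
    try:
        # This module is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return False
    drive.mount(mount_point)  # may ask you to authenticate
    if code_dir not in sys.path:
        sys.path.append(code_dir)  # lets "import datasets" find the file
    return True
```

After a runtime restart you would still need to re-run this cell, but not re-upload anything.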

Tensorflow Basics

NOTE: The Tensorflow docs went through significant changes recently. In particular, most introductory articles were changed from using low-level interfaces to high-level ones. We believe it’s better to start with low-level interfaces that force you to program every step of building/training a model yourself. This way, you actually need to understand what is happening in the code. High-level interfaces do a lot of “magic” under the hood. We will proceed to these interfaces after you learn the basics. This is why some of the links below lead to old versions of the TF docs that still have the low-level tutorials. Unfortunately, it seems that the ability to access old tutorials straightforwardly over the website has been removed, forcing us to link you to Github instead. Note that these tutorials sometimes have broken formatting (formulas especially). Sorry!

Get started with Tensorflow. There are many tutorials on diverse topics on the website, as well as an API documentation.

Tensorflow Concepts: You need to understand tensors and variables, what a computational graph is, and how to train a model. You may ignore the section on tf.estimator for now.
Basic MNIST Tutorial: A logistic regression “walkthrough” both in terms of concepts and code. You will of course be tempted to just copy this code; make sure you understand what each line does. Note that Tensorflow comes with an MNIST dataset already, but we recommend that you download and process the data yourself (see above) and use this simple dataset class. This way, you get used to processing/reading your own datasets.
The Programmer’s Guide has more in-depth articles on many Tensorflow concepts. Right now you could read the ones on Tensors and Variables and maybe the one on Graphs and Sessions.

Play around with the example code snippets. Change them around and see if you can predict what’s going to happen. Make sure you understand what you’re doing.

Building A Deep Model

If you followed the tutorial linked above, you have already built a linear classification model (softmax regression). Next, turn this into a deep model by adding a hidden layer between inputs and outputs. There you go! You have created a Multilayer Perceptron. Hint: Initializing variables to 0 will not work for multilayer perceptrons. You need to initialize values randomly instead (e.g. random_uniform between -0.1 and 0.1). Why do you think this is the case?
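To make the shapes involved concrete, here is a minimal sketch of the forward pass of such a model, written in plain NumPy rather than Tensorflow. The hidden layer size of 100, the ReLU activation and the function names init_mlp and mlp_forward are arbitrary illustrative choices, not requirements of the assignment; your Tensorflow version will differ in the details.

```python
import numpy as np

def init_mlp(n_in=784, n_hidden=100, n_out=10, scale=0.1, seed=0):
    """Random uniform weights in [-scale, scale]; biases start at zero."""
    rng = np.random.RandomState(seed)
    return {
        "W1": rng.uniform(-scale, scale, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.uniform(-scale, scale, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def mlp_forward(params, x):
    """Map a batch of inputs [batch, n_in] to class probabilities."""
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU
    logits = hidden @ params["W2"] + params["b2"]
    # Subtract the row-wise maximum for numerical stability of the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)
```

Compared to softmax regression, the only additions are the extra weight matrix, the extra bias and the nonlinearity in between.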

Next, you should explore this model: Experiment with different hidden layer sizes, activation functions or weight initializations. See if you can make any observations on how changing these parameters affects the model’s performance. Going to extremes can be very instructive here.

Also, reflect on the Tensorflow interface: If you followed the tutorials as asked, you have been using a very low-level approach to defining models as well as their training and evaluation. Which of these parts do you think should be wrapped in higher-level interfaces? Do you feel like you are forced to provide any redundant information when defining your model? Are there any features you are missing so far?

Tensorflow Execution Model

It is extremely important that you understand the graph-based execution model of Tensorflow. As a rule of thumb, tf.anything builds the graph on a symbolic level, which should only be done once, after which the graph is run repeatedly to produce results. To make sure you understand this, below you will find some “problematic” code snippets. Analyze what is going wrong with these snippets and propose ways to fix them.

Later in the class, we will look at eager execution, which is more akin to how things would work e.g. in NumPy or PyTorch.
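The "build once, run repeatedly" rule of thumb can be illustrated without Tensorflow at all. The toy class below is purely hypothetical and much simpler than a real Tensorflow graph; it only records operations when they are defined and computes values when run is called, which is enough to show why defining operations inside a loop makes the graph grow.

```python
class ToyGraph:
    """Toy stand-in for a computational graph: a list of recorded ops."""

    def __init__(self):
        self.ops = []  # each entry: (kind, arguments)

    def _record(self, kind, args):
        self.ops.append((kind, args))
        return len(self.ops) - 1  # the node id is its position in the list

    def placeholder(self):
        return self._record("placeholder", ())

    def constant(self, value):
        return self._record("constant", (value,))

    def add(self, a, b):
        return self._record("add", (a, b))

    def run(self, node, feed):
        """Compute the value of a node; nothing is added to the graph."""
        kind, args = self.ops[node]
        if kind == "placeholder":
            return feed[node]
        if kind == "constant":
            return args[0]
        return self.run(args[0], feed) + self.run(args[1], feed)

# Build the graph once ...
g = ToyGraph()
x = g.placeholder()
y = g.add(x, g.constant(1))
size_after_build = len(g.ops)

# ... then run it many times: the graph stays the same size.
results = [g.run(y, {x: v}) for v in range(3)]
size_after_runs = len(g.ops)

# Anti-pattern: defining ops inside the loop adds nodes on every iteration.
for v in range(3):
    g.run(g.add(x, g.constant(1)), {x: v})
size_after_bad_loop = len(g.ops)
```

In this toy, the second loop computes the same values as the first but leaves the graph three times as large.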

Bonus

Feel free to explore Tensorflow and MNIST more. For example, this tutorial gives a more complete coverage of topics such as saving a trained model and using it to make predictions.

There are also numerous ways to explore your model some more. For one, you could add more hidden layers and see how this affects the model. You could also try your hand at some basic visualization and model inspection: For example, visualize some of the images your model classifies incorrectly. Can you find out why your model has trouble with these?
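As a starting point for inspecting errors, the sketch below finds the indices of misclassified examples. It assumes you have already fetched the model outputs as a NumPy array of shape [num_examples, num_classes] and the labels as integer class indices; the function name misclassified_indices is hypothetical.

```python
import numpy as np

def misclassified_indices(probabilities, labels):
    """Return indices of examples whose predicted class differs from the label."""
    predictions = np.argmax(probabilities, axis=1)
    return np.flatnonzero(predictions != labels)
```

The returned indices can then be used to select the corresponding images, reshape them to 28x28 and display them with a plotting library of your choice.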

Finally, think about the semantics of your model(s). Can you describe what a specific activation value (in the output as well as in the hidden layer) “means”? Can you do this for the weights that were learned during training? You should start thinking about this for the logistic regression model (having no hidden layers) and then proceed to your MLP. Take note of how much more difficult it becomes to reason about your network as it gets deeper. Visualization is extremely useful here!
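For the logistic regression model, one simple way to look at the learned weights is to reshape them into one image per class. The sketch below assumes a weight matrix of shape [784, 10] (as in softmax regression on flattened 28x28 images) that has been fetched as a NumPy array; the function name weights_as_images is hypothetical.

```python
import numpy as np

def weights_as_images(weights, height=28, width=28):
    """Turn a [height*width, n_classes] weight matrix into n_classes images."""
    n_classes = weights.shape[1]
    # Column k holds the weights for class k; each becomes one image.
    return weights.T.reshape(n_classes, height, width)
```

Each resulting image shows which pixels count as evidence for or against the corresponding class. For the hidden layer of an MLP the same reshaping applies to the first weight matrix, although the resulting pictures are usually harder to interpret.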

You may also have noticed that MNIST isn’t a particularly interesting dataset – even very simple models can reach very high accuracy and there isn’t much “going on” in the images. Luckily, Zalando Research has developed Fashion MNIST. This is a more interesting dataset with the exact same structure as MNIST, meaning you can use it without changing anything about your code. You can download the data and use conversions.py with it, or you can follow this to use it instead of the built-in Tensorflow MNIST. You can attempt pretty much all of the above suggestions for this dataset as well.