Deadline: November 8th, 9am
Note: This section is not necessary for submission, but it is highly recommended that you read it, as it will significantly speed up your models.
So far, we have been using so-called “eager execution” exclusively: Commands are
run as they are defined, i.e. writing y = tf.matmul(X, w)
actually executes
the matrix multiplication.
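For illustration, here is a tiny eager-mode snippet (the shapes are arbitrary, chosen only for demonstration):

```python
import tensorflow as tf

# In eager mode (the TF 2.x default), this line runs immediately and
# returns a concrete tensor with actual values.
X = tf.random.normal([4, 3])
w = tf.random.normal([3, 2])
y = tf.matmul(X, w)

print(y.shape)    # (4, 2)
print(y.numpy())  # the values are already available, no session needed
```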
In TensorFlow 1.x, things used to be different: Lines like the above would only define the computation graph but not do any actual computation. That was done later in dedicated “sessions” that execute the graph. Later, eager execution was added as an alternative way of writing programs and is now the default, mainly because it is much more intuitive and allows for a more natural workflow when designing and testing models.
Graph execution has one big advantage: It is very efficient because entire models (or even training loops) can be executed in low-level C/CUDA code without ever going “back up” to Python (which is slow). As such, TF 2.0 still retains the possibility to run stuff in graph mode if you so wish – let’s have a look!
As expected, there is a tutorial on the TF website, as well as this one which goes into extreme depth on all the subtleties. The basic gist is:
- Write the computations you want to run as a graph into a function and decorate it with @tf.function to “activate” graph execution for this function.
- If you use Python side effects such as print, these will not be traced, so the statement will only be called during the tracing run itself. If you want to print things like tensor values, use tf.print instead. Basically, traced TF functions only do “tensor stuff”, not general “Python stuff”.

Go back to some of your previous models and sprinkle some tf.function annotations in there. You might need to refactor slightly – you need to actually wrap things into a function! Ideally, wrap a full training step (forward pass, loss, and parameter update) into a single tf.function. If you can get this to work on one of your previous models and actually get a speedup, you get a cookie. :)
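Here is a rough sketch of what such a refactored training step could look like; the model, loss, and optimizer are just stand-ins for whatever you used in your own code:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

@tf.function  # the whole step is traced once and then runs as a graph
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    tf.print("loss:", loss)  # tf.print works inside the graph; print would only run during tracing
    return loss

# Called like any other function, e.g. inside your training loop:
# loss = train_step(x_batch, y_batch)
```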
Note: In recent TF versions, another way to speed things up was added: You can request so-called “just-in-time (JIT) compilation” for your graphs. Simply use @tf.function(jit_compile=True).
This will increase compilation time, but further reduce execution time as well as memory usage!
However, there are some edge cases where JIT compilation doesn’t work, so if you ever
run into strange errors, try turning it off.
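For reference, the decorator with the flag looks like this (the function itself is just a stand-in):

```python
import tensorflow as tf

# Same decorator as before, just with JIT (XLA) compilation enabled.
@tf.function(jit_compile=True)
def squared_error(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

# If this produces strange errors for your model, drop jit_compile=True.
```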
Previously, we saw how to build neural networks in a purely sequential manner – each layer receives one input and produces one output that serves as input to the next layer. There are many architectures that do not follow this simple scheme. You might ask yourself how this can be done in Keras. One answer is via the so-called functional API. There is an in-depth guide here. Reading just the intro should be enough for a basic grasp on how to use it, but of course you can read more if you wish.
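To give a rough idea of the style (the layer sizes here are arbitrary): you create an explicit input, call layers on tensors, and build the model from the input and output tensors.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(10)(x)

# Because layers are called on tensors, you are free to branch, merge,
# or reuse intermediate results, which the Sequential API cannot do.
model = tf.keras.Model(inputs=inputs, outputs=outputs)
```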
Next, use the functional API to implement a ResNet. This is an incredibly important architecture; residual connections are part of pretty much every state-of-the-art model in any domain.
You do not need to follow the exact same architecture from the paper, in fact you will probably want to make it smaller for efficiency reasons. Just make sure you have one or more “residual blocks” with multiple layers each. You can also leave out batch normalization (this will be treated later in the class) as well as “bottleneck layers” (1x1 convolutions) if you want. Still, it can be a good exercise to see how best to structure your code to easily build more complex and arbitrarily deep models. Let’s say, how would you build a network with 20-100 layers with as little code duplication as possible?
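One way to keep the code short is a helper function that builds a single residual block and is called in a loop; the filter counts and number of blocks below are arbitrary, and batch normalization and bottlenecks are left out as allowed above:

```python
import tensorflow as tf

def residual_block(x, filters):
    """Two convolutions with a skip connection around them."""
    skip = x
    x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.Conv2D(filters, 3, padding="same")(x)
    x = tf.keras.layers.Add()([x, skip])  # the residual connection
    return tf.keras.layers.ReLU()(x)

inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
for _ in range(10):  # stack as many blocks as you like
    x = residual_block(x, 32)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10)(x)

resnet = tf.keras.Model(inputs, outputs)
```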
Bonus: Can you implement ResNet with the Sequential API? You might want to look at how to implement custom layers (shorter version here)…
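A sketch of how that could look: a custom layer wraps the two convolutions and the skip connection, so the model itself stays a plain stack of layers. The channel count is kept fixed so the addition works; names and sizes here are made up.

```python
import tensorflow as tf

class ResidualBlock(tf.keras.layers.Layer):
    """Custom layer: two convolutions plus a skip connection."""

    def __init__(self, filters, **kwargs):
        super().__init__(**kwargs)
        self.conv1 = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")
        self.conv2 = tf.keras.layers.Conv2D(filters, 3, padding="same")

    def call(self, inputs):
        x = self.conv1(inputs)
        x = self.conv2(x)
        return tf.nn.relu(x + inputs)  # skip connection; assumes matching shapes

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    ResidualBlock(32),
    ResidualBlock(32),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])
```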
Also check how much of a difference tf.function makes for your ResNet’s training speed. You can also do this for non-ResNet models. How does the impact depend on the size of the models?

This part is just here for completeness/reference, to show some additional TensorBoard functionalities. Check it out if you want.
You can display the computation graphs TensorFlow uses internally in TensorBoard. This can be useful for debugging purposes as well as to get an impression of what is going on “under the hood” in your models. More importantly, this can be combined with profiling, which lets you see how much time/memory specific parts of your model take.
To look at computation graphs, you need to trace computations explicitly.
See the last part of this guide
for how to trace tf.function
-annotated computations. Note: It seems like you have to perform the trace the first time the function is called (e.g. on the first training step).
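Roughly, the pattern looks like this; the log directory and the traced function are placeholders:

```python
import tensorflow as tf

writer = tf.summary.create_file_writer("logs/graph_demo")

@tf.function
def my_step(x):
    return tf.reduce_sum(tf.square(x))

# Turn tracing on *before* the first call; the first call triggers the trace.
tf.summary.trace_on(graph=True)
my_step(tf.random.normal([8, 8]))

# Export the recorded graph so TensorBoard can display it
# (then run `tensorboard --logdir logs`).
with writer.as_default():
    tf.summary.trace_export(name="my_step_trace", step=0)
```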