Assignment 4: Deep Belief Networks

Discussion: May 18th

Having successfully implemented RBMs, we will continue our homage to Deep Learning history by implementing a Deep Belief Network.

Training DBNs

This is not too difficult if you have a working RBM implementation: you simply iterate the RBM training procedure over the layers (see the Deep Learning book, chapter 20.3). Start by training an RBM as usual (since we are still working with binary variables, you might want to stick with MNIST). Once that layer has finished training, add a second hidden layer and train it as an RBM, treating the first hidden layer as “visible” in this case. When that is finished, add a third hidden layer, and so on; a short sketch of this loop is given below. Try going up to 1000 layers.
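As a rough reference, here is a minimal sketch of that layer-wise loop. The RBM class and its train / sample_hidden methods are placeholders for whatever interface your own implementation from the previous assignment exposes, so adapt the names accordingly:

    def train_dbn(data, layer_sizes, RBM):
        """Greedy layer-wise DBN training: each new RBM is trained on the
        hidden representation produced by the stack below it."""
        rbms = []
        layer_input = data  # binary training data, e.g. binarized MNIST digits
        for n_hidden in layer_sizes:
            rbm = RBM(n_visible=layer_input.shape[1], n_hidden=n_hidden)
            rbm.train(layer_input)  # ordinary RBM training (e.g. CD-1)
            rbms.append(rbm)
            # The hidden activations become the "visible" data for the next layer.
            layer_input = rbm.sample_hidden(layer_input)
        return rbms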

As always, there are a few caveats.

When all layers have finished training, you can try sampling from the DBN to see the results. Again, see chapter 20.3 for the procedure: run Gibbs sampling on the top two layers (which together form an RBM) and then perform a single ancestral sampling pass down to the visible units.
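A possible sketch of that sampling procedure, again with placeholder method names (gibbs_step standing for one full alternating Gibbs update in the top RBM, sample_visible for one top-down sampling step of a layer):

    import numpy as np

    def sample_dbn(rbms, n_gibbs=1000):
        """Draw one sample from a trained DBN: Gibbs sampling in the top RBM,
        then a single ancestral pass down to the visible units."""
        top = rbms[-1]
        # Start the top-level chain from a random binary hidden state.
        h = (np.random.rand(1, top.n_hidden) > 0.5).astype(float)
        for _ in range(n_gibbs):
            v, h = top.gibbs_step(h)  # one alternating Gibbs update in the top RBM
        sample = v  # a state of the second-highest hidden layer
        # Ancestral pass: every lower RBM maps its hidden state to a visible state.
        for rbm in reversed(rbms[:-1]):
            sample = rbm.sample_visible(sample)
        return sample  # a sample in the original pixel space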

Unsupervised Pretraining

Using generative models for sampling is nice and all, but there are many more possible applications. A classical use case is classification when little labeled data is available but unlabeled data is plentiful. We can simulate such a situation by simply taking a subset of our labeled data and pretending that the rest isn’t there (or rather, doesn’t have labels); a minimal sketch of such a split is shown below.
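One way such a split could be set up (a sketch only; images and labels are assumed to be NumPy arrays, and the budget of 1000 labeled examples is just an example value):

    import numpy as np

    def split_labeled_unlabeled(images, labels, n_labeled=1000, seed=0):
        """Pretend that only a small random subset of the data has labels."""
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(images))
        labeled_idx, unlabeled_idx = idx[:n_labeled], idx[n_labeled:]
        # The labeled portion is what a classifier may see labels for;
        # the rest is treated as unlabeled data for pretraining.
        return (images[labeled_idx], labels[labeled_idx]), images[unlabeled_idx]

With such a split in place, try this: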