Discussion: May 11th
In this assignment, we will be implementing a Binary RBM. This requires some low-level programming rather than just sticking a bunch of layers together.
Start off by implementing algorithm 18.1 from the Deep Learning book.
Begin with the gibbs_update step. Recall that RBMs allow for efficient Gibbs sampling by first updating all hidden units at once, and then all visible ones. tfp.distributions.Bernoulli should be helpful; use the conditional distributions from chapter 20.2. Docs can be found here. Do not use other submodules such as MCMC-related functions!
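For concreteness, here is a minimal sketch of what such a gibbs_update could look like. The parameter names and shapes are illustrative only: W is assumed to be the weight matrix of shape (n_visible, n_hidden), and b_v, b_h the visible and hidden biases.

```python
import tensorflow as tf
import tensorflow_probability as tfp

def gibbs_update(v, W, b_v, b_h):
    # One block-Gibbs step: sample all hidden units given v, then all visible
    # units given the new hidden state, using the chapter 20.2 conditionals.
    # p(h_j = 1 | v) = sigmoid(v^T W[:, j] + b_h[j])  (logits = pre-sigmoid activations)
    h = tfp.distributions.Bernoulli(
        logits=tf.matmul(v, W) + b_h, dtype=v.dtype).sample()
    # p(v_i = 1 | h) = sigmoid(W[i, :] h + b_v[i])
    v_new = tfp.distributions.Bernoulli(
        logits=tf.matmul(h, W, transpose_b=True) + b_v, dtype=v.dtype).sample()
    return v_new, h
```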
Next, use GradientTape to compute the gradient update by treating "negative phase minus positive phase" as a loss function to be minimized. Note that the "ideal" value for this loss is actually 0 (data and model distributions are identical), even though the function can in principle take on any value. If your loss keeps decreasing toward arbitrarily large negative values, there is likely something wrong with your training (although it is normal for negative values to appear in the beginning).
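One way to set this up (a sketch under stated assumptions, not the required implementation) is to write the binary-RBM free energy and use "free energy of the data minus free energy of the model samples" as the surrogate loss; its gradient is the negative phase minus the positive phase. The negative-phase chain is re-initialized randomly and burned in on every step, as in the naive procedure. gibbs_update refers to the sketch above; W, b_v, b_h are assumed to be tf.Variables, and burn_in is an illustrative parameter.

```python
def free_energy(v, W, b_v, b_h):
    # F(v) = -v . b_v - sum_j softplus(v^T W[:, j] + b_h[j]) for a binary RBM.
    return (-tf.reduce_sum(v * b_v, axis=-1)
            - tf.reduce_sum(tf.math.softplus(tf.matmul(v, W) + b_h), axis=-1))

def train_step(v_data, W, b_v, b_h, optimizer, burn_in=20):
    # Negative phase: re-initialize the chain randomly and burn it in.
    v_model = tfp.distributions.Bernoulli(
        probs=0.5 * tf.ones_like(v_data), dtype=v_data.dtype).sample()
    for _ in range(burn_in):
        v_model, _ = gibbs_update(v_model, W, b_v, b_h)
    v_model = tf.stop_gradient(v_model)  # the samples themselves are constants

    with tf.GradientTape() as tape:
        # Surrogate loss: minimizing it lowers F on the data and raises F on
        # the model samples; its ideal value is 0 when the two distributions match.
        loss = (tf.reduce_mean(free_energy(v_data, W, b_v, b_h))
                - tf.reduce_mean(free_energy(v_model, W, b_v, b_h)))
    grads = tape.gradient(loss, [W, b_v, b_h])
    optimizer.apply_gradients(zip(grads, [W, b_v, b_h]))
    return loss
```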
Once you have the basic algorithm going, you might want to test it first. Since we are working with binary RBMs, MNIST seems like the best option here. You may "binarize" the data by rounding all values to 0 or 1; however, since MNIST is already almost binary, this will likely not make a large difference. Experiment with different numbers of hidden units and burn-in steps, and generate some samples from the trained models for subjective evaluation.
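As a rough illustration of this testing step (the variable names and the chain length are assumptions, and W, b_v, b_h refer to the trained variables from the sketches above):

```python
import numpy as np

# Load MNIST and binarize it by rounding pixel intensities to 0 or 1.
(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = np.round(x_train.reshape(-1, 784).astype("float32") / 255.0)

# After training, draw samples by running a fresh Gibbs chain for a while
# and inspecting the final visible states (1000 steps is just a starting guess).
v = tfp.distributions.Bernoulli(
    probs=0.5 * tf.ones((16, 784)), dtype=tf.float32).sample()
for _ in range(1000):
    v, _ = gibbs_update(v, W, b_v, b_h)
samples = tf.reshape(v, (16, 28, 28))  # e.g. display with matplotlib's imshow
```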
Next, you should improve on the basic procedure (chapter 18 of the book discusses contrastive divergence and persistent/stochastic-maximum-likelihood chains, which are natural candidates; a sketch of the persistent variant follows below).
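As one possible illustration (not necessarily the intended improvement), the persistent-chain variant only changes the earlier training-step sketch by keeping the negative-phase samples alive across updates instead of re-initializing and burning them in each time; free_energy and gibbs_update are the helpers sketched above, and batch_size is an assumed constant.

```python
# Persistent negative phase: initialize the chain once and reuse it.
persistent_v = tfp.distributions.Bernoulli(
    probs=0.5 * tf.ones((batch_size, 784)), dtype=tf.float32).sample()

def pcd_step(v_data, persistent_v, W, b_v, b_h, optimizer, k=1):
    v_model = persistent_v
    for _ in range(k):  # only a few Gibbs steps per update are needed now
        v_model, _ = gibbs_update(v_model, W, b_v, b_h)
    v_model = tf.stop_gradient(v_model)
    with tf.GradientTape() as tape:
        loss = (tf.reduce_mean(free_energy(v_data, W, b_v, b_h))
                - tf.reduce_mean(free_energy(v_model, W, b_v, b_h)))
    grads = tape.gradient(loss, [W, b_v, b_h])
    optimizer.apply_gradients(zip(grads, [W, b_v, b_h]))
    return loss, v_model  # feed v_model back in as the next persistent state
```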
Test your algorithms once again and compare the results (as well as the speed at which you achieve them) to the basic algorithm.
Feel free to try other training methods such as pseudolikelihood or score matching.
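If you do experiment with pseudolikelihood, one common stochastic estimator for binary models flips a single randomly chosen visible unit per example; the sketch below reuses the free_energy helper from above and is only meant as a starting point, with illustrative names throughout.

```python
def pseudo_log_likelihood(v, W, b_v, b_h):
    # Stochastic estimate of the log pseudo-likelihood: for each example pick
    # one visible unit i at random, flip it, and use
    #   log P(v_i | v_{-i}) = log sigmoid(F(v_flipped) - F(v)),
    # scaled by the number of visible units to approximate the sum over all i.
    n_visible = v.shape[-1]
    idx = tf.random.uniform(tf.shape(v)[:1], maxval=n_visible, dtype=tf.int32)
    flip = tf.one_hot(idx, n_visible, dtype=v.dtype)
    v_flipped = v + flip * (1.0 - 2.0 * v)  # flips exactly the chosen bit
    delta = free_energy(v_flipped, W, b_v, b_h) - free_energy(v, W, b_v, b_h)
    return n_visible * tf.math.log_sigmoid(delta)
```

Maximizing the mean of this quantity (i.e., minimizing its negative inside a GradientTape) gives a pseudolikelihood-style training signal, and the same quantity can double as a rough evaluation metric when comparing models.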