Assignment 10: Conditional Generation & Guided Diffusion

Discussion: June 27th
Deadline: June 26th, 20:00

For this assignment, you will need a working diffusion model implementation. Either use your own implementation from a previous assignment, or use the code provided on Gitlab.

The latter is recommended as it is self-contained and extendable.

The Gitlab code should work as-is! If it doesn’t, please ask, as it is most likely not your fault! Test it before changing anything!

Conditional Generative Models

A conditional model is a model of p(x|c) instead of just p(x), where c is some conditional information. A very straightforward kind is a class-conditional model, where you can supply a class as “input” and get a generation of that class. Let’s start by implementing one of those.

Train a conditional model and make sure that it’s generating appropriate samples given class inputs!
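One common way to feed the class into the denoiser (this is a design choice, not something the assignment prescribes) is to embed the label and add it to the timestep embedding before injecting it into the network. A minimal numpy sketch of that idea, with hypothetical embedding tables standing in for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, num_timesteps, embed_dim = 10, 1000, 16

# hypothetical embedding tables; in a real model these are learned parameters
class_embed = rng.normal(size=(num_classes, embed_dim))
time_embed = rng.normal(size=(num_timesteps, embed_dim))

def conditioning_vector(t, c):
    """Combine timestep and class information into a single embedding."""
    return time_embed[t] + class_embed[c]

t = np.array([5, 42, 999])     # diffusion timesteps for a batch of 3
c = np.array([0, 3, 7])        # class labels for the same batch
h = conditioning_vector(t, c)  # shape (3, 16), injected into the denoiser
```

Other injection schemes (concatenation, FiLM-style scaling, cross-attention) work too; adding to the timestep embedding is just a simple option that reuses machinery the model already has.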

Guided Diffusion

Guided Diffusion has been shown to provide a quality-variety tradeoff that can be tuned as desired. There are two kinds of guidance: classifier guidance and classifier-free guidance. The latter is more popular and easier to implement. The score changes from gradient(log(p(x|c))) to (1 + w) * gradient(log(p(x|c))) - w * gradient(log(p(x))). Diffusion models don’t implement a score directly, but the common implementation is equivalent. So in the Langevin sampler, just change model(x, c) to (1 + w) * model(x, c) - w * model(x). w is a hyperparameter. Some special cases are:

- w = 0: the plain conditional model (no guidance).
- w = -1: the purely unconditional model.
- w > 0: the prediction is pushed away from the unconditional one, trading variety for quality.
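The guided combination itself is a one-liner; here is a small numpy sketch (with made-up model outputs) that also makes the special cases explicit:

```python
import numpy as np

def guided_output(eps_cond, eps_uncond, w):
    """Classifier-free guidance: (1 + w) * model(x, c) - w * model(x)."""
    return (1 + w) * eps_cond - w * eps_uncond

# stand-in model outputs for illustration
eps_c = np.array([1.0, 2.0])  # conditional prediction
eps_u = np.array([0.5, 0.5])  # unconditional prediction

guided_output(eps_c, eps_u, 0.0)   # w = 0: plain conditional output
guided_output(eps_c, eps_u, -1.0)  # w = -1: purely unconditional output
guided_output(eps_c, eps_u, 2.0)   # w > 0: pushed away from the unconditional prediction
```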

We just have one problem: we trained a conditional model, but now we need an unconditional model as well. Luckily, we can get both at the same time: during training, randomly drop the conditioning for a fraction of the examples (e.g. replace it with zeros), so the same network learns both the conditional and the unconditional distribution.

A simple way to achieve this dropout could look like this (it’s basically a binary mask):

drop_prob = 0.1  # for example
# choose dtype to match the conditioning: tf.float32 for one-hot, tf.int32 for indices
drop_dist = tfp.distributions.Bernoulli(probs=1 - drop_prob, dtype=tf.float32)

conditioning = ...  # e.g. class labels as one_hot or indices
drop_sample = drop_dist.sample(tf.shape(conditioning))
dropped_conditioning = drop_sample * conditioning
# note: with index labels, a dropped label becomes 0 and is indistinguishable from
# class 0; one-hot labels (where dropping yields an all-zero vector) avoid this

Otherwise, training doesn’t change from the normal conditional model.

Finally, as mentioned further above, adapt the Langevin sampler to provide both conditional and unconditional model outputs (just run the model twice, once with conditioning, once with no/dropped conditioning). Experiment with different values of w. Common values for good samples are in the region of 1-5 (again, depending on the dataset, model, etc.). Do you find that quality increases compared to w = 0 (no guidance)? What happens as you increase w further, say to 10 or above?
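A single guided sampler step can be sketched as follows. This is a toy numpy version under stated assumptions: the stand-in `model` below is made up (a real run would use your trained network), and `c=None` is assumed to mean "unconditional / dropped conditioning":

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, c=None):
    # stand-in denoiser for illustration only; in practice this is the
    # trained network, called with dropped conditioning when c is None
    return -x if c is None else -(x - c)

def guided_langevin_step(x, c, w, step_size=0.01):
    """One Langevin update using the classifier-free-guided score."""
    score = (1 + w) * model(x, c) - w * model(x)  # run the model twice
    noise = rng.normal(size=x.shape)
    return x + step_size * score + np.sqrt(2 * step_size) * noise

x = rng.normal(size=(4, 2))  # batch of 4 two-dimensional samples
c = np.ones((4, 2))          # some conditioning signal
for _ in range(100):
    x = guided_langevin_step(x, c, w=2.0)
```

The only change from an unguided sampler is the `score` line; everything else (step size schedule, noise, number of steps) stays as in your existing implementation.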