This is an optional quiz to test your understanding of the material from Weeks 8 to 10.

Write out the steps in the REINFORCE algorithm, making sure to define any symbols you use.
In the context of Deep Q-Learning, explain the following:
1. Experience Replay
2. Double Q-Learning
What is the energy function for each of these architectures?
1. Boltzmann Machine
2. Restricted Boltzmann Machine
Remember to define any variables you use.
The Variational Auto-Encoder is trained to maximize
E_{z ∼ q_φ(z | x⁽ⁱ⁾)} [log p_θ(x⁽ⁱ⁾ | z)] − D_KL(q_φ(z | x⁽ⁱ⁾) || p(z))
Briefly state what each of these two terms aims to achieve.
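
For concreteness, the objective above could be evaluated numerically for a single data point roughly as in the sketch below. This is only an illustration under assumptions that the question itself does not state: q_φ(z | x) is taken to be a diagonal Gaussian with mean mu and log-variance log_var, p(z) is a standard normal, and p_θ(x | z) is a Bernoulli distribution whose means are returned by a hypothetical decode function. The names vae_objective, decode, mu and log_var are made up for this example.

```python
import math
import random


def vae_objective(x, mu, log_var, decode, n_samples=1, seed=0):
    """Estimate the VAE objective for one data point x (a list of 0/1 values).

    Assumptions (not part of the quiz): q(z|x) = N(mu, diag(exp(log_var))),
    p(z) = N(0, I), and p(x|z) is Bernoulli with means decode(z).
    Returns (objective, first_term, second_term).
    """
    rng = random.Random(seed)

    # First term: Monte Carlo estimate of E_{z ~ q}[log p(x | z)],
    # sampling z = mu + sigma * eps with eps ~ N(0, I).
    first = 0.0
    for _ in range(n_samples):
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
             for m, lv in zip(mu, log_var)]
        probs = decode(z)
        for xi, p in zip(x, probs):
            p = min(max(p, 1e-7), 1.0 - 1e-7)  # avoid log(0)
            first += xi * math.log(p) + (1 - xi) * math.log(1.0 - p)
    first /= n_samples

    # Second term: closed-form KL divergence between the diagonal Gaussian
    # q(z|x) and the standard normal prior p(z).
    second = 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                       for m, lv in zip(mu, log_var))

    return first - second, first, second


if __name__ == "__main__":
    # Hypothetical decoder that ignores z and predicts 0.5 for every pixel.
    x = [1, 0, 1]
    print(vae_objective(x, [1.0, 0.0], [0.0, 0.0], lambda z: [0.5] * 3))
```

In a real model, decode would be a neural network with parameters θ, and mu and log_var would be produced from x by an encoder network with parameters φ.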
Generative Adversarial Networks set up a two-player zero-sum game between a Generator G_θ and a Discriminator D_ψ in order to compute
min_θ max_ψ (V(G_θ, D_ψ))
Give the formula for V(G_θ, D_ψ).
In the context of GANs, briefly explain what is meant by mode collapse, and list three different methods for avoiding it.