COMP9444 Neural Networks and Deep Learning

Quiz 7 (Reinforcement Learning)

This is an optional quiz to test your understanding of the material from Week 7.
  1. Explain the difference between the following paradigms, in terms of what is presented to the agent, and what the agent aims to do:

  2. Describe the elements (sets and functions) that are needed to give a formal description of a reinforcement learning environment. What is the difference between a deterministic environment and a stochastic environment?

  3. Name three different models of optimality in reinforcement learning, and give a formula for calculating each one.

  4. What is the definition of:
    1. the optimal policy
    2. the value function
    3. the Q-function?

  5. Assuming a stochastic environment, a discount factor γ and a learning rate η, write the update equation for
    1. Temporal Difference learning TD(0)
    2. Q-Learning
    Remember to define any symbols you use.
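
For checking an answer to question 5, the sketch below shows one common tabular form of the two update rules. It is not necessarily the notation used in the lectures. The function names, the dictionary-based tables and the sample states are illustrative assumptions, not part of the course material. Here s is the current state, a the action taken, r the reward received, s_next the resulting state, eta the learning rate η and gamma the discount factor γ.

```python
from collections import defaultdict


def td0_update(V, s, r, s_next, eta, gamma):
    """One TD(0) step: V(s) <- V(s) + eta * [r + gamma * V(s_next) - V(s)]."""
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += eta * td_error
    return V[s]


def q_learning_update(Q, s, a, r, s_next, actions, eta, gamma):
    """One Q-learning step:
    Q(s,a) <- Q(s,a) + eta * [r + gamma * max_b Q(s_next,b) - Q(s,a)].
    """
    # Greedy estimate of the value of the next state over the available actions.
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += eta * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Tables default to 0.0 for unseen states and state-action pairs (an assumption).
V = defaultdict(float)
Q = defaultdict(float)

# Hypothetical single transition A -> B with reward 0.5.
V["B"] = 1.0
new_v = td0_update(V, "A", 0.5, "B", eta=0.1, gamma=0.9)

# Hypothetical transition (A, "right") -> B with reward 1.0.
Q[("B", "left")] = 2.0
Q[("B", "right")] = 1.0
new_q = q_learning_update(Q, "A", "right", 1.0, "B", ["left", "right"], eta=0.5, gamma=0.9)
```

Because each update moves the estimate only a fraction η of the way toward the sampled target, repeated updates average over the random outcomes of a stochastic environment rather than overwriting the estimate with a single sample.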