CPSC422 Spring 2007
Assignment 2

Due: 2:00pm, Thursday 1 February 2007.

The aim of this assignment is to learn about reinforcement learning. You are to write code for the simple game at:

http://www.cs.ubc.ca/spider/poole/demos/rl/sGame.html

You are free to use any of the code for that applet. You can also refer to and use any of the code for the reinforcement learning applets, e.g.,

http://www.cs.ubc.ca/spider/poole/demos/rl/q.html

Please read and post to the bulletin board on the course WebCT site for hints on how to modify the controllers. You can do this assignment either by yourself or in a pair.

Question 1

Write a reinforcement-learning controller for the simple game (modifying only SGameController.java) that uses function approximation. You need to:

  1. select a number of reasonable features
  2. implement the function approximation
  3. show how the answer or the learning speed varies as a function of any parameters
  4. evaluate both the policy it finds in the long term and how quickly it learns

If you are working by yourself, then you only need to do one set of features. If you are working in a pair, you need to compare three sets of features.
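
As a starting point for step 2, here is a minimal sketch of one common choice: a linear function approximator trained with a Q-learning-style update. It is not taken from the applet code. The class name LinearQ, its methods, and the convention of passing feature vectors as plain arrays are all hypothetical, and you would still need to compute the features from the game state inside SGameController.java.

```java
/** Hypothetical sketch of Q(s,a) as a weighted sum of features; not the applet's API. */
class LinearQ {
    private final double[] weights; // one weight per feature, initially 0
    private final double alpha;     // step size
    private final double gamma;     // discount factor

    LinearQ(int numFeatures, double alpha, double gamma) {
        this.weights = new double[numFeatures];
        this.alpha = alpha;
        this.gamma = gamma;
    }

    /** Q(s,a) = sum over i of w_i * F_i(s,a). */
    double value(double[] features) {
        double q = 0.0;
        for (int i = 0; i < weights.length; i++) {
            q += weights[i] * features[i];
        }
        return q;
    }

    /**
     * Moves the weights one step toward the target r + gamma * max_a' Q(s',a').
     * nextFeatures holds one feature vector per action in s'; pass an empty
     * array when s' is terminal, so the target is just the reward.
     */
    void update(double[] features, double reward, double[][] nextFeatures) {
        double best = 0.0;
        for (int a = 0; a < nextFeatures.length; a++) {
            double q = value(nextFeatures[a]);
            if (a == 0 || q > best) {
                best = q;
            }
        }
        double delta = reward + gamma * best - value(features);
        for (int i = 0; i < weights.length; i++) {
            weights[i] += alpha * delta * features[i];
        }
    }
}
```

Each weight moves in proportion to its feature value, so features that were zero for the chosen state-action pair are left unchanged by that update. Both alpha and gamma are constructor parameters here so that their effect (step 3) can be measured by rerunning with different values.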

Question 2

In this question, you are to think about the effect of changing the step size alpha in your implementation. Compare the following values for alpha:
  1. 1/n
  2. 10/(9+n)
  3. fixed value of 0.2
  4. fixed value of 0.02
  5. use 0.2 for a while then change to 0.02
You should compare these for the cases:
  1. The values whose expected value you want to estimate are all 10
  2. The values alternate between 2 and 10
  3. The values are 5 for the first 100 steps then are 10 forever
  4. The values are 10 on every 10th step and 0 otherwise
  5. The values are those generated by Q-learning
Write full sentences, and justify your answers with specific examples from running your code.
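
One way to explore the alpha schedules before wiring them into the controller is to apply them to a plain running estimate, v = v + alpha * (x - v). The sketch below does this under two assumptions that are not stated in the assignment: n is taken to be the number of values seen so far (starting at 1), and the point at which alpha switches from 0.2 to 0.02 is a free parameter. The class name AlphaSchedules and all of its members are hypothetical.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch: running estimates under different step-size schedules. */
class AlphaSchedules {
    // Each schedule maps n (assumed: number of values seen so far, from 1) to alpha.
    static final IntToDoubleFunction ONE_OVER_N = n -> 1.0 / n;
    static final IntToDoubleFunction TEN_OVER_NINE_PLUS_N = n -> 10.0 / (9.0 + n);
    static final IntToDoubleFunction FIXED_02 = n -> 0.2;
    static final IntToDoubleFunction FIXED_002 = n -> 0.02;

    /** Uses 0.2 for the first switchAt values, then 0.02 (switchAt is an assumption). */
    static IntToDoubleFunction switching(int switchAt) {
        return n -> n <= switchAt ? 0.2 : 0.02;
    }

    /** Applies v = v + alpha(n) * (x_n - v) over all values, starting from v = 0. */
    static double estimate(double[] values, IntToDoubleFunction alpha) {
        double v = 0.0;
        for (int n = 1; n <= values.length; n++) {
            v += alpha.applyAsDouble(n) * (values[n - 1] - v);
        }
        return v;
    }

    /** Case 3: the values are 5 for the first 100 steps and 10 afterwards. */
    static double[] fiveThenTen(int length) {
        double[] values = new double[length];
        for (int i = 0; i < length; i++) {
            values[i] = i < 100 ? 5.0 : 10.0;
        }
        return values;
    }

    public static void main(String[] args) {
        double[] values = fiveThenTen(1000);
        System.out.println("1/n:      " + estimate(values, ONE_OVER_N));
        System.out.println("10/(9+n): " + estimate(values, TEN_OVER_NINE_PLUS_N));
        System.out.println("0.2:      " + estimate(values, FIXED_02));
        System.out.println("0.02:     " + estimate(values, FIXED_002));
        System.out.println("switch:   " + estimate(values, switching(200)));
    }
}
```

With alpha = 1/n the estimate is exactly the average of all values seen so far, which is a useful baseline when explaining how the other schedules weight recent values. The same estimate method can be fed the other sequences from the list of cases, and printing v at intermediate steps rather than only at the end shows how quickly each schedule adapts.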

Question 3

[Note that this question is worth marks, so don't forget to do it.]
  1. For each question and each part in this assignment, say how long you spent on it. Was this reasonable? What did you learn?
  2. If there was more than one person in your group, say what each person did.
  3. Tell us every person you discussed this with (including the TA and fellow students), and every external source (e.g., web site, research paper, book) you referenced to do this assignment.

David Poole