Wednesday, February 15, 2012

Feedback 15-02-2012

Work

  • Both the environment and agent visualizations of RL Glue now show the best sample and the best regression tree leaf (drawn as a 1D or 2D area given by its ranges) in green
  • Every leaf in the regression tree remembers the best sample it has seen; that sample is picked whenever the greedy (best) action is chosen
  • Implemented memorization of the samples, adjustable through three parameters:
    • Maximum number of samples to store at any time in a leaf
    • Maximum number of samples to pass to best child in case of a split
    • Maximum number of samples to pass to worst child in case of a split
  • I have looked into the Scala code, and the algorithm is now able to converge to the global optimum within approximately 200 samples (as seen in the visualization):
    • A problem I noticed is that for the Six Hump Camel Back function a split was often introduced at the very edge of the state space, leaving a child with only one sample. I introduced a minimum number of samples that each child must have for a split to be allowed.
    • I don't know which significance test was used for the experiments in the paper, but in the Scala code the F-test is commented out and a t-test is used. I therefore also implemented a t-test so that I can compare the two.
    • Furthermore, I noticed that the F-distribution table they were using contained incorrect values. I have no clue how they arrived at those values. I experimented with both value sets and kept both as options: their tables (0.1 and 0.001) and Apache's F-test method, which generates the "correct" values according to the literature for any significance level.
    • The most important change was the adaptive UCT constant. This makes a huge difference. 
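
The split-related points above (a minimum number of samples per child, a t-test as the split criterion, and capped numbers of samples passed to the best and worst child) can be sketched roughly as follows. This is a minimal Python illustration for a 1D leaf, not the actual implementation or the Scala code: the function names, the use of Welch's t statistic, the rule that the "best" child is the one with the higher mean value, and the choice to keep the highest-valued samples are all assumptions made for the example.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two lists of values (each of length >= 2)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = abs(mean(a) - mean(b))
    if se == 0.0:
        # Both groups are constant: no evidence if equal, unbounded if not.
        return 0.0 if diff == 0.0 else math.inf
    return diff / se


def best_split(samples, min_child=2):
    """Pick the 1D split with the largest t statistic.

    samples: list of (x, value) pairs stored in one leaf.
    Returns (threshold, t), or None when no split leaves at least
    min_child samples on both sides.
    """
    min_child = max(2, min_child)  # the variance needs two values per child
    pts = sorted(samples)
    best = None
    for i in range(min_child, len(pts) - min_child + 1):
        if pts[i - 1][0] == pts[i][0]:
            continue  # cannot split between identical positions
        t = welch_t([v for _, v in pts[:i]], [v for _, v in pts[i:]])
        if best is None or t > best[1]:
            best = ((pts[i - 1][0] + pts[i][0]) / 2, t)
    return best


def pass_to_children(samples, threshold, max_best, max_worst):
    """Split the stored samples and cap how many each child inherits.

    Assumption: the "best" child is the one with the higher mean value,
    and each child keeps its highest-valued samples.
    Returns (left_kept, right_kept).
    """
    left = [s for s in samples if s[0] < threshold]
    right = [s for s in samples if s[0] >= threshold]

    def top(group, limit):
        return sorted(group, key=lambda s: s[1], reverse=True)[:limit]

    def avg(group):
        return mean(v for _, v in group) if group else -math.inf

    if avg(left) >= avg(right):
        return top(left, max_best), top(right, max_worst)
    return top(left, max_worst), top(right, max_best)
```

With `min_child=2`, a lone outlier at the edge of the range can no longer be split off on its own, because every candidate split has to leave at least two samples on each side. In practice the resulting t statistic would still be compared against a critical value for the chosen significance level before the split is accepted; that step is left out here.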

Planning

  • Holiday!
  • Make it possible to specify and read a particular properties file (as of now only a default file in the root folder is read)
  • Implement (in Matlab) reading and visualizing the output from experiments (i.e. the generated .csv files; see the Scala code received from Kurt for reference)
  • Run an experiment to compare with the results from the paper.
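
For the properties-file item, the intended behaviour could look roughly like the sketch below: use the file given on the command line if there is one, and fall back to the default file otherwise. It is written in Python purely as an illustration; the default file name `experiment.properties`, the function names and the simple `key=value` parsing are assumptions, not the project's actual code.

```python
from pathlib import Path

# Hypothetical name of the default file in the root folder.
DEFAULT_FILE = "experiment.properties"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' separator
            props[key.strip()] = value.strip()
    return props


def load_properties(argv):
    """Read the file named in argv[1], or the default file if none is given."""
    path = Path(argv[1]) if len(argv) > 1 else Path(DEFAULT_FILE)
    return parse_properties(path.read_text(encoding="utf-8"))
```

Calling `load_properties(sys.argv)` from the entry point would then let each experiment run with its own settings file without touching the default one.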
