Commit Graph

11 Commits

Author SHA1 Message Date
Jan Löwenstrom 740289ee2b add constant for default reward 2020-04-02 14:01:37 +02:00
Jan Löwenstrom eca0d8db4d create Dino Sampling state 2020-03-26 19:22:50 +01:00
Jan Löwenstrom ee1d62842d split Antworld into episodic and continuous task
- add new simple state for the jumping dino, to see if convergence is guaranteed with this state representation
- changed reward structure for ant game
2020-03-15 16:58:53 +01:00
Jan Löwenstrom 4641f50b79 add results for convergence for advanced dino jumping 2020-03-05 13:17:54 +01:00
Jan Löwenstrom e67f40ad65 split DinoWorld between simple and advanced example
# Conflicts:
#	src/main/java/example/JumpingDino.java
2020-03-05 12:06:41 +01:00
Jan Löwenstrom 0e4f52a48e first epsilon decaying method 2020-02-27 15:29:15 +01:00
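The "first epsilon decaying method" commit does not state which schedule was used; a minimal sketch of one common choice, exponential decay toward a floor, might look like this (class and field names are illustrative, not taken from the repository):

```java
// Hypothetical epsilon-decay schedule: multiplicative decay per episode,
// clipped at a minimum exploration rate.
public class EpsilonDecay {
    private final double start;   // initial exploration rate, e.g. 1.0
    private final double min;     // lower bound, e.g. 0.05
    private final double factor;  // multiplicative decay per episode, e.g. 0.99

    public EpsilonDecay(double start, double min, double factor) {
        this.start = start;
        this.min = min;
        this.factor = factor;
    }

    /** Epsilon after the given number of completed episodes. */
    public double epsilonAt(int episode) {
        return Math.max(min, start * Math.pow(factor, episode));
    }

    public static void main(String[] args) {
        EpsilonDecay decay = new EpsilonDecay(1.0, 0.05, 0.99);
        System.out.println(decay.epsilonAt(0));   // 1.0
        System.out.println(decay.epsilonAt(500)); // clipped at the 0.05 floor
    }
}
```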
Jan Löwenstrom cff1a4e531 add isJumping info to dinoState 2020-02-26 17:14:28 +01:00
Jan Löwenstrom 77898f4e5a add TD algorithms and started adapting to continuous tasks
- add Q-Learning and SARSA
- more config variables
2020-02-17 13:56:55 +01:00
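The commit above names the two TD methods that were added. As a sketch of the difference between them, with states and actions simplified to ints rather than the repository's actual API: Q-Learning bootstraps on the best successor action (off-policy), while SARSA bootstraps on the action the policy actually took next (on-policy).

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative TD updates; ALPHA/GAMMA values and the "s:a" key encoding
// are assumptions for this sketch, not the repository's configuration.
public class TdUpdates {
    static final double ALPHA = 0.1;  // learning rate
    static final double GAMMA = 0.9;  // discount factor

    // Q(s,a) stored as "s:a" -> value
    static Map<String, Double> q = new HashMap<>();

    static double get(int s, int a) { return q.getOrDefault(s + ":" + a, 0.0); }
    static void set(int s, int a, double v) { q.put(s + ":" + a, v); }

    // Q-Learning (off-policy): target uses the greedy successor action.
    static void qLearning(int s, int a, double r, int s2, int[] actions) {
        double best = Double.NEGATIVE_INFINITY;
        for (int a2 : actions) best = Math.max(best, get(s2, a2));
        set(s, a, get(s, a) + ALPHA * (r + GAMMA * best - get(s, a)));
    }

    // SARSA (on-policy): target uses the action actually taken next.
    static void sarsa(int s, int a, double r, int s2, int a2) {
        set(s, a, get(s, a) + ALPHA * (r + GAMMA * get(s2, a2) - get(s, a)));
    }

    public static void main(String[] args) {
        qLearning(0, 1, 1.0, 1, new int[]{0, 1});
        System.out.println(get(0, 1)); // 0.1
    }
}
```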
Jan Löwenstrom b7d991cc92 render 5 frames for every RL step
- temp. repainting JComponent in env.step()
2020-01-01 18:05:59 +01:00
Jan Löwenstrom 195722e98f enhance save/load feature and change thread handling
- saving Monte Carlo did not include returnSum and returnCount, so the state would be wrong after loading. Learning, EpisodicLearning and MonteCarlo now all override custom save and load methods, each calling super() and adding the fields that must be restored at runtime.
- moved generic episodic behaviour from MonteCarlo to the abstract top-level class
- using AtomicInteger for episodesToLearn
- moved learning-thread handling from the controller to the model; Learning got one extra learning thread
- add feature to use custom speed and distance for dino world obstacles
2019-12-29 01:12:11 +01:00
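The layered save/load pattern described in the commit above, where each class persists its own fields after delegating to super(), can be sketched like this; the class names echo the commit message, but the fields and stream-based signatures are assumptions for illustration:

```java
import java.io.*;

// Each level of the hierarchy writes/reads only its own fields,
// delegating to the parent first so the stream layout stays consistent.
class Learning {
    protected double epsilon = 0.1;
    public void save(ObjectOutputStream out) throws IOException {
        out.writeDouble(epsilon);
    }
    public void load(ObjectInputStream in) throws IOException {
        epsilon = in.readDouble();
    }
}

class MonteCarlo extends Learning {
    protected double returnSum = 0.0;
    protected int returnCount = 0;

    @Override
    public void save(ObjectOutputStream out) throws IOException {
        super.save(out);            // parent state first
        out.writeDouble(returnSum); // the fields the commit says were missing
        out.writeInt(returnCount);
    }

    @Override
    public void load(ObjectInputStream in) throws IOException {
        super.load(in);
        returnSum = in.readDouble();
        returnCount = in.readInt();
    }
}

class SaveDemo {
    public static void main(String[] args) throws Exception {
        MonteCarlo mc = new MonteCarlo();
        mc.returnSum = 42.0;
        mc.returnCount = 3;
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            mc.save(out);
        }
        MonteCarlo restored = new MonteCarlo();
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            restored.load(in);
        }
        System.out.println(restored.returnSum + " " + restored.returnCount);
    }
}
```

Forgetting the two extra writes in MonteCarlo.save() reproduces exactly the bug the commit fixes: the file loads, but returnSum and returnCount silently reset.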
Jan Löwenstrom 5a4e380faf add dino jumping environment, deterministic/reproducible behaviour and save-and-load feature
- add feature to save and load learning progress (Q-table) and current episode count
- episode end is now decided purely by the environment instead of the Monte Carlo algorithm capping it at 10 actions
- using LinkedHashMap in all locations to ensure deterministic behaviour
- fixed major RNG issue so algorithmic behaviour can be reproduced
- clearing rewardHistory, to only save the last 10k rewards
- added Google dino jump environment
2019-12-22 23:33:56 +01:00
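Why the LinkedHashMap and RNG changes in the commit above matter for reproducibility: HashMap iteration order is unspecified, so any logic that iterates over entries (e.g. picking the first of several equally valued actions) can differ between runs, while LinkedHashMap iterates in insertion order. Combined with a seeded Random, runs become repeatable. A minimal illustration, with map contents invented for the example:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

public class DeterministicIteration {
    public static void main(String[] args) {
        // Insertion order is preserved, so the "first best" entry is stable
        // across runs even when two actions tie.
        Map<String, Double> qValues = new LinkedHashMap<>();
        qValues.put("JUMP", 1.0);
        qValues.put("NOTHING", 1.0);

        String firstBest = null;
        for (Map.Entry<String, Double> e : qValues.entrySet()) {
            if (firstBest == null || e.getValue() > qValues.get(firstBest)) {
                firstBest = e.getKey();
            }
        }
        System.out.println(firstBest); // JUMP

        // Seeded RNG: identical seed yields the identical sequence every run.
        Random a = new Random(42L);
        Random b = new Random(42L);
        System.out.println(a.nextInt(100) == b.nextInt(100)); // true
    }
}
```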