8 Commits

Author SHA1 Message Date
Jan Löwenstrom 7d3d097599 add opening dialog to select all learning settings 2020-04-07 11:03:17 +02:00
Jan Löwenstrom 9d1f8dfd46 apply code improvements suggested by IntelliJ 2020-04-05 14:44:48 +02:00
Jan Löwenstrom cffec63dc6 apply threading changes to master branch and clean up for tag version
- no testing or epsilon testing stuff
2020-03-05 11:49:51 +01:00
Jan Löwenstrom f4f1f7bd37 add QTableFrame and clickable states that display a gui
- remove org.javaTuple in favour of org.apache.common for tuples and circleQueue
- remove ViewListener from non-GUI Controller
- stateActionTable saves the last 10 states that changed; they are displayed in JTextAreas in the QTableFrame
2020-01-01 23:54:18 +01:00
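The "last 10 changed states" buffer from the commit above can be pictured as a small bounded queue. The sketch below is only an illustration and not the repository's code: the commit mentions a circular queue from an Apache library, whereas this version swaps in the JDK's `ArrayDeque` to stay self-contained. The class name `RecentChanges` and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical bounded buffer that keeps only the most recently changed states. */
final class RecentChanges<S> {
    private final int capacity;
    private final Deque<S> recent = new ArrayDeque<>();

    RecentChanges(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records a changed state, evicting the oldest entry once the buffer is full. */
    void record(S state) {
        if (recent.size() == capacity) {
            recent.removeFirst();
        }
        recent.addLast(state);
    }

    /** Returns a copy ordered from oldest to newest, safe to hand to a view. */
    List<S> snapshot() {
        return new ArrayList<>(recent);
    }
}
```

A table model could call `record(state)` whenever a Q-value is updated, and a frame could render `snapshot()` on demand.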
Jan Löwenstrom b1246f62cc add GUI features to control learning and move learning listener interface to controller
- Add metric to display episodes per second
- view no longer implements the learning listener; the controller does. The controller drives all view actions based on learning events and reacts to view events via viewListener
- add executor service for learning task
- use instanceof to distinguish between episodic learning and TD learning
- add feature to trigger more episodes
- add checkboxes for smoothing graph, displaying last 100 rewards only and drawing environment
- remove history panel from antworld gui
2019-12-22 17:06:54 +01:00
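As a rough illustration of the executor service and the episodes-per-second metric named in the commit above, the following sketch runs episodes on a single background thread and derives the rate from a counter. It is a minimal sketch under assumptions, not the project's implementation: `LearningRunner`, `learn`, and `episodesPerSecond` are hypothetical names, and an episode is modelled as a plain `Runnable`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical runner that keeps learning off the UI thread. */
final class LearningRunner {
    // A single worker thread: queued learning tasks run one after another.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final AtomicInteger episodesDone = new AtomicInteger();

    /** Queues a batch of episodes; calling it again triggers more episodes later. */
    Future<?> learn(int nrOfEpisodes, Runnable episode) {
        return executor.submit(() -> {
            for (int i = 0; i < nrOfEpisodes; i++) {
                episode.run();
                episodesDone.incrementAndGet();
            }
        });
    }

    int episodesDone() {
        return episodesDone.get();
    }

    /** Rate for a measurement interval; returns 0 for an empty interval. */
    static double episodesPerSecond(int episodes, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return episodes / (elapsedNanos / 1_000_000_000.0);
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Because the executor has one thread, a second call to `learn` simply queues behind the first, which is one plausible way to support a "more episodes" button without concurrent updates to the learner.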
Jan Löwenstrom 34e7e3fdd6 distinguish learning and episodic learning, enable fast-learning without drawing every step to reduce lag
- repainting every step with no time delay will certainly freeze the app, so "fast-learning" disables it and only refreshes the current episode label
- Added new abstract class "Episodic Learning". Maybe just use an interface instead?! Important because TD learning is not episodic and needs another way to represent the rewards received (maybe the mean of the last X rewards or something similar)
- Open two JFrames, one with learning info and one with the environment
2019-12-21 00:23:09 +01:00
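The commit above floats "mean of last X rewards" as a possible reward summary for non-episodic TD learning. One way that idea could look is a sliding-window mean, sketched below. This is an assumption about a feature the commit only proposes; the class `RecentRewardMean` and its methods are hypothetical and do not come from the repository.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window mean over the most recent rewards. */
final class RecentRewardMean {
    private final int window;
    private final Deque<Double> rewards = new ArrayDeque<>();
    private double sum;

    RecentRewardMean(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.window = window;
    }

    /** Adds a reward and drops the oldest one once the window is exceeded. */
    void add(double reward) {
        rewards.addLast(reward);
        sum += reward;
        if (rewards.size() > window) {
            sum -= rewards.removeFirst();
        }
    }

    /** Mean of the rewards currently in the window; 0 if none were added yet. */
    double mean() {
        return rewards.isEmpty() ? 0.0 : sum / rewards.size();
    }
}
```

The running sum keeps each update constant-time; for very long runs it could be recomputed from the window occasionally to limit floating-point drift.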
Jan Löwenstrom 7db5a2af3b add fix RNG, add extended interface EpsilonPolicy and move rewardHistory to model instead of view
- seed the RNG only once at the beginning and do not reseed it afterwards. Deep copy the initial AntWorld to use as a blueprint for resetting the world instead of reseeding and regenerating it pseudo-randomly. Reseeding the RNG influenced action selection so that the same trajectory was always chosen.
- instanceof is used to determine whether the policy has an epsilon; the view adapts to this and only shows the epsilon slider if it does
2019-12-20 16:51:09 +01:00
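The reset strategy described in the commit above (seed once, then restore the world from a deep-copied blueprint rather than reseeding) can be sketched as follows. This is an illustrative sketch under assumptions, not the repository's AntWorld code: `GridWorld`, `ResettableEnvironment`, and the integer cell encoding are hypothetical stand-ins.

```java
import java.util.Random;

/** Hypothetical stand-in for the world grid; only what the sketch needs. */
final class GridWorld {
    final int[][] cells;

    GridWorld(int[][] cells) {
        this.cells = cells;
    }

    /** Builds a random layout from the shared RNG; intended to be called once. */
    static GridWorld random(int size, Random rng) {
        int[][] cells = new int[size][size];
        for (int[] row : cells) {
            for (int x = 0; x < size; x++) {
                row[x] = rng.nextInt(3); // assumed encoding: 0 free, 1 food, 2 obstacle
            }
        }
        return new GridWorld(cells);
    }

    /** Deep copy: every row is cloned, so the copy shares no mutable state. */
    GridWorld deepCopy() {
        int[][] copy = new int[cells.length][];
        for (int y = 0; y < cells.length; y++) {
            copy[y] = cells[y].clone();
        }
        return new GridWorld(copy);
    }
}

/** Hypothetical environment that resets from a blueprint instead of reseeding. */
final class ResettableEnvironment {
    private final GridWorld blueprint;
    private GridWorld current;

    ResettableEnvironment(int size, Random rng) {
        this.blueprint = GridWorld.random(size, rng);
        this.current = blueprint.deepCopy();
    }

    GridWorld world() {
        return current;
    }

    /** Restores the initial layout without touching the RNG. */
    void reset() {
        current = blueprint.deepCopy();
    }
}
```

Since `reset()` never touches the `Random` instance, the same generator keeps advancing across episodes, so action selection that draws from it does not replay an identical sequence after every reset.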
Jan Löwenstrom e0160ca1df adopt MVC pattern and add real time graph interface 2019-12-18 16:48:24 +01:00