- Seed the RNG only once at the beginning and never reseed it afterwards. The initial AntWorld is deep-copied and used as a blueprint for resetting the world, instead of reseeding and regenerating the pseudo-random world. Reseeding the RNG also affected action selection, so the same trajectory was chosen every time. - instanceof is used to determine whether the policy has an epsilon; the view adapts to this and shows the epsilon slider only if the policy has one. |
||
---|---|---|
.idea | ||
gradle/wrapper | ||
src | ||
.gitignore | ||
build.gradle | ||
gradlew | ||
gradlew.bat | ||
settings.gradle |
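
The two changes described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `World` stands in for AntWorld, and `Environment`, `Policy`, `EpsilonPolicy`, `RandomPolicy`, `EpsilonGreedyPolicy`, and `showEpsilonSlider` are hypothetical names chosen for the example. It assumes the world can be copied through a copy constructor and that policies with an epsilon share a common interface.

```java
import java.util.Random;

// Sketch only: all class and method names here are hypothetical stand-ins.
final class ResetSketch {

    /** Stand-in for AntWorld: a tiny mutable world state. */
    static final class World {
        int antX;
        int antY;

        World(int antX, int antY) {
            this.antX = antX;
            this.antY = antY;
        }

        /** Copy constructor; with only primitive fields this is a deep copy. */
        World(World other) {
            this(other.antX, other.antY);
        }
    }

    interface Policy {
        int selectAction(Random rng, int actionCount);
    }

    /** Marker for policies that expose an epsilon parameter. */
    interface EpsilonPolicy extends Policy {
        double getEpsilon();
        void setEpsilon(double epsilon);
    }

    static final class RandomPolicy implements Policy {
        @Override
        public int selectAction(Random rng, int actionCount) {
            return rng.nextInt(actionCount);
        }
    }

    static final class EpsilonGreedyPolicy implements EpsilonPolicy {
        private double epsilon;

        EpsilonGreedyPolicy(double epsilon) {
            this.epsilon = epsilon;
        }

        @Override
        public double getEpsilon() {
            return epsilon;
        }

        @Override
        public void setEpsilon(double epsilon) {
            this.epsilon = epsilon;
        }

        @Override
        public int selectAction(Random rng, int actionCount) {
            // Explore with probability epsilon; action 0 is a placeholder "greedy" choice.
            return rng.nextDouble() < epsilon ? rng.nextInt(actionCount) : 0;
        }
    }

    static final class Environment {
        private final World blueprint; // pristine copy, never mutated
        private final Random rng;      // seeded exactly once, in the constructor
        private World current;

        Environment(World initial, long seed) {
            this.blueprint = new World(initial);
            this.rng = new Random(seed);
            this.current = new World(blueprint);
        }

        /** Restores the initial world from the blueprint without touching the RNG. */
        void reset() {
            current = new World(blueprint);
        }

        World world() {
            return current;
        }

        Random rng() {
            return rng;
        }
    }

    /** The view would show the epsilon slider only for policies that have an epsilon. */
    static boolean showEpsilonSlider(Policy policy) {
        return policy instanceof EpsilonPolicy;
    }

    public static void main(String[] args) {
        Environment env = new Environment(new World(2, 3), 42L);
        Policy policy = new EpsilonGreedyPolicy(0.1);
        env.world().antX += policy.selectAction(env.rng(), 4);
        env.reset(); // world is back at the blueprint; the RNG keeps advancing
        System.out.println("slider visible: " + showEpsilonSlider(policy));
    }
}
```

Because `reset()` leaves the `Random` instance alone, each episode after a reset draws fresh numbers from the same stream, so action selection is not forced onto the same trajectory, while the whole run stays reproducible from the single initial seed.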