- only setting the seed of RNG once at the beginning and not reseeding it afterwards. Deep copying
the initial AntWorld to use as blueprint for resetting the world instead of reseeding and creating pesudo random again. Reseeding the RNG has influence action selecting to always
choose the same trajectory.
- instance of is used to determine if policy has epsilon or not and the view will adopt to this, only showing epsilon slider if policy has epsilon
- fixed bug regarding wrong generation of hashCode. hashCodes needs to be equal across equal objects. Compute hashCode on final states once and return this value instead of computing it every time .hashCode() gets called.
-