How does it work?

The available actions are UP, DOWN, LEFT and RIGHT, simulating a user's interaction with the game.
Reward (+1)
A reward is given when the snake grabs the fruit.
Penalty (-1)
The penalty is applied whenever the game resets, that is, when the snake hits its tail or a wall.

Every other move earns a minor negative reward (-0.1). This pushes the snake to take the shortest path to the fruit and makes the training process faster.
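The reward scheme above can be sketched in a few lines of Python. This is only an illustration: the three values come from the text, while the function names and the `ate_fruit` / `crashed` flags are hypothetical and not taken from the game's actual code.

```python
# Reward values as described in the text.
REWARD_FRUIT = 1.0     # the snake grabs the fruit
PENALTY_RESET = -1.0   # the snake hits its tail or a wall (game resets)
STEP_COST = -0.1       # anything else

def reward_for(ate_fruit: bool, crashed: bool) -> float:
    """Return the reward for a single move (hypothetical helper)."""
    if crashed:
        return PENALTY_RESET
    if ate_fruit:
        return REWARD_FRUIT
    return STEP_COST

def path_total(moves: int) -> float:
    """Total reward for reaching the fruit on the last of `moves` moves."""
    return (moves - 1) * STEP_COST + REWARD_FRUIT

# A shorter path to the fruit collects fewer step costs,
# so it ends with a higher total reward than a longer one.
print(path_total(5) > path_total(10))
```

Because each extra move costs a little, two paths that both end at the fruit are no longer equally good, which is what steers the snake toward the shorter one.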

How to teach the Snake:

Learning Rate: How aggressively the AI learns to play (close to 0 will be too slow, while close to 1 will simply replace the old learned value with the new one). Higher is not necessarily better.

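Discount Factor: How much future rewards matter compared to immediate rewards (close to 0 favors immediate rewards, while close to 1 gives future rewards almost as much weight)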

Action Randomization: Percentage of the time a random action will be executed instead of the action the AI would otherwise choose
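To make the three settings more concrete, here is a minimal sketch of how they could fit together, assuming a tabular Q-learning style update with random exploration. The text does not name the algorithm the game uses, so treat this as an assumption. The function names, the state labels and the choice to express Action Randomization as a fraction between 0 and 1 are all hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ["UP", "DOWN", "LEFT", "RIGHT"]

def make_q_table():
    """Learned values per state and action, all starting at 0."""
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def choose_action(q, state, randomization, rng=random):
    """Pick a random action `randomization` of the time, else the best known one."""
    if rng.random() < randomization:
        return rng.choice(ACTIONS)
    values = q[state]
    return max(ACTIONS, key=lambda a: values[a])

def update(q, state, action, reward, next_state,
           learning_rate, discount_factor, done=False):
    """Move the old learned value toward the newly observed one."""
    # Best value reachable from the next state; nothing follows a game reset.
    future = 0.0 if done else max(q[next_state].values())
    new_value = reward + discount_factor * future
    old_value = q[state][action]
    # learning_rate = 1 replaces the old value entirely,
    # while values near 0 barely change it.
    q[state][action] = old_value + learning_rate * (new_value - old_value)

# Usage: one learning step on made-up states "a" and "b".
q = make_q_table()
update(q, "a", "UP", reward=1.0, next_state="b",
       learning_rate=0.5, discount_factor=0.9)
print(q["a"]["UP"])
print(choose_action(q, "a", randomization=0.0))
```

In this sketch a learning rate of 0.5 moves the stored value halfway toward the new estimate, and a randomization of 0 always picks the best known action, so lowering the randomization over time would trade exploration for consistency.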

So choose your configuration and hit the "Teach the Snake" button!
Stop the training at any point to view your own AI Snake!

Have Fun!