Well, I knew the agent was going to be more complex to set up than the academy and the brain, but what I didn’t realise is how much back and forth I’d be doing to fine-tune it. I’ll be honest, even as I write this I still haven’t got an end result I’m happy with, but I’m learning, so that’s what counts!
Anyway, rather than drop chunks of code into the blog, which can get a bit confusing, I’ve uploaded the project to my GitHub account. I’ve tried to keep the comments up to date so they walk through what’s happening, but there’s a lot in there, so I figured I’d do a breakdown of the agent script here and cross-link to the actual script in the repo.
- First up, the script extends the abstract Agent class, which I pulled in from Unity’s core ML-Agents scripts project.
- In Awake() I find the brain and academy. Since there will only ever be one of each in this project, I just used FindObjectOfType.
- Next I override the InitializeAgent() function from the parent Agent class. It still calls the base InitializeAgent() inside itself, and then initializes several other bits and pieces, similar to how a Start() function is normally used.
- ReachedTheGoal is a basic reward function. When the agent reaches a goal it gets its reward and triggers the Done() state.
- MoveAgent(float vectorAction) takes the number generated by the brain’s Vector Action (since the type is Discrete with a size of 4, I know this will only ever be 0 to 3) and turns it into an action via a switch statement.
- AgentAction(float[] vectorAction, string textAction) is another override inherited from the Agent parent. This is where I call the MoveAgent() function mentioned above, and also apply a small negative reward each step to encourage the agent to finish quickly.
- Lastly, the AgentReset() override tells the agent what to reset once it has entered its Done() state.
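To give a feel for how those pieces hang together, here’s a rough sketch in the shape of the old ML-Agents Agent API. The class name, the force values, the reward amounts and the inspector fields (spawnPoint, goal) are my placeholders, not the exact code in the repo, so treat it as an outline rather than the real script:

```csharp
using UnityEngine;

// Sketch only: extends the abstract Agent class from the ML-Agents core scripts
public class MazeAgent : Agent
{
    Brain brain;
    Academy academy;
    Rigidbody rb;

    public Transform spawnPoint; // assumed inspector fields
    public Transform goal;

    void Awake()
    {
        // Only one of each in the scene, so FindObjectOfType is fine here
        brain = FindObjectOfType<Brain>();
        academy = FindObjectOfType<Academy>();
    }

    public override void InitializeAgent()
    {
        base.InitializeAgent(); // still call the parent version first
        rb = GetComponent<Rigidbody>();
    }

    // Basic reward: called when the agent reaches a goal
    void ReachedTheGoal()
    {
        AddReward(1f);
        Done();
    }

    // Turn the discrete action value (0 to 3) into a movement
    void MoveAgent(float action)
    {
        Vector3 dir = Vector3.zero;
        switch ((int)action)
        {
            case 0: dir = Vector3.forward; break;
            case 1: dir = Vector3.back;    break;
            case 2: dir = Vector3.left;    break;
            case 3: dir = Vector3.right;   break;
        }
        rb.AddForce(dir * 2f, ForceMode.VelocityChange); // force value is a guess
    }

    public override void AgentAction(float[] vectorAction, string textAction)
    {
        AddReward(-0.001f); // small step penalty so it finishes quickly
        MoveAgent(vectorAction[0]);
    }

    public override void AgentReset()
    {
        // Put the agent back at its spawn point and kill any momentum
        rb.velocity = Vector3.zero;
        transform.position = spawnPoint.position;
    }
}
```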
With the script done, I attached it to a capsule that I rigged with a Rigidbody and froze its rotation on the X and Z axes. That then let me adjust the additional agent parameters in the inspector.
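For reference, that rotation freeze is just a tick box on the Rigidbody in the inspector, but if you prefer to do it from code the equivalent would be something like:

```csharp
// Equivalent of ticking Freeze Rotation X and Z on the Rigidbody
var rb = GetComponent<Rigidbody>();
rb.constraints = RigidbodyConstraints.FreezeRotationX
               | RigidbodyConstraints.FreezeRotationZ;
```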
The agent setup is pretty straightforward: the Brain game object that I created as a child of the Academy is dropped straight into the Brain field.
For the camera, I added a Camera to the capsule GameObject and then dropped that same camera into the Camera field in the Inspector.
Max Step is how many steps the agent will attempt before it classes itself as Done; I’ve set this to 5000 to start with. I’ve also ticked the Reset on Done option, since I want the agent to restart once it’s reached its step count.
Decision Frequency is how many engine steps pass before the agent asks the brain for another decision. This is usually between 3 and 10, so I’ve gone for a middle-of-the-road 5 to start with. Both this and Max Step may well be adjusted as I go along to fine-tune the training.
Lastly, I set the goal, ground and spawn points manually in the inspector. I’m tempted to refactor these to automatically find the relevant components in the maze area, as that would let me run multiple agents side by side simply by cloning the arena itself; however, this can come later, once I’ve actually managed to successfully train the agent.
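If I do get round to that refactor, the rough idea would be to search only within the agent’s own arena, so each clone stays self-contained. Something like the sketch below, assuming the agent sits under an arena root object and the goal and spawn objects are tagged; the "Goal" and "Spawn" tags and the field names are my assumptions:

```csharp
// Inside the agent class: find this arena's own goal and spawn point,
// so cloned arenas don't pick up each other's objects
public Transform goal;
public Transform spawnPoint;

void FindArenaObjects()
{
    // Assumes the agent is a direct child of the arena root
    Transform arena = transform.parent;
    foreach (Transform child in arena)
    {
        if (child.CompareTag("Goal"))  goal = child;       // hypothetical tag
        if (child.CompareTag("Spawn")) spawnPoint = child; // hypothetical tag
    }
}
```

This would replace the manual inspector assignments, with FindArenaObjects() called from InitializeAgent().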