
The Smart Forager – Reinforcement Learning in GAMA

By Killian Trouillet

Welcome to the comprehensive tutorial on Reinforcement Learning with the GAMA platform. You will build a forager agent that learns to navigate toward food while avoiding obstacles, progressing from a simple grid world to a continuous environment trained with Deep RL.


Part 1: Internal RL (GAML only)

Build a tabular Q-Learning agent entirely in GAML, step by step:

  1. Step 1: The Grid World – Create the 10×10 environment with food and obstacles.
  2. Step 2: The Forager Agent – Define a simple agent that moves randomly.
  3. Step 3: Rewards and Episodes – Implement the reward system and simulation resets.
  4. Step 4: The Q-Table – Set up the agent's memory using map<string, float>.
  5. Step 5: Q-Learning Algorithm – Implement the Bellman equation and ε-greedy policy.
  6. Step 6: Visualization & Automatic Test – Add charts, heatmaps, and evaluate the learned policy.
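The core of Steps 4 and 5 can be sketched in Python before you write it in GAML (the tutorial itself stores the Q-table in a GAML map<string, float>). The state key format, action names, and hyperparameter values below are illustrative assumptions, not the tutorial's exact code:

```python
import random

ACTIONS = ["north", "south", "east", "west"]  # assumed action set
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1         # assumed learning hyperparameters

# Q-table keyed by "state:action" strings, mirroring GAML's map<string, float>
q_table = {}

def q(state, action):
    # Unvisited state-action pairs default to 0.0
    return q_table.get(f"{state}:{action}", 0.0)

def choose_action(state):
    # ε-greedy policy: explore with probability EPSILON, otherwise exploit
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q(state, a))

def update(state, action, reward, next_state):
    # Bellman update: Q(s,a) += α * (r + γ * max_a' Q(s',a') - Q(s,a))
    best_next = max(q(next_state, a) for a in ACTIONS)
    q_table[f"{state}:{action}"] = q(state, action) + ALPHA * (
        reward + GAMMA * best_next - q(state, action)
    )
```

Writing the update this way makes the two GAML pieces explicit: one lookup-with-default for reading the map, one Bellman step for writing it back.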

Expected console output after training and testing


Part 2: Deep RL with Gymnasium – Continuous Forager

In this part, we move from the grid world to a continuous environment and train a neural network using PPO via the gama-gymnasium Python bridge.

  1. Step 7: Introduction & The Continuous World – Why Deep RL? Architecture overview. Continuous world setup.
  2. Step 8: The GymAgent Bridge – The bridge species, spaces, and GAMA↔Python communication.
  3. Step 9: Sensors, Movement & Rewards – Ray-cast sensors, velocity actions, reward shaping. Complete GAML model.
  4. Step 10: Headless Training with PPO – Python script, PPO explained, training process.
  5. Step 11: Testing in GAMA GUI – Load and visualize the trained policy. Summary.
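To see what the bridge in Steps 8-10 hands to the Python side, here is a minimal stand-in environment following the Gymnasium convention (reset() returns (obs, info); step() returns (obs, reward, terminated, truncated, info)). The class name, observation layout, world size, and reward shaping are assumptions for illustration, not the gama-gymnasium API:

```python
import math
import random

class ContinuousForagerSketch:
    """Toy stand-in for the GAMA environment seen through the bridge."""

    WORLD_SIZE = 100.0  # assumed side length of the continuous world
    MAX_STEPS = 200     # assumed episode length

    def reset(self, seed=None):
        rng = random.Random(seed)
        self.agent = [rng.uniform(0, self.WORLD_SIZE) for _ in range(2)]
        self.food = [rng.uniform(0, self.WORLD_SIZE) for _ in range(2)]
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        # action = (vx, vy): a velocity command, clipped to [-1, 1]
        vx, vy = (max(-1.0, min(1.0, a)) for a in action)
        self.agent[0] += vx
        self.agent[1] += vy
        self.steps += 1
        dist = math.dist(self.agent, self.food)
        reward = -dist / self.WORLD_SIZE  # shaped reward: closer is better
        terminated = dist < 1.0           # episode ends on reaching food
        truncated = self.steps >= self.MAX_STEPS
        if terminated:
            reward += 10.0                # assumed terminal bonus
        return self._obs(), reward, terminated, truncated, {}

    def _obs(self):
        # Stand-in for sensor readings: food position relative to the agent
        return [self.food[0] - self.agent[0], self.food[1] - self.agent[1]]
```

A PPO trainer (e.g. stable-baselines3 in Step 10) consumes exactly this reset/step loop; in the real tutorial the tuples come from the running GAMA simulation via gama-gymnasium rather than being computed in Python.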