7. Introduction and the Continuous World
By Killian Trouillet
Why Deep RL?
In Part 1, we built a Smart Forager using Tabular Q-Learning directly in GAML. While it worked on a 10×10 grid, it has two fundamental limitations:
- Discrete only: The Q-Table requires a finite set of states. A forager that moves freely in continuous space has infinitely many states, so the table would need infinitely many entries.
- No generalization: The agent learns nothing about positions it has never visited. If it learned that (3, 5) is good, it still knows nothing about (3.1, 5.2).
Deep RL solves both problems by replacing the Q-Table with a neural network that can generalize across similar states and handle continuous spaces.
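To make the generalization problem concrete, here is a minimal Python sketch. It is not part of the tutorial's code: `q_table` stands in for the GAML map from Part 1, and `approx_value` is a hypothetical hand-written smooth function playing the role of a trained network. Its weights are made-up numbers, not learned parameters.

```python
# Illustrative only: q_table and approx_value are hypothetical stand-ins,
# not the tutorial's actual Q-Table or neural network.

# Tabular case: values exist only for states that were visited exactly.
q_table = {(3, 5): 0.8}


def table_value(state, default=0.0):
    """Look up a state; unseen states fall back to the default."""
    return q_table.get(state, default)


# Function approximation: a smooth function of the coordinates.
# The weights are made-up numbers chosen for the illustration.
WEIGHTS = (0.1, 0.1)


def approx_value(state):
    """Linear value estimate; nearby states get nearby values."""
    x, y = state
    return WEIGHTS[0] * x + WEIGHTS[1] * y


print(table_value((3, 5)))       # visited state: stored value
print(table_value((3.1, 5.2)))   # unseen state: falls back to the default
print(approx_value((3, 5)))
print(approx_value((3.1, 5.2)))  # close to the estimate for (3, 5)
```

The table has no entry for (3.1, 5.2) at all, while the parametric function returns an estimate close to the one for (3, 5). A neural network policy behaves like the second case, only with many more parameters that are learned from experience.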
What changes in Part 2
| Aspect | Part 1 (Native GAML) | Part 2 (Gymnasium) |
|---|---|---|
| World | 10×10 grid | 100×100 continuous space |
| Movement | 4 discrete directions (up/down/left/right) | 2D velocity vector (dx, dy) |
| Sensors | Grid position (x, y) | 8 ray-cast obstacle sensors + position + food direction |
| Learning | GAML map<string, float> Q-Table | Python neural network |
| Algorithm | Q-Learning (Bellman equation) | PPO (Proximal Policy Optimization) |
| Training | Inside GAMA | Headless GAMA + Python |
| Testing | Press ▶️ in GAMA | Python script → GAMA GUI |
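The Movement row is the change the agent feels most directly. As a rough Python sketch of what a 2D velocity action could mean, the function below scales a (dx, dy) action and keeps the result inside the 100×100 world. `MAX_SPEED`, the [-1, 1] action range, and the clamping rule are assumptions made for this illustration; the actual movement logic belongs to the GAML model built in later steps.

```python
# Hypothetical sketch: MAX_SPEED, the action range, and the clamping rule
# are assumptions, not values taken from forager_gym.gaml.
WORLD_SIZE = 100.0
MAX_SPEED = 2.0  # assumed distance covered per step at full throttle


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def move_continuous(position, action):
    """Apply a (dx, dy) action whose components are expected in [-1, 1]."""
    x, y = position
    dx = clamp(action[0], -1.0, 1.0) * MAX_SPEED
    dy = clamp(action[1], -1.0, 1.0) * MAX_SPEED
    # Keep the forager inside the world bounds.
    return (clamp(x + dx, 0.0, WORLD_SIZE), clamp(y + dy, 0.0, WORLD_SIZE))


print(move_continuous((5.0, 5.0), (0.5, -1.0)))
```

Unlike the four fixed moves of Part 1, any direction and any speed up to the maximum is a valid action, which is why a table of per-action values no longer fits.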
Architecture Overview
┌──────────────────────────────┐     ┌──────────────────────────────────┐
│ GAMA (Headless)              │     │ Python                           │
│                              │     │                                  │
│ forager_gym.gaml             │◄───►│ train_forager.py                 │
│  ├─ Continuous world         │     │  ├─ Stable Baselines3 (PPO)      │
│  ├─ Obstacles (geometry)     │     │  ├─ Neural Network policy        │
│  ├─ 8 ray-cast sensors       │     │  └─ Gymnasium interface          │
│  └─ GymAgent bridge species  │     │                                  │
│                              │     │ test_forager.py                  │
│ WebSocket (port 1001)        │     │  └─ Load model + visualize       │
└──────────────────────────────┘     └──────────────────────────────────┘
Prerequisites
Install the required Python packages:
pip install gama-gymnasium stable-baselines3
Make sure you have GAMA 2024.09+ installed with headless server support.
The Continuous World
Instead of using a grid species with discrete cells, we set up a plain continuous world of 100×100 units.
Global Setup
global {
    // Same layout as native 10×10, scaled to 100×100 continuous world
    // Each native cell = 10×10 units. Cell (x,y) center = {x*10+5, y*10+5}
    float world_size <- 100.0;
    point food_location <- {95.0, 95.0}; // Native cell (9,9)
    float food_radius <- 5.0;

    // Same 6 obstacle cells as native: {2,2},{3,2},{2,3},{6,4},{7,4},{7,5}
    list<geometry> obstacles <- [];

    int max_steps <- 300;
    int current_step <- 0;
    int gama_server_port <- 0;

    init {
        // Each obstacle is a 10×10 square at the cell center
        obstacles << square(10) at_location {25.0, 25.0}; // cell (2,2)
        obstacles << square(10) at_location {35.0, 25.0}; // cell (3,2)
        obstacles << square(10) at_location {25.0, 35.0}; // cell (2,3)
        obstacles << square(10) at_location {65.0, 45.0}; // cell (6,4)
        obstacles << square(10) at_location {75.0, 45.0}; // cell (7,4)
        obstacles << square(10) at_location {75.0, 55.0}; // cell (7,5)

        create forager number: 1 {
            location <- {5.0, 5.0}; // Native cell (0,0)
        }
    }
}
Key Differences from Part 1
- No grid species: Instead of grid world_cell width: 10 height: 10, the world is a plain 100×100 area.
- Obstacles are geometries: We use square(10) (matching one grid cell) placed at each cell's center, instead of setting is_obstacle on grid cells.
- Same layout: The 6 obstacle cells {2,2},{3,2},{2,3},{6,4},{7,4},{7,5} are preserved exactly.
- Food is a point: food_location <- {95.0, 95.0} (center of native cell 9,9) with food_radius <- 5.0.
- gama_server_port: This variable is required by the gama-gymnasium bridge.
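The cell-to-world mapping used above (cell (x, y) center = {x*10+5, y*10+5}) is easy to check outside GAMA. The following Python sketch recomputes the six obstacle centers, the food location, and the start position from their Part 1 grid cells. The names `CELL_SIZE`, `cell_center`, and `OBSTACLE_CELLS` are introduced here for illustration only and do not appear in the model.

```python
# Illustrative helper: these names are not part of forager_gym.gaml.
CELL_SIZE = 10.0  # one Part 1 grid cell spans 10x10 world units


def cell_center(cx, cy):
    """Return the continuous-world center of grid cell (cx, cy)."""
    half = CELL_SIZE / 2
    return (cx * CELL_SIZE + half, cy * CELL_SIZE + half)


# The six obstacle cells carried over from Part 1.
OBSTACLE_CELLS = [(2, 2), (3, 2), (2, 3), (6, 4), (7, 4), (7, 5)]
obstacle_centers = [cell_center(cx, cy) for cx, cy in OBSTACLE_CELLS]

print(obstacle_centers)
print(cell_center(9, 9))  # food location
print(cell_center(0, 0))  # forager start position
```

Running it should reproduce the coordinates hard-coded in the init block, which is a quick way to catch a typo if the layout is ever changed.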
Visualization
The experiment block draws the world using graphics layers:
experiment gym_env {
    parameter "communication_port" var: gama_server_port;

    output {
        display "Continuous World" type: 2d {
            graphics "background" {
                draw rectangle(world_size, world_size)
                    at: {world_size / 2, world_size / 2}
                    color: rgb(240, 240, 240);
            }
            graphics "obstacles" {
                loop obs over: obstacles {
                    draw obs color: rgb(80, 80, 80);
                }
            }
            graphics "food" {
                draw circle(food_radius) at: food_location color: rgb(50, 180, 50);
            }
            species forager;
        }
    }
}
Note: The parameter "communication_port" var: gama_server_port; line is mandatory. It tells gama-gymnasium which port to use for the WebSocket bridge.
Complete Model
The model at the end of this step is the skeleton of forager_gym.gaml: the continuous world, visualization, and a basic placeholder forager species. The full implementation is built step by step in Steps 8 and 9.
/**
* Name: SmartForagerGym - Step 7: The Continuous World
* Author: Killian Trouillet
* Description: Continuous 100x100 world matching the Part 1 layout.
* Scaffold for the GymAgent bridge (Steps 8–9).
* Tags: reinforcement-learning, gymnasium, continuous, tutorial
*/
model SmartForagerGym

global {
    float world_size <- 100.0;
    point food_location <- {95.0, 95.0}; // native cell (9,9)
    float food_radius <- 5.0;
    list<geometry> obstacles <- [];
    int max_steps <- 300;
    int current_step <- 0;
    int gama_server_port <- 0;

    init {
        // Same 6 obstacles as Part 1, scaled to 100x100
        obstacles << square(10) at_location {25.0, 25.0}; // cell (2,2)
        obstacles << square(10) at_location {35.0, 25.0}; // cell (3,2)
        obstacles << square(10) at_location {25.0, 35.0}; // cell (2,3)
        obstacles << square(10) at_location {65.0, 45.0}; // cell (6,4)
        obstacles << square(10) at_location {75.0, 45.0}; // cell (7,4)
        obstacles << square(10) at_location {75.0, 55.0}; // cell (7,5)

        create forager number: 1 {
            location <- {5.0, 5.0};
        }
    }
}

species forager {
    // To be fully implemented in Steps 8 and 9
    aspect default {
        draw circle(3) color: #blue;
    }
}

experiment gym_env {
    parameter "communication_port" var: gama_server_port;

    output {
        display "Continuous World" type: 2d {
            graphics "background" {
                draw rectangle(world_size, world_size)
                    at: {world_size / 2, world_size / 2}
                    color: rgb(240, 240, 240);
            }
            graphics "obstacles" {
                loop obs over: obstacles {
                    draw obs color: rgb(80, 80, 80);
                }
            }
            graphics "food" {
                draw circle(food_radius) at: food_location color: rgb(50, 180, 50);
            }
            species forager;
        }
    }
}