Version: 🚧 Alpha 🚧

7. Introduction: The Continuous World

By Killian Trouillet


Why Deep RL?

In Part 1, we built a Smart Forager using Tabular Q-Learning directly in GAML. While it worked on a 10×10 grid, it has two fundamental limitations:

  1. Discrete only: The Q-Table requires a finite set of states. A forager that moves freely in continuous space has infinitely many states, so no finite table can enumerate them.
  2. No generalization: The agent learns nothing about positions it has never visited. If it learned that (3, 5) is good, it knows nothing about (3.1, 5.2).

Deep RL solves both problems by replacing the Q-Table with a neural network that can generalize across similar states and handle continuous spaces.
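The generalization gap can be illustrated with a small Python sketch. It is not the tutorial's implementation: a Gaussian kernel smoother stands in for the neural network purely to show the effect, and the names `q_tabular`, `q_smooth` and the width `SIGMA` are hypothetical.

```python
import math

# Tabular case: values are stored per exact state, as in Part 1.
q_table = {(3, 5): 1.0}  # the agent learned that (3, 5) is good

def q_tabular(state):
    """Look up a state; anything never visited falls back to 0.0."""
    return q_table.get(state, 0.0)

# Stand-in for a function approximator (NOT a neural network):
# a Gaussian kernel that spreads each learned value to nearby states.
SIGMA = 1.0  # hypothetical smoothing width

def q_smooth(state):
    """Estimate a value from learned points weighted by proximity."""
    total = 0.0
    for (sx, sy), value in q_table.items():
        dist_sq = (state[0] - sx) ** 2 + (state[1] - sy) ** 2
        total += value * math.exp(-dist_sq / (2 * SIGMA ** 2))
    return total

print(q_tabular((3.1, 5.2)))  # unseen key: the table knows nothing
print(q_smooth((3.1, 5.2)))   # close to the learned value at (3, 5)
```

A trained network plays the same role as `q_smooth` here, except that it learns how far and in which directions to generalize instead of using a fixed kernel.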

What changes in Part 2

| Aspect | Part 1 (Native GAML) | Part 2 (Gymnasium) |
| --- | --- | --- |
| World | 10×10 grid | 100×100 continuous space |
| Movement | 4 discrete directions (up/down/left/right) | 2D velocity vector (dx, dy) |
| Sensors | Grid position (x, y) | 8 ray-cast obstacle sensors + position + food direction |
| Learning | GAML map<string, float> Q-Table | Python neural network |
| Algorithm | Q-Learning (Bellman equation) | PPO (Proximal Policy Optimization) |
| Training | Inside GAMA | Headless GAMA + Python |
| Testing | Press ▶️ in GAMA | Python script → GAMA GUI |
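To make the Movement row concrete, here is a rough Python sketch contrasting the two action types. The speed cap `MAX_SPEED`, the clamping to the world bounds, and the convention that "up" decreases y are assumptions made for illustration; the tutorial's actual movement rules are defined in the later steps.

```python
WORLD_SIZE = 100.0  # Part 2 world, from the table
GRID_SIZE = 10      # Part 1 grid, from the table
MAX_SPEED = 2.0     # hypothetical per-step cap on |dx| and |dy|

# Part 1 style: four discrete moves, one cell each (y assumed to grow downward).
DISCRETE_MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))

def step_discrete(cell, action):
    """Move one cell in a named direction, staying inside the grid."""
    dx, dy = DISCRETE_MOVES[action]
    return (clamp(cell[0] + dx, 0, GRID_SIZE - 1),
            clamp(cell[1] + dy, 0, GRID_SIZE - 1))

def step_continuous(pos, action):
    """Apply a (dx, dy) velocity, capped in speed and kept inside the world."""
    dx = clamp(action[0], -MAX_SPEED, MAX_SPEED)
    dy = clamp(action[1], -MAX_SPEED, MAX_SPEED)
    return (clamp(pos[0] + dx, 0.0, WORLD_SIZE),
            clamp(pos[1] + dy, 0.0, WORLD_SIZE))

print(step_discrete((0, 0), "right"))            # one cell to the right
print(step_continuous((5.0, 5.0), (1.5, -0.5)))  # any direction, any magnitude up to the cap
```

The discrete agent can only ever choose among four outcomes per step, while the continuous agent picks from an unbounded set of vectors, which is one reason a policy-gradient method such as PPO is used in Part 2.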

Architecture Overview

┌──────────────────────────────┐     ┌──────────────────────────────────┐
│  GAMA (Headless)             │     │  Python                          │
│                              │     │                                  │
│  forager_gym.gaml            │◄───►│  train_forager.py                │
│  ├─ Continuous world         │     │  ├─ Stable Baselines3 (PPO)      │
│  ├─ Obstacles (geometry)     │     │  ├─ Neural Network policy        │
│  ├─ 8 ray-cast sensors       │     │  └─ Gymnasium interface          │
│  └─ GymAgent bridge species  │     │                                  │
│                              │     │  test_forager.py                 │
│  WebSocket (port 1001)       │     │  └─ Load model + visualize       │
└──────────────────────────────┘     └──────────────────────────────────┘

Prerequisites

Install the required Python packages:

pip install gama-gymnasium stable-baselines3

Make sure you have GAMA 2024.09+ installed with headless server support.
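Before going further, a quick check that both packages are importable can save some debugging. The sketch below uses only the standard library; the import names `gama_gymnasium` and `stable_baselines3` are assumed to correspond to the pip packages above, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# pip package name -> import name (the import names are assumptions)
REQUIRED = {
    "gama-gymnasium": "gama_gymnasium",
    "stable-baselines3": "stable_baselines3",
}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]

missing = missing_packages()
if missing:
    print("Missing packages, try: pip install " + " ".join(missing))
else:
    print("All required Python packages were found.")
```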


The Continuous World

Instead of using a grid species with discrete cells, we set up a plain continuous world of 100×100 units.

Global Setup

global {
    // Same layout as native 10×10, scaled to 100×100 continuous world
    // Each native cell = 10×10 units. Cell (x,y) center = {x*10+5, y*10+5}
    float world_size <- 100.0;
    point food_location <- {95.0, 95.0}; // Native cell (9,9)
    float food_radius <- 5.0;

    // Same 6 obstacle cells as native: {2,2},{3,2},{2,3},{6,4},{7,4},{7,5}
    list<geometry> obstacles <- [];

    int max_steps <- 300;
    int current_step <- 0;

    int gama_server_port <- 0;

    init {
        // Each obstacle is a 10×10 square at the cell center
        obstacles << square(10) at_location {25.0, 25.0}; // cell (2,2)
        obstacles << square(10) at_location {35.0, 25.0}; // cell (3,2)
        obstacles << square(10) at_location {25.0, 35.0}; // cell (2,3)
        obstacles << square(10) at_location {65.0, 45.0}; // cell (6,4)
        obstacles << square(10) at_location {75.0, 45.0}; // cell (7,4)
        obstacles << square(10) at_location {75.0, 55.0}; // cell (7,5)

        create forager number: 1 {
            location <- {5.0, 5.0}; // Native cell (0,0)
        }
    }
}

Key Differences from Part 1

  • No grid species: Instead of grid world_cell width: 10 height: 10, the world is a plain 100×100 area.
  • Obstacles are geometries: We use square(10) (matching one grid cell) placed at each cell's center, instead of setting is_obstacle on grid cells.
  • Same layout: The 6 obstacle cells {2,2},{3,2},{2,3},{6,4},{7,4},{7,5} are preserved exactly.
  • Food is a point: food_location <- {95.0, 95.0} (center of native cell 9,9) with food_radius <- 5.0.
  • gama_server_port: This variable is required by the gama-gymnasium bridge.
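The cell-to-coordinate mapping described above can be checked with a short Python sketch. The mapping and the constants come from the model; the helpers `in_obstacle` and `reached_food` are hypothetical and only illustrate one plausible way the obstacle squares and `food_radius` could be tested against a position.

```python
import math

CELL = 10.0  # one native cell = 10×10 units
OBSTACLE_CELLS = [(2, 2), (3, 2), (2, 3), (6, 4), (7, 4), (7, 5)]
FOOD_LOCATION = (95.0, 95.0)
FOOD_RADIUS = 5.0

def cell_center(cx, cy):
    """Center of native cell (cx, cy): {x*10+5, y*10+5}."""
    return (cx * CELL + CELL / 2, cy * CELL + CELL / 2)

def in_obstacle(x, y):
    """True if (x, y) lies inside any 10×10 obstacle square (hypothetical helper)."""
    half = CELL / 2
    return any(
        abs(x - ox) <= half and abs(y - oy) <= half
        for ox, oy in (cell_center(*cell) for cell in OBSTACLE_CELLS)
    )

def reached_food(x, y):
    """True if (x, y) is within FOOD_RADIUS of the food (assumed success test)."""
    return math.hypot(x - FOOD_LOCATION[0], y - FOOD_LOCATION[1]) <= FOOD_RADIUS

# The six centers match the at_location points used in the GAML init block.
for cell in OBSTACLE_CELLS:
    print(cell, "->", cell_center(*cell))
```

With a radius of 5.0, the food disc is the circle inscribed in native cell (9,9) (it covers the cell except its corners), which keeps the goal region roughly comparable to Part 1.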

Visualization

The experiment block draws the world using graphics layers:

experiment gym_env {
    parameter "communication_port" var: gama_server_port;

    output {
        display "Continuous World" type: 2d {
            graphics "background" {
                draw rectangle(world_size, world_size)
                    at: {world_size/2, world_size/2}
                    color: rgb(240, 240, 240);
            }
            graphics "obstacles" {
                loop obs over: obstacles {
                    draw obs color: rgb(80, 80, 80);
                }
            }
            graphics "food" {
                draw circle(food_radius) at: food_location color: rgb(50, 180, 50);
            }
            species forager;
        }
    }
}

Note: The parameter "communication_port" var: gama_server_port; line is mandatory — it tells gama-gymnasium which port to use for the WebSocket bridge.


Complete Model

The model at the end of this step is the skeleton of forager_gym.gaml: the continuous world, visualization, and a basic placeholder forager species. The full implementation is built step by step in Steps 8 and 9.

/**
 * Name: SmartForagerGym - Step 7: The Continuous World
 * Author: Killian Trouillet
 * Description: Continuous 100x100 world matching the Part 1 layout.
 *   Scaffold for the GymAgent bridge (Steps 8–9).
 * Tags: reinforcement-learning, gymnasium, continuous, tutorial
 */

model SmartForagerGym

global {
    float world_size <- 100.0;
    point food_location <- {95.0, 95.0}; // native cell (9,9)
    float food_radius <- 5.0;

    list<geometry> obstacles <- [];

    int max_steps <- 300;
    int current_step <- 0;

    int gama_server_port <- 0;

    init {
        // Same 6 obstacles as Part 1, scaled to 100x100
        obstacles << square(10) at_location {25.0, 25.0}; // cell (2,2)
        obstacles << square(10) at_location {35.0, 25.0}; // cell (3,2)
        obstacles << square(10) at_location {25.0, 35.0}; // cell (2,3)
        obstacles << square(10) at_location {65.0, 45.0}; // cell (6,4)
        obstacles << square(10) at_location {75.0, 45.0}; // cell (7,4)
        obstacles << square(10) at_location {75.0, 55.0}; // cell (7,5)

        create forager number: 1 {
            location <- {5.0, 5.0};
        }
    }
}

species forager {
    // To be fully implemented in Steps 8 and 9
    aspect default {
        draw circle(3) color: #blue;
    }
}

experiment gym_env {
    parameter "communication_port" var: gama_server_port;
    output {
        display "Continuous World" type: 2d {
            graphics "background" {
                draw rectangle(world_size, world_size)
                    at: {world_size / 2, world_size / 2}
                    color: rgb(240, 240, 240);
            }
            graphics "obstacles" {
                loop obs over: obstacles {
                    draw obs color: rgb(80, 80, 80);
                }
            }
            graphics "food" {
                draw circle(food_radius) at: food_location color: rgb(50, 180, 50);
            }
            species forager;
        }
    }
}