Version: 🚧 Alpha 🚧

7. Introduction: The Continuous World

By Killian Trouillet

Why Deep RL?

In Part 1, we built a Smart Forager using Tabular Q-Learning directly in GAML. While it worked on a 10×10 grid, it has two fundamental limitations:

Discrete only: The Q-Table requires a finite set of states. A forager that moves freely in continuous space has infinitely many states, so the table would need infinitely many entries.
No generalization: The agent learns nothing about positions it has never visited. If it learned that (3, 5) is good, it knows nothing about (3.1, 5.2).

Deep RL solves both problems by replacing the Q-Table with a neural network that can generalize across similar states and handle continuous spaces.
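To make the generalization gap concrete, the sketch below contrasts a dictionary Q-Table with a toy function approximator. The linear `approx_value` function, its weights, and the stored value 0.8 are illustrative assumptions standing in for the neural network used later; they are not part of the tutorial's code.

```python
# Tabular lookup: values exist only for the exact states that were visited.
q_table = {(3, 5): 0.8}  # hypothetical learned value for grid cell (3, 5)

def tabular_value(state):
    """Return the stored value, or 0.0 for any state never visited."""
    return q_table.get(state, 0.0)

# Toy function approximator: a linear model over the coordinates.
# The weights are made up for illustration; a real agent would learn them.
WEIGHTS = (0.1, 0.1)
BIAS = 0.0

def approx_value(state):
    """Estimate a value for any (x, y), including unvisited positions."""
    x, y = state
    return WEIGHTS[0] * x + WEIGHTS[1] * y + BIAS

print(tabular_value((3, 5)))      # known state: stored value
print(tabular_value((3.1, 5.2)))  # unseen state: falls back to the default
print(approx_value((3, 5)))       # estimate at the visited position
print(approx_value((3.1, 5.2)))   # nearby position gets a nearby estimate
```

The table returns its default for (3.1, 5.2) because that key was never written, while the approximator produces an estimate close to the one at (3, 5). A neural network plays the same role with a far more flexible function.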

What changes in Part 2

Aspect	Part 1 (Native GAML)	Part 2 (Gymnasium)
World	10×10 grid	100×100 continuous space
Movement	4 discrete directions (up/down/left/right)	2D velocity vector `(dx, dy)`
Sensors	Grid position `(x, y)`	8 ray-cast obstacle sensors + position + food direction
Learning	GAML `map<string, float>` Q-Table	Python neural network
Algorithm	Q-Learning (Bellman equation)	PPO (Proximal Policy Optimization)
Training	Inside GAMA	Headless GAMA + Python
Testing	Press ▶️ in GAMA	Python script → GAMA GUI

Architecture Overview

┌──────────────────────────────┐     ┌──────────────────────────────────┐
│      GAMA (Headless)         │     │          Python                  │
│                              │     │                                  │
│  forager_gym.gaml            │◄───►│  train_forager.py                │
│  ├─ Continuous world         │     │  ├─ Custom PyTorch PPO           │
│  ├─ Obstacles (geometry)     │     │  ├─ ActorCritic neural network   │
│  ├─ 8 ray-cast sensors       │     │  └─ Gymnasium interface          │
│  └─ GymAgent bridge species  │     │                                  │
│                              │     │  test_forager.py                 │
│     WebSocket (port 1001)    │     │  └─ Load model + visualize       │
└──────────────────────────────┘     └──────────────────────────────────┘
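On the Python side, training code interacts with the environment through Gymnasium's `reset` / `step` convention, where `step` returns `(observation, reward, terminated, truncated, info)`. The sketch below is a pure-Python stand-in that only mimics that shape: `StubForagerEnv` is a hypothetical class, it does not connect to GAMA, it ignores obstacles, its observation is just the position, and its reward is a placeholder. The world size, food location, food radius, start position, and step limit are taken from the model in this step.

```python
import math

WORLD_SIZE = 100.0
FOOD = (95.0, 95.0)
FOOD_RADIUS = 5.0
MAX_STEPS = 300

class StubForagerEnv:
    """Hypothetical stand-in with a Gymnasium-style reset/step shape."""

    def reset(self):
        self.pos = [5.0, 5.0]  # same start as the GAML model
        self.steps = 0
        return tuple(self.pos), {}

    def step(self, action):
        dx, dy = action
        # Apply the 2D velocity and keep the agent inside the world bounds.
        self.pos[0] = min(max(self.pos[0] + dx, 0.0), WORLD_SIZE)
        self.pos[1] = min(max(self.pos[1] + dy, 0.0), WORLD_SIZE)
        self.steps += 1
        # Episode ends successfully once the agent is within the food radius.
        terminated = math.dist(self.pos, FOOD) <= FOOD_RADIUS
        # Otherwise it is cut off when the step budget runs out.
        truncated = (not terminated) and self.steps >= MAX_STEPS
        reward = 1.0 if terminated else 0.0  # placeholder reward (assumption)
        return tuple(self.pos), reward, terminated, truncated, {}

env = StubForagerEnv()
obs, info = env.reset()
terminated = truncated = False
while not (terminated or truncated):
    action = (1.0, 1.0)  # fixed diagonal move; a trained policy would pick this
    obs, reward, terminated, truncated, info = env.step(action)

print(env.steps, obs, reward, terminated, truncated)
```

In the real setup, the environment object comes from the gama-gymnasium bridge and each `step` is forwarded to the headless GAMA simulation; the loop structure on the Python side stays the same.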

Prerequisites

Install the required Python packages:

pip install gama-gymnasium torch

Make sure you have GAMA 2024.09+ installed with headless server support.

The Continuous World

Instead of using a grid species with discrete cells, we set up a plain continuous world of 100×100 units.

Global Setup

global {
    // Same layout as native 10×10, scaled to 100×100 continuous world
    // Each native cell = 10×10 units. Cell (x,y) center = {x*10+5, y*10+5}
    float world_size <- 100.0;
    point food_location <- {95.0, 95.0};   // Native cell (9,9)
    float food_radius <- 5.0;
    
    // Same 6 obstacle cells as native: {2,2},{3,2},{2,3},{6,4},{7,4},{7,5}
    list<geometry> obstacles <- [];
    
    int max_steps <- 300;
    int current_step <- 0;
    
    // Required by the gama-gymnasium bridge; set through the "communication_port" parameter
    int gama_server_port <- 0;
    
    init {
        // Each obstacle is a 10×10 square at the cell center
        obstacles << square(10) at_location {25.0, 25.0};  // cell (2,2)
        obstacles << square(10) at_location {35.0, 25.0};  // cell (3,2)
        obstacles << square(10) at_location {25.0, 35.0};  // cell (2,3)
        obstacles << square(10) at_location {65.0, 45.0};  // cell (6,4)
        obstacles << square(10) at_location {75.0, 45.0};  // cell (7,4)
        obstacles << square(10) at_location {75.0, 55.0};  // cell (7,5)
        
        create forager number: 1 {
            location <- {5.0, 5.0};   // Native cell (0,0)
        }
    }
}
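The scaling rule in the comments above (cell (x, y) maps to center {x*10+5, y*10+5}) can be checked with a few lines of Python. This is only a sanity check of the arithmetic; `cell_center` and `CELL_SIZE` are names introduced here for illustration.

```python
CELL_SIZE = 10.0  # one Part 1 grid cell spans 10x10 world units

def cell_center(cx, cy):
    """Map a Part 1 grid cell index to its center in the 100x100 world."""
    half = CELL_SIZE / 2
    return (cx * CELL_SIZE + half, cy * CELL_SIZE + half)

# The six obstacle cells listed in the global block.
obstacle_cells = [(2, 2), (3, 2), (2, 3), (6, 4), (7, 4), (7, 5)]
obstacle_centers = [cell_center(cx, cy) for cx, cy in obstacle_cells]

print(obstacle_centers)   # centers used by the square(10) obstacles
print(cell_center(9, 9))  # food location
print(cell_center(0, 0))  # forager start location
```

The results match the coordinates hard-coded in the `init` block, which confirms that the continuous layout is the Part 1 grid scaled by a factor of 10.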

Key Differences from Part 1

No grid species: Instead of `grid world_cell width: 10 height: 10`, the world is a plain 100×100 area.
Obstacles are geometries: We use `square(10)` (matching one grid cell) placed at each cell's center, instead of setting `is_obstacle` on grid cells.
Same layout: The 6 obstacle cells {2,2},{3,2},{2,3},{6,4},{7,4},{7,5} are preserved exactly.
Food is a point: `food_location <- {95.0, 95.0}` (center of native cell 9,9) with `food_radius <- 5.0`.
`gama_server_port`: This variable is required by the gama-gymnasium bridge.
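Because obstacles and food are now geometric shapes rather than flagged cells, questions like "is the forager inside an obstacle?" or "has it reached the food?" become geometric tests. The Python sketch below shows one way to express them, assuming axis-aligned squares of side 10 and a circular food zone of radius 5. The helper names `in_obstacle` and `at_food` are hypothetical, and the actual collision and reward logic of the tutorial is built in GAML in the later steps.

```python
import math

FOOD = (95.0, 95.0)
FOOD_RADIUS = 5.0
HALF_SIDE = 5.0  # each obstacle is a square of side 10 around its center
OBSTACLE_CENTERS = [
    (25.0, 25.0), (35.0, 25.0), (25.0, 35.0),
    (65.0, 45.0), (75.0, 45.0), (75.0, 55.0),
]

def in_obstacle(x, y):
    """True if (x, y) lies inside or on the edge of any obstacle square."""
    return any(
        abs(x - ox) <= HALF_SIDE and abs(y - oy) <= HALF_SIDE
        for ox, oy in OBSTACLE_CENTERS
    )

def at_food(x, y):
    """True if (x, y) is within the food radius of the food location."""
    return math.dist((x, y), FOOD) <= FOOD_RADIUS

print(in_obstacle(5.0, 5.0))    # start position
print(in_obstacle(31.0, 22.0))  # a point inside the obstacle at cell (3,2)
print(at_food(92.0, 92.0))      # a point close to the food
```

Any continuous position can be tested this way, which is what allows the forager to move by arbitrary `(dx, dy)` steps instead of jumping between cells.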

Visualization

The experiment block draws the world using graphics layers:

experiment gym_env {
    parameter "communication_port" var: gama_server_port;
    
    output {
        display "Continuous World" type: 2d {
            graphics "background" {
                draw rectangle(world_size, world_size) 
                     at: {world_size/2, world_size/2} 
                     color: rgb(240, 240, 240);
            }
            graphics "obstacles" {
                loop obs over: obstacles {
                    draw obs color: rgb(80, 80, 80);
                }
            }
            graphics "food" {
                draw circle(food_radius) at: food_location color: rgb(50, 180, 50);
            }
            species forager;
        }
    }
}

Note: The `parameter "communication_port" var: gama_server_port;` line is mandatory: it tells gama-gymnasium which port to use for the WebSocket bridge.

Complete Model

The model at the end of this step is the skeleton of forager_gym.gaml: the continuous world, visualization, and a basic placeholder forager species. The full implementation is built step by step in Steps 8 and 9.

/**
 * Name: SmartForagerGym - Step 7: The Continuous World
 * Author: Killian Trouillet
 * Description: Continuous 100x100 world matching the Part 1 layout.
 *              Scaffold for the GymAgent bridge (Steps 8–9).
 * Tags: reinforcement-learning, gymnasium, continuous, tutorial
 */

model SmartForagerGym

global {
	float world_size <- 100.0;
	point food_location <- {95.0, 95.0};   // native cell (9,9)
	float food_radius <- 5.0;

	list<geometry> obstacles <- [];

	int max_steps <- 300;
	int current_step <- 0;

	// Required by the gama-gymnasium bridge; set through the "communication_port" parameter
	int gama_server_port <- 0;

	init {
		// Same 6 obstacles as Part 1, scaled to 100x100
		obstacles << square(10) at_location {25.0, 25.0};  // cell (2,2)
		obstacles << square(10) at_location {35.0, 25.0};  // cell (3,2)
		obstacles << square(10) at_location {25.0, 35.0};  // cell (2,3)
		obstacles << square(10) at_location {65.0, 45.0};  // cell (6,4)
		obstacles << square(10) at_location {75.0, 45.0};  // cell (7,4)
		obstacles << square(10) at_location {75.0, 55.0};  // cell (7,5)

		create forager number: 1 {
			location <- {5.0, 5.0};
		}
	}
}

species forager {
	// To be fully implemented in Steps 8 and 9
	aspect default {
		draw circle(3) color: #blue;
	}
}

experiment gym_env {
	parameter "communication_port" var: gama_server_port;
	output {
		display "Continuous World" type: 2d {
			graphics "background" {
				draw rectangle(world_size, world_size)
					at: {world_size / 2, world_size / 2}
					color: rgb(240, 240, 240);
			}
			graphics "obstacles" {
				loop obs over: obstacles {
					draw obs color: rgb(80, 80, 80);
				}
			}
			graphics "food" {
				draw circle(food_radius) at: food_location color: rgb(50, 180, 50);
			}
			species forager;
		}
	}
}
