Version: 🚧 Alpha 🚧

15. Testing and Tutorial Summary

By Killian Trouillet


Testing the Trained Multi-Agent Policy

After training, we load the shared PPO model and run both foragers in the GAMA GUI.

Starting GAMA GUI

Open GAMA normally. The GUI server runs on port 1000 by default.

Loading the Shared Model

from train_forager_petz import PPOAgent

agent = PPOAgent(state_dim=15, action_dim=2)
agent.load("saved_models/ppo_forager.pth")

A single model is shared by both foragers, just as in training.
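
The idea of one policy serving both foragers can be sketched without PyTorch. The snippet below is only an illustration: `TinySharedPolicy` and `select_actions` are hypothetical names, and the single linear layer with `tanh` is an assumption standing in for the real PPO actor, whose architecture is not shown here. What it demonstrates is that each forager's action depends only on its own observation, not on which agent id is asking.

```python
import math
import random


class TinySharedPolicy:
    """Hypothetical stand-in for the shared actor: one linear layer followed by tanh."""

    def __init__(self, state_dim, action_dim, seed=0):
        rng = random.Random(seed)
        # weights[j][i] maps observation feature i to action component j.
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(state_dim)]
            for _ in range(action_dim)
        ]

    def mean_action(self, observation):
        return [
            math.tanh(sum(w * x for w, x in zip(row, observation)))
            for row in self.weights
        ]


def select_actions(policy, obs):
    """Query the same policy object for every agent and key the result by agent id."""
    return {agent_id: policy.mean_action(o) for agent_id, o in obs.items()}


policy = TinySharedPolicy(state_dim=15, action_dim=2)
obs = {"forager_0": [0.0] * 15, "forager_1": [1.0] * 15}
actions = select_actions(policy, obs)
```

Because the weights are shared, swapping the two observations would swap the two actions. That is what makes a single saved model file sufficient for both foragers.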

Running a Cooperative Episode

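import time

import numpy as np

# Assumes `env` is the GamaParallelEnv connected to the GAMA GUI
# and `agent` is the PPOAgent loaded above.
obs, _ = env.reset()
done = False
while not done:
    actions = {}
    for agent_id in env.agents:
        action, _, _ = agent.select_action(
            np.array(obs[agent_id], dtype=np.float32),
            test=True,  # deterministic: use the mean action
        )
        actions[agent_id] = action

    obs, rewards, terminations, truncations, _ = env.step(actions)
    # The episode is over once no agents remain, or all are terminated or truncated.
    done = not env.agents or all(terminations.values()) or all(truncations.values())
    time.sleep(0.1)  # brief pause between steps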
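
To make the `test=True` flag concrete, here is a minimal sketch of deterministic versus stochastic action selection for a Gaussian policy. It assumes the policy outputs a mean and a standard deviation per action dimension. The real `PPOAgent.select_action` returns three values and its internals are not shown in this step, so the function below is a hypothetical, simplified analogue rather than the tutorial's implementation.

```python
import random


def select_action(mean, std, test=False, rng=random):
    """Return the mean in test mode; otherwise sample each dimension from N(mean, std^2)."""
    if test:
        return list(mean)
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


mean, std = [0.4, -0.2], [0.3, 0.3]  # illustrative values for [dx, dy]
deterministic = select_action(mean, std, test=True)
sampled = select_action(mean, std, rng=random.Random(0))
```

In test mode the same observation always produces the same action, which makes evaluation runs repeatable. During training, the sampled noise is what drives exploration.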

What You Should See

In the GAMA GUI:

  • Blue forager_0 navigates with its LIDAR cone around the left obstacles.
  • Teal forager_1 follows a slightly different path due to its starting position.
  • The first to arrive at the green food turns orange and freezes.
  • When both are orange, the episode ends: cooperative success.
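
The freeze-then-finish behaviour described above can be reproduced with a toy environment. The sketch below is an assumption-laden stand-in, not the GAMA model: `StubForagerEnv` is a hypothetical one-dimensional world whose `reset` and `step` return values are shaped like those of a PettingZoo parallel environment, and the evaluation loop uses the same `done` condition as the episode loop in this step.

```python
class StubForagerEnv:
    """Hypothetical 1-D stand-in for the GAMA world, shaped like a parallel env."""

    def __init__(self, starts, food=5, max_steps=50):
        self.possible_agents = list(starts)
        self.starts, self.food, self.max_steps = dict(starts), food, max_steps

    def reset(self):
        self.agents = list(self.possible_agents)
        self.pos = dict(self.starts)
        self.steps = 0
        return dict(self.pos), {a: {} for a in self.agents}

    def step(self, actions):
        self.steps += 1
        for a in self.agents:
            if self.pos[a] != self.food:  # a forager that has arrived stays frozen
                self.pos[a] += actions[a]
        success = all(self.pos[a] == self.food for a in self.agents)
        timeout = self.steps >= self.max_steps
        ids = list(self.agents)
        obs = {a: self.pos[a] for a in ids}
        rewards = {a: 1.0 if success else 0.0 for a in ids}
        terminations = {a: success for a in ids}
        truncations = {a: timeout and not success for a in ids}
        if success or timeout:
            self.agents = []  # mirrors the "agents <- []" episode-end signal
        return obs, rewards, terminations, truncations, {a: {} for a in ids}


env = StubForagerEnv({"forager_0": 2, "forager_1": 0})
obs, _ = env.reset()
done, steps = False, 0
while not done:
    actions = {a: 1 for a in env.agents}  # always walk toward the food
    obs, rewards, terminations, truncations, _ = env.step(actions)
    done = not env.agents or all(terminations.values()) or all(truncations.values())
    steps += 1
```

Here `forager_0` reaches the food first and waits. The episode terminates only when `forager_1` also arrives, which is the cooperative success condition.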

Running the Test Script

cd models/petz
python test_forager_petz.py

Expected Console Output

=======================================================
Smart ForagerMARL Test (gama-pettingzoo GUI)
=======================================================
Model loaded (shared by both foragers)

Running 1 cooperative test episodes...

Episode 1/1:COOPERATIVE SUCCESS! | Steps: 61
forager_0: reward = 89.4
forager_1: reward = 87.1

=======================================================
Test Results Summary
=======================================================
Episodes : 1
Success Rate: 100%
Avg Steps : 61
forager_0 avg reward: 89.4
forager_1 avg reward: 87.1
=======================================================
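
The summary block is a simple aggregation over per-episode records. As a rough sketch, assuming each episode is stored as a dictionary with `success`, `steps`, and per-agent `rewards` (a hypothetical record layout, not necessarily what `test_forager_petz.py` uses), the figures could be computed like this, using the values from the sample run above:

```python
from statistics import mean

# One record per test episode; values copied from the sample output above.
episodes = [
    {"success": True, "steps": 61, "rewards": {"forager_0": 89.4, "forager_1": 87.1}},
]


def summarize(episodes):
    """Aggregate success rate, average steps, and per-agent average reward."""
    agent_ids = sorted(episodes[0]["rewards"])
    return {
        "episodes": len(episodes),
        "success_rate": 100.0 * sum(e["success"] for e in episodes) / len(episodes),
        "avg_steps": mean(e["steps"] for e in episodes),
        "avg_reward": {a: mean(e["rewards"][a] for e in episodes) for a in agent_ids},
    }


summary = summarize(episodes)
```

With a single episode the averages equal that episode's values. Running more test episodes gives a more meaningful success rate.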

Full Tutorial Summary

Part 1 – Internal RL (GAML only)

| Step | Concept introduced |
|------|--------------------|
| 1 | Grid world with grid species |
| 2 | Forager agent with random movement |
| 3 | Reward function and episodes |
| 4 | Q-Table as map<string, float> |
| 5 | Q-Learning / Bellman equation |
| 6 | Charts, heatmap, test mode |
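
As a reminder of what steps 4 and 5 covered, here is a Python analogue of a string-keyed Q-table and the standard Q-learning update. Part 1 itself is written in GAML. The key format `"state|action"`, the action names, and the values of `alpha` and `gamma` are illustrative assumptions, not taken from the tutorial's model.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step: Q <- Q + alpha * (r + gamma * max Q(s', .) - Q)."""
    key = f"{state}|{action}"
    best_next = max(q.get(f"{next_state}|{a}", 0.0) for a in actions)
    old = q.get(key, 0.0)
    q[key] = old + alpha * (reward + gamma * best_next - old)
    return q[key]


ACTIONS = ["up", "down", "left", "right"]  # four discrete moves (names are illustrative)
q = {}  # plays the role of a map<string, float>

first = q_update(q, "3,4", "right", 10.0, "4,4", ACTIONS)
second = q_update(q, "2,4", "right", 0.0, "3,4", ACTIONS)
```

The second update shows value propagating backwards: the cell one step further from the reward gets a positive estimate purely from its neighbour's Q-value.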

Part 2 – Deep RL with Gymnasium

| Step | Concept introduced |
|------|--------------------|
| 7 | Continuous world, architecture overview |
| 8 | GymAgent bridge species |
| 9 | LIDAR ray-cast sensors, movement, reward shaping |
| 10 | Headless training with custom PyTorch PPO |
| 11 | GUI testing, deterministic evaluation |

Part 3 – Multi-Agent RL with PettingZoo

| Step | Concept introduced |
|------|--------------------|
| 12 | PettingZoo Parallel API, PetzAgent bridge, cooperative rewards |
| 13 | Multi-agent GAML model, as_map, team obs, episode-end signal |
| 14 | Parameter-Shared PPO, batch inference, GamaParallelEnv directly |
| 15 | GUI testing, series recap |

Key Concepts Across All 3 Parts

| Concept | Part 1 | Part 2 | Part 3 |
|---------|--------|--------|--------|
| World | 10×10 grid | 100×100 continuous | Same |
| Agents | 1 | 1 | 2 |
| Actions | 4 discrete | 2D continuous [dx, dy] | Same |
| Sensors | Grid position | 8 LIDAR rays | 8 LIDAR + teammate pos |
| Algorithm | Q-Learning | PPO | Parameter-Shared PPO |
| Bridge | None | GymAgent | PetzAgent |
| Library | None | gama-gymnasium | gama-pettingzoo |
| RL Framework | None | PyTorch (custom PPO) | Same |
| Task | Solo food | Solo food | Cooperative food |

Key GAML Concepts Introduced in Part 3

PetzAgent, agents, possible_agents, observations, rewards, terminations, truncations, actions, update_data, as_map, all_match, contains_key, episode-end via agents <- []

Key Python Concepts Introduced in Part 3

GamaParallelEnv, env.agents, env.observation_space(agent_id), env.action_space(agent_id), env.reset() → dict, env.step(actions_dict) → dict, parameter sharing, batch inference, select_actions_batch(), per-agent RolloutBuffer


Key Files

| File | Description |
|------|-------------|
| models/petz/forager_petz.gaml | GAMA model with PetzAgent bridge |
| models/petz/train_forager_petz.py | MARL training script (headless) |
| models/petz/test_forager_petz.py | Testing script (GUI visualization) |