Reinforcement Learning · Interactive Demo

Can an RL Agent Get Stuck in an Infinite Loop?

Watch how a poorly designed policy traps an agent forever, and discover how cycle detection can rescue it.

Quick start

Get up and running with the demo in minutes

1. Open the live demo

Navigate to jhonzacipa.github.io/rl-cycle-demo in your browser. No installation required.

2. Run the simulations

Click ▶ Ejecutar (Run) on both panels to compare a stuck agent with one that uses cycle detection. The left panel shows an agent with no protection: it gets stuck repeating the same action. The right panel shows the same agent with cycle detection enabled, which escapes the loop and reaches the goal.

3. Explore the controls

Use the → Paso (Step) button to advance the simulation one action at a time, or adjust the speed slider to control the animation speed (50 ms to 800 ms).

4. Clone the repository

To run the demo locally or explore the source code:
git clone https://github.com/JhonZacipa/rl-cycle-demo.git
cd rl-cycle-demo
open index.html   # macOS; on Linux use xdg-open, on Windows use start

Explore the concepts

Learn about reinforcement learning policies, cycle detection, and prevention strategies

Infinite loops

Understand how RL agents can get trapped in infinite loops

Cycle detection

Learn how to detect and break out of cycles

Policy design

Design better policies that avoid common pitfalls
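
To make the trap concrete, here is a minimal Python sketch. It is not the demo's source code. It assumes a 3×3 grid with a hypothetical wall at (0, 1) and a hypothetical goal at (2, 2), plus a deterministic policy that always moves right. The names step, bad_policy, and rollout are illustrative only.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]  # (row, col) on a 3x3 grid

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
WALLS = {(0, 1)}      # assumed wall cell, chosen for illustration
GOAL: State = (2, 2)  # assumed goal cell, chosen for illustration


def step(state: State, action: str) -> State:
    """Apply an action; moving into a wall or off the grid leaves the state unchanged."""
    dr, dc = MOVES[action]
    nxt = (state[0] + dr, state[1] + dc)
    if not (0 <= nxt[0] < 3 and 0 <= nxt[1] < 3) or nxt in WALLS:
        return state
    return nxt


# A deterministic policy that picks "right" in every state.
bad_policy: Dict[State, str] = {(r, c): "right" for r in range(3) for c in range(3)}


def rollout(policy: Dict[State, str], start: State = (0, 0), max_steps: int = 20) -> List[State]:
    """Follow the policy; max_steps exists only so this script terminates."""
    state, trajectory = start, [start]
    for _ in range(max_steps):
        if state == GOAL:
            break
        state = step(state, policy[state])
        trajectory.append(state)
    return trajectory


trajectory = rollout(bad_policy)
print("distinct states visited:", set(trajectory))
print("reached goal:", trajectory[-1] == GOAL)
```

Because the policy is deterministic and the blocked move does not change the state, the same (state, action) pair repeats on every step. Without the step cap, the loop would never end.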

Demo features

Interactive visualization with real-time state tracking

3×3 grid world

Navigate a simple environment with walls and goals

Interactive controls

Run, step, and reset the simulation at your own pace

Side-by-side comparison

Compare scenarios with and without cycle detection

Real-time metrics

Track steps, rewards, and cycle escapes as they happen

Prevention strategies

Explore common solutions for avoiding infinite loops in RL systems

Max steps

The simplest safeguard: end the episode after N steps

Cycle detection

Track state visits and force exploration on repetition

ε-greedy exploration

Take random actions with probability ε

Other techniques

Step penalties, curiosity, and discount factors
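
The first three strategies can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not the demo's implementation: the grid layout, the always-right greedy policy, the run function, and its cycle_limit and epsilon parameters are all hypothetical. Forced exploration is implemented here as "try the least-used action in this state", which is one possible choice among several (a uniformly random action is another).

```python
import random
from collections import Counter

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
ACTIONS = list(MOVES)
WALLS = {(0, 1)}  # assumed wall cell
GOAL = (2, 2)     # assumed goal cell


def step(state, action):
    """Moving into a wall or off the 3x3 grid leaves the state unchanged."""
    dr, dc = MOVES[action]
    nxt = (state[0] + dr, state[1] + dc)
    if not (0 <= nxt[0] < 3 and 0 <= nxt[1] < 3) or nxt in WALLS:
        return state
    return nxt


def greedy(state):
    """A deliberately flawed policy: always move right."""
    return "right"


def run(max_steps=200, cycle_limit=None, epsilon=0.0, seed=0):
    rng = random.Random(seed)
    state = (0, 0)
    visits = Counter()  # how often each state has been visited
    tried = Counter()   # how often each (state, action) pair has been taken
    escapes = 0
    for t in range(max_steps):  # strategy 1: hard cap on episode length
        if state == GOAL:
            return {"reached": True, "steps": t, "escapes": escapes}
        visits[state] += 1
        if cycle_limit is not None and visits[state] > cycle_limit:
            # strategy 2: repeated state detected, force the least-tried action
            action = min(ACTIONS, key=lambda a: tried[(state, a)])
            escapes += 1
        elif rng.random() < epsilon:
            # strategy 3: epsilon-greedy, random action with probability epsilon
            action = rng.choice(ACTIONS)
        else:
            action = greedy(state)
        tried[(state, action)] += 1
        state = step(state, action)
    return {"reached": state == GOAL, "steps": max_steps, "escapes": escapes}


baseline = run()               # max steps only
guarded = run(cycle_limit=2)   # max steps + cycle detection
noisy = run(epsilon=0.3)       # max steps + epsilon-greedy
print("baseline:", baseline)
print("guarded: ", guarded)
print("noisy:   ", noisy)
```

Under these assumptions, the baseline run exhausts its step budget without reaching the goal, while the run with cycle detection escapes each blocked state and reaches it. The ε-greedy result depends on the seed and on ε. The step cap stays in place in every variant, since it is the only safeguard that guarantees termination on its own.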

Ready to explore?

Dive into the interactive demo and see cycle detection in action, or explore the source code to understand how it works.